cat /usr/spool/public/unix_stuff/csh.doc An introduction to the C shell William Joy Computer Science Division Department of Electrical Engineering and Computer Science University of California, Berkeley Berkeley, California 94720 _A_B_S_T_R_A_C_T _C_s_h is a new command language interpreter for UNIX* systems. It incorporates good features of other shells and a _h_i_s_t_o_r_y mechanism similar to the _r_e_d_o of INTERLISP. While incorporating many features of other shells which make writing shell programs (shell scripts) easier, most of the features unique to _c_s_h are designed more for the interactive UNIX user. UNIX users who have read a general introduc- tion to the system will find a valuable basic explanation of the shell here. Simple terminal interaction with _c_s_h is possible after reading just the first section of this document. The second section describes the shells capabilities which you can explore after you have begun to become acquainted with the shell. Later sections introduce features which are useful, but not necessary for all users of the shell. Back matter includes an appendix listing spe- cial characters of the shell and a glossary of terms and commands introduced in this manual. March 10, 1984 __________________________ *UNIX is a Trademark of Bell Laboratories. An introduction to the C shell William Joy Computer Science Division Department of Electrical Engineering and Computer Science University of California, Berkeley Berkeley, California 94720 _I_n_t_r_o_d_u_c_t_i_o_n A _s_h_e_l_l is a command language interpreter. _C_s_h is the name of one particular command interpreter on UNIX. The primary purpose of _c_s_h is to translate command lines typed at a terminal into system actions, such as invocation of other programs. _C_s_h is a user program just like any you might write. Hopefully, _c_s_h will be a very useful program for you in interacting with the UNIX system. In addition to this document, you will want to refer to a copy of the UNIX programmer's manual. The _c_s_h documenta- tion in the manual provides a full description of all features of the shell and is a final reference for questions about the shell. Many words in this document are shown in _i_t_a_l_i_c_s. These are important words; names of commands, and words which have special meaning in discussing the shell and UNIX. Many of the words are defined in a glossary at the end of this document. If you don't know what is meant by a word, you should look for it in the glossary. _A_c_k_n_o_w_l_e_d_g_e_m_e_n_t_s Numerous people have provided good input about previous versions of _c_s_h and aided in its debugging and in the debug- ging of its documentation. I would especially like to thank Michael Ubell who made the crucial observation that history commands could be done well over the word structure of input text, and implemented a prototype history mechanism in an older version of the shell. Eric Allman has also provided a large number of useful comments on the shell, helping to unify those concepts which are present and to identify and eliminate useless and marginally useful features. Mike O'Brien suggested the pathname hashing mechanism which speeds command execution. Jim Kulp added the job control and directory stack primitives and added their documentation to this introduction. - 2 - _1. _T_e_r_m_i_n_a_l _u_s_a_g_e _o_f _t_h_e _s_h_e_l_l _1._1. _T_h_e _b_a_s_i_c _n_o_t_i_o_n _o_f _c_o_m_m_a_n_d_s A _s_h_e_l_l in UNIX acts mostly as a medium through which other _p_r_o_g_r_a_m_s are invoked. While it has a set of _b_u_i_l_t_i_n functions which it performs directly, most commands cause execution of programs that are, in fact, external to the shell. The shell is thus distinguished from the command interpreters of other systems both by the fact that it is just a user program, and by the fact that it is used almost exclusively as a mechanism for invoking other programs. _C_o_m_m_a_n_d_s in the UNIX system consist of a list of strings or _w_o_r_d_s interpreted as a _c_o_m_m_a_n_d _n_a_m_e followed by _a_r_g_u_m_e_n_t_s. Thus the command mail bill consists of two words. The first word _m_a_i_l names the com- mand to be executed, in this case the mail program which sends messages to other users. The shell uses the name of the command in attempting to execute it for you. It will look in a number of _d_i_r_e_c_t_o_r_i_e_s for a file with the name _m_a_i_l which is expected to contain the mail program. The rest of the words of the command are given as _a_r_g_u_- _m_e_n_t_s to the command itself when it is executed. In this case we specified also the argument _b_i_l_l which is inter- preted by the _m_a_i_l program to be the name of a user to whom mail is to be sent. In normal terminal usage we might use the _m_a_i_l command as follows. % mail bill I have a question about the csh documentation. My document seems to be missing page 5. Does a page five exist? Bill EOT % Here we typed a message to send to _b_i_l_l and ended this message with a |^D which sent an end-of-file to the mail pro- gram. (Here and throughout this document, the notation ``|^_x'' is to be read ``control-_x'' and represents the strik- ing of the _x key while the control key is held down.) The mail program then echoed the characters `EOT' and transmit- ted our message. The characters `% ' were printed before and after the mail command by the shell to indicate that input was needed. After typing the `% ' prompt the shell was reading com- mand input from our terminal. We typed a complete command - 3 - `mail bill'. The shell then executed the _m_a_i_l program with argument _b_i_l_l and went dormant waiting for it to complete. The mail program then read input from our terminal until we signalled an end-of-file via typing a |^D after which the shell noticed that mail had completed and signaled us that it was ready to read from the terminal again by printing another `% ' prompt. This is the essential pattern of all interaction with UNIX through the shell. A complete command is typed at the terminal, the shell executes the command and when this exe- cution completes, it prompts for a new command. If you run the editor for an hour, the shell will patiently wait for you to finish editing and obediently prompt you again when- ever you finish editing. An example of a useful command you can execute now is the _t_s_e_t command, which sets the default _e_r_a_s_e and _k_i_l_l characters on your terminal - the erase character erases the last character you typed and the kill character erases the entire line you have entered so far. By default, the erase character is `#' and the kill character is `@'. Most people who use CRT displays prefer to use the backspace (|^H) char- acter as their erase character since it is then easier to see what you have typed so far. You can make this be true by typing tset -e which tells the program _t_s_e_t to set the erase character, and its default setting for this character is a backspace. _1._2. _F_l_a_g _a_r_g_u_m_e_n_t_s A useful notion in UNIX is that of a _f_l_a_g argument. While many arguments to commands specify file names or user names some arguments rather specify an optional capability of the command which you wish to invoke. By convention, such arguments begin with the character `-' (hyphen). Thus the command ls will produce a list of the files in the current _w_o_r_k_i_n_g _d_i_r_e_c_t_o_r_y. The option -_s is the size option, and ls -s causes _l_s to also give, for each file the size of the file in blocks of 512 characters. The manual section for each command in the UNIX reference manual gives the available options for each command. The _l_s command has a large number of useful and interesting options. Most other commands have either no options or only one or two options. It is hard to - 4 - remember options of commands which are not used very fre- quently, so most UNIX utilities perform only one or two functions rather than having a large number of hard to remember options. _1._3. _O_u_t_p_u_t _t_o _f_i_l_e_s Commands that normally read input or write output on the terminal can also be executed with this input and/or output done to a file. Thus suppose we wish to save the current date in a file called `now'. The command date will print the current date on our terminal. This is because our terminal is the default _s_t_a_n_d_a_r_d _o_u_t_p_u_t for the date command and the date command prints the date on its standard output. The shell lets us _r_e_d_i_r_e_c_t the _s_t_a_n_d_a_r_d _o_u_t_p_u_t of a command through a notation using the _m_e_t_a_c_h_a_r_a_c_- _t_e_r `>' and the name of the file where output is to be placed. Thus the command date > now runs the _d_a_t_e command such that its standard output is the file `now' rather than the terminal. Thus this command places the current date and time into the file `now'. It is important to know that the _d_a_t_e command was unaware that its output was going to a file rather than to the terminal. The shell performed this _r_e_d_i_r_e_c_t_i_o_n before the command began executing. One other thing to note here is that the file `now' need not have existed before the _d_a_t_e command was executed; the shell would have created the file if it did not exist. And if the file did exist? If it had existed previously these previous contents would have been discarded! A shell option _n_o_c_l_o_b_b_e_r exists to prevent this from happening accidentally; it is discussed in section 2.2. The system normally keeps files which you create with `>' and all other files. Thus the default is for files to be permanent. If you wish to create a file which will be removed automatically, you can begin its name with a `#' character, this `scratch' character denotes the fact that the file will be a scratch file.* The system will remove __________________________ *Note that if your erase character is a `#', you will have to precede the `#' with a `\'. The fact that the `#' character is the old (pre-CRT) standard erase char- acter means that it seldom appears in a file name, and allows this convention to be used for scratch files. - 5 - such files after a couple of days, or sooner if file space becomes very tight. Thus, in running the _d_a_t_e command above, we don't really want to save the output forever, so we would more likely do date > #now _1._4. _M_e_t_a_c_h_a_r_a_c_t_e_r_s _i_n _t_h_e _s_h_e_l_l The shell has a large number of special characters (like `>') which indicate special functions. We say that these notations have _s_y_n_t_a_c_t_i_c and _s_e_m_a_n_t_i_c meaning to the shell. In general, most characters which are neither letters nor digits have special meaning to the shell. We shall shortly learn a means of _q_u_o_t_a_t_i_o_n which allows us to use _m_e_t_a_c_h_a_r_a_c_t_e_r_s without the shell treating them in any special way. Metacharacters normally have effect only when the shell is reading our input. We need not worry about placing shell metacharacters in a letter we are sending via _m_a_i_l, or when we are typing in text or data to some other program. Note that the shell is only reading input when it has prompted with `% '. _1._5. _I_n_p_u_t _f_r_o_m _f_i_l_e_s; _p_i_p_e_l_i_n_e_s We learned above how to _r_e_d_i_r_e_c_t the _s_t_a_n_d_a_r_d _o_u_t_p_u_t of a command to a file. It is also possible to redirect the _s_t_a_n_d_a_r_d _i_n_p_u_t of a command from a file. This is not often necessary since most commands will read from a file whose name is given as an argument. We can give the command sort < data to run the _s_o_r_t command with standard input, where the com- mand normally reads its input, from the file `data'. We would more likely say sort data letting the _s_o_r_t command open the file `data' for input itself since this is less to type. We should note that if we just typed sort __________________________ If you are using a CRT, your erase character should be a |^H, as we demonstrated in section 1.1 how this could be set up. - 6 - then the sort program would sort lines from its _s_t_a_n_d_a_r_d _i_n_p_u_t. Since we did not _r_e_d_i_r_e_c_t the standard input, it would sort lines as we typed them on the terminal until we typed a |^D to indicate an end-of-file. A most useful capability is the ability to combine the standard output of one command with the standard input of another, i.e. to run the commands in a sequence known as a _p_i_p_e_l_i_n_e. For instance the command ls -s normally produces a list of the files in our directory with the size of each in blocks of 512 characters. If we are interested in learning which of our files is largest we may wish to have this sorted by size rather than by name, which is the default way in which _l_s sorts. We could look at the many options of _l_s to see if there was an option to do this but would eventually discover that there is not. Instead we can use a couple of simple options of the _s_o_r_t command, com- bining it with _l_s to get what we want. The -_n option of sort specifies a numeric sort rather than an alphabetic sort. Thus ls -s | sort -n specifies that the output of the _l_s command run with the option -_s is to be _p_i_p_e_d to the command _s_o_r_t run with the numeric sort option. This would give us a sorted list of our files by size, but with the smallest first. We could then use the -_r reverse sort option and the _h_e_a_d command in combination with the previous command doing ls -s | sort -n -r | head -5 Here we have taken a list of our files sorted alphabeti- cally, each with the size in blocks. We have run this to the standard input of the _s_o_r_t command asking it to sort numerically in reverse order (largest first). This output has then been run into the command _h_e_a_d which gives us the first few lines. In this case we have asked _h_e_a_d for the first 5 lines. Thus this command gives us the names and sizes of our 5 largest files. The notation introduced above is called the _p_i_p_e mechanism. Commands separated by `|' characters are con- nected together by the shell and the standard output of each is run into the standard input of the next. The leftmost command in a pipeline will normally take its standard input from the terminal and the rightmost will place its standard output on the terminal. Other examples of pipelines will be given later when we discuss the history mechanism; one important use of pipes which is illustrated there is in the - 7 - routing of information to the line printer. _1._6. _F_i_l_e_n_a_m_e_s Many commands to be executed will need the names of files as arguments. UNIX _p_a_t_h_n_a_m_e_s consist of a number of _c_o_m_p_o_n_e_n_t_s separated by `/'. Each component except the last names a directory in which the next component resides, in effect specifying the _p_a_t_h of directories to follow to reach the file. Thus the pathname /etc/motd specifies a file in the directory `etc' which is a subdirec- tory of the _r_o_o_t directory `/'. Within this directory the file named is `motd' which stands for `message of the day'. A _p_a_t_h_n_a_m_e that begins with a slash is said to be an _a_b_s_o_- _l_u_t_e pathname since it is specified from the absolute top of the entire directory hierarchy of the system (the _r_o_o_t). _P_a_t_h_n_a_m_e_s which do not begin with `/' are interpreted as starting in the current _w_o_r_k_i_n_g _d_i_r_e_c_t_o_r_y, which is, by default, your _h_o_m_e directory and can be changed dynamically by the _c_d change directory command. Such pathnames are said to be _r_e_l_a_t_i_v_e to the working directory since they are found by starting in the working directory and descending to lower levels of directories for each _c_o_m_p_o_n_e_n_t of the pathname. If the pathname contains no slashes at all then the file is contained in the working directory itself and the pathname is merely the name of the file in this directory. Absolute pathnames have no relation to the working directory. Most filenames consist of a number of alphanumeric characters and `.'s (periods). In fact, all printing char- acters except `/' (slash) may appear in filenames. It is inconvenient to have most non-alphabetic characters in filenames because many of these have special meaning to the shell. The character `.' (period) is not a shell- metacharacter and is often used to separate the _e_x_t_e_n_s_i_o_n of a file name from the base of the name. Thus prog.c prog.o prog.errs prog.output are four related files. They share a _b_a_s_e portion of a name (a base portion being that part of the name that is left when a trailing `.' and following characters which are not `.' are stripped off). The file `prog.c' might be the source for a C program, the file `prog.o' the corresponding object file, the file `prog.errs' the errors resulting from a compilation of the program and the file `prog.output' the output of a run of the program. If we wished to refer to all four of these files in a command, we could use the notation - 8 - prog.* This word is expanded by the shell, before the command to which it is an argument is executed, into a list of names which begin with `prog.'. The character `*' here matches any sequence (including the empty sequence) of characters in a file name. The names which match are alphabetically sorted and placed in the _a_r_g_u_m_e_n_t _l_i_s_t of the command. Thus the command echo prog.* will echo the names prog.c prog.errs prog.o prog.output Note that the names are in sorted order here, and a dif- ferent order than we listed them above. The _e_c_h_o command receives four words as arguments, even though we only typed one word as as argument directly. The four words were gen- erated by _f_i_l_e_n_a_m_e _e_x_p_a_n_s_i_o_n of the one input word. Other notations for _f_i_l_e_n_a_m_e _e_x_p_a_n_s_i_o_n are also avail- able. The character `?' matches any single character in acan't open file tabs filename. Thus echo ? ?? ??? will echo a line of filenames; first those with one charac- ter names, then those with two character names, and finally those with three character names. The names of each length will be independently sorted. Another mechanism consists of a sequence of characters between `[' and `]'. This metasequence matches any single character from the enclosed set. Thus prog.[co] will match prog.c prog.o in the example above. We can also place two characters around a `-' in this notation to denote a range. Thus chap.[1-5] might match files chap.1 chap.2 chap.3 chap.4 chap.5 if they existed. This is shorthand for - 9 - chap.[12345] and otherwise equivalent. An important point to note is that if a list of argu- ment words to a command (an _a_r_g_u_m_e_n_t _l_i_s_t) contains filename expansion syntax, and if this filename expansion syntax fails to match any existing file names, then the shell con- siders this to be an error and prints a diagnostic No match. and does not execute the command. Another very important point is that files with the character `.' at the beginning are treated specially. Nei- ther `*' or `?' or the `[' `]' mechanism will match it. This prevents accidental matching of the filenames `.' and `..' in the working directory which have special meaning to the system, as well as other files such as ._c_s_h_r_c which are not normally visible. We will discuss the special role of the file ._c_s_h_r_c later. Another filename expansion mechanism gives access to the pathname of the _h_o_m_e directory of other users. This notation consists of the character `~' (tilde) followed by another users' login name. For instance the word `~bill' would map to the pathname `/usr/bill' if the home directory for `bill' was `/usr/bill'. Since, on large systems, users may have login directories scattered over many different disk volumes with different prefix directory names, this notation provides a reliable way of accessing the files of other users. A special case of this notation consists of a `~' alone, e.g. `~/mbox'. This notation is expanded by the shell into the file `mbox' in your _h_o_m_e directory, i.e. into `/usr/bill/mbox' for me on Ernie Co-vax, the UCB Computer Science Department VAX machine, where this document was prepared. This can be very useful if you have used _c_d to change to another directory and have found a file you wish to copy using _c_p. If I give the command cp thatfile ~ the shell will expand this command to cp thatfile /usr/bill since my home directory is /usr/bill. There also exists a mechanism using the characters `{' and `}' for abbreviating a set of words which have common parts but cannot be abbreviated by the above mechanisms - 10 - because they are not files, are the names of files which do not yet exist, are not thus conveniently described. This mechanism will be described much later, in section 4.2, as it is used less frequently. _1._7. _Q_u_o_t_a_t_i_o_n We have already seen a number of metacharacters used by the shell. These metacharacters pose a problem in that we cannot use them directly as parts of words. Thus the com- mand echo * will not echo the character `*'. It will either echo an sorted list of filenames in the current _w_o_r_k_i_n_g _d_i_r_e_c_t_o_r_y, or print the message `No match' if there are no files in the working directory. The recommended mechanism for placing characters which are neither numbers, digits, `/', `.' or `-' in an argument word to a command is to enclose it with single quotation characters `'', i.e. echo '*' There is one special character `!' which is used by the _h_i_s_- _t_o_r_y mechanism of the shell and which cannot be _e_s_c_a_p_e_d by placing it within `'' characters. It and the character `'' itself can be preceded by a single `\' to prevent their spe- cial meaning. Thus echo \'\! prints '! These two mechanisms suffice to place any printing character into a word which is an argument to a shell command. They can be combined, as in echo \''*' which prints '* since the first `\' escaped the first `'' and the `*' was enclosed between `'' characters. - 11 - _1._8. _T_e_r_m_i_n_a_t_i_n_g _c_o_m_m_a_n_d_s When you are executing a command and the shell is wait- ing for it to complete there are several ways to force it to stop. For instance if you type the command cat /etc/passwd the system will print a copy of a list of all users of the system on your terminal. This is likely to continue for several minutes unless you stop it. You can send an INTER- RUPT _s_i_g_n_a_l to the _c_a_t command by typing the DEL or RUBOUT key on your terminal.* Since _c_a_t does not take any precau- tions to avoid or otherwise handle this signal the INTERRUPT will cause it to terminate. The shell notices that _c_a_t has terminated and prompts you again with `% '. If you hit INTERRUPT again, the shell will just repeat its prompt since it handles INTERRUPT signals and chooses to continue to exe- cute commands rather than terminating like _c_a_t did, which would have the effect of logging you out. Another way in which many programs terminate is when they get an end-of-file from their standard input. Thus the _m_a_i_l program in the first example above was terminated when we typed a |^D which generates an end-of-file from the stan- dard input. The shell also terminates when it gets an end- of-file printing `logout'; UNIX then logs you off the sys- tem. Since this means that typing too many |^D's can accidentally log us off, the shell has a mechanism for preventing this. This _i_g_n_o_r_e_e_o_f option will be discussed in section 2.2. If a command has its standard input redirected from a file, then it will normally terminate when it reaches the end of this file. Thus if we execute mail bill < prepared.text the mail command will terminate without our typing a |^D. This is because it read to the end-of-file of our file `prepared.text' in which we placed a message for `bill' with an editor program. We could also have done cat prepared.text | mail bill since the _c_a_t command would then have written the text through the pipe to the standard input of the mail command. When the _c_a_t command completed it would have terminated, closing down the pipeline and the _m_a_i_l command would have received an end-of-file from it and terminated. Using a __________________________ *Many users use _s_t_t_y(1) to change the interrupt charac- ter to |^C. - 12 - pipe here is more complicated than redirecting input so we would more likely use the first form. These commands could also have been stopped by sending an INTERRUPT. Another possibility for stopping a command is to suspend its execution temporarily, with the possibility of continuing execution later. This is done by sending a STOP signal via typing a |^Z. This signal causes all commands running on the terminal (usually one but more if a pipeline is executing) to become suspended. The shell notices that the command(s) have been suspended, types `Stopped' and then prompts for a new command. The previously executing command has been suspended, but otherwise unaffected by the STOP signal. Any other commands can be executed while the origi- nal command remains suspended. The suspended command can be continued using the _f_g command with no arguments. The shell will then retype the command to remind you which command is being continued, and cause the command to resume execution. Unless any input files in use by the suspended command have been changed in the meantime, the suspension has no effect whatsoever on the execution of the command. This feature can be very useful during editing, when you need to look at another file before continuing. An example of command suspension follows. _2. _D_e_t_a_i_l_s _o_n _t_h_e _s_h_e_l_l _f_o_r _t_e_r_m_i_n_a_l _u_s_e_r_s _2._1. _S_h_e_l_l _s_t_a_r_t_u_p _a_n_d _t_e_r_m_i_n_a_t_i_o_n When you login, the shell is started by the system in your _h_o_m_e directory and begins by reading commands from a file ._c_s_h_r_c in this directory. All shells which you may start during your terminal session will read from this file. We will later see what kinds of commands are usefully placed there. For now we need not have this file and the shell does not complain about its absence. A _l_o_g_i_n _s_h_e_l_l, executed after you login to the system, will, after it reads commands from ._c_s_h_r_c, read commands from a file ._l_o_g_i_n also in your home directory. This file contains commands which you wish to do each time you login to the UNIX system. My ._l_o_g_i_n file looks something like: set ignoreeof set mail=(/usr/spool/mail/bill) echo "${prompt}users" ; users alias ts \ 'set noglob ; eval `tset -s -m dialup:c100rv4pna -m plugboard:?hp2621nl *`'; ts; stty intr |^C kill |^U crt set time=15 history=10 msgs -f if (-e $mail) then echo "${prompt}mail" mail endif This file contains several commands to be executed by UNIX each time I login. The first is a _s_e_t command which is interpreted directly by the shell. It sets the shell vari- able _i_g_n_o_r_e_e_o_f which causes the shell to not log me off if I hit |^D. Rather, I use the _l_o_g_o_u_t command to log off of the system. By setting the _m_a_i_l variable, I ask the shell to watch for incoming mail to me. Every 5 minutes the shell looks for this file and tells me if more mail has arrived there. An alternative to this is to put the command biff y in place of this _s_e_t; this will cause me to be notified immediately when mail arrives, and to be shown the first few lines of the new message. Next I set the shell variable `time' to `15' causing the shell to automatically print out statistics lines for commands which execute for at least 15 seconds of CPU time. The variable `history' is set to 10 indicating that I want the shell to remember the last 10 commands I type in its _h_i_s_t_o_r_y _l_i_s_t, (described later). March 11, 1984 - 2 - I create an _a_l_i_a_s ``ts'' which executes a _t_s_e_t(1) com- mand setting up the modes of the terminal. The parameters to _t_s_e_t indicate the kinds of terminal which I usually use when not on a hardwired port. I then execute ``ts'' and also use the _s_t_t_y command to change the interrupt character to |^C and the line kill character to |^U. I then run the `msgs' program, which provides me with any system messages which I have not seen before; the `-f' option here prevents it from telling me anything if there are no new messages. Finally, if my mailbox file exists, then I run the `mail' program to process my mail. When the `mail' and `msgs' programs finish, the shell will finish processing my ._l_o_g_i_n file and begin reading com- mands from the terminal, prompting for each with `% '. When I log off (by giving the _l_o_g_o_u_t command) the shell will print `logout' and execute commands from the file `.logout' if it exists in my home directory. After that the shell will terminate and UNIX will log me off the system. If the system is not going down, I will receive a new login mes- sage. In any case, after the `logout' message the shell is committed to terminating and will take no further input from my terminal. _2._2. _S_h_e_l_l _v_a_r_i_a_b_l_e_s The shell maintains a set of _v_a_r_i_a_b_l_e_s. We saw above the variables _h_i_s_t_o_r_y and _t_i_m_e which had values `10' and `15'. In fact, each shell variable has as value an array of zero or more _s_t_r_i_n_g_s. Shell variables may be assigned values by the set command. It has several forms, the most useful of which was given above and is set name=value Shell variables may be used to store values which are to be used in commands later through a substitution mechan- ism. The shell variables most commonly referenced are, how- ever, those which the shell itself refers to. By changing the values of these variables one can directly affect the behavior of the shell. One of the most important variables is the variable _p_a_t_h. This variable contains a sequence of directory names where the shell searches for commands. The _s_e_t command with no arguments shows the value of all variables currently defined (we usually say _s_e_t) in the shell. The default value for path will be shown by _s_e_t to be March 11, 1984 _3. _S_h_e_l_l _c_o_n_t_r_o_l _s_t_r_u_c_t_u_r_e_s _a_n_d _c_o_m_m_a_n_d _s_c_r_i_p_t_s _3._1. _I_n_t_r_o_d_u_c_t_i_o_n It is possible to place commands in files and to cause shells to be invoked to read and execute commands from these files, which are called _s_h_e_l_l _s_c_r_i_p_t_s. We here detail those features of the shell useful to the writers of such scripts. _3._2. _M_a_k_e It is important to first note what shell scripts are _n_o_t useful for. There is a program called _m_a_k_e which is very useful for maintaining a group of related files or per- forming sets of operations on related files. For instance a large program consisting of one or more files can have its dependencies described in a _m_a_k_e_f_i_l_e which contains defini- tions of the commands used to create these different files when changes occur. Definitions of the means for printing listings, cleaning up the directory in which the files reside, and installing the resultant programs are easily, and most appropriately placed in this _m_a_k_e_f_i_l_e. This format is superior and preferable to maintaining a group of shell procedures to maintain these files. Similarly when working on a document a _m_a_k_e_f_i_l_e may be created which defines how different versions of the document are to be created and which options of _n_r_o_f_f or _t_r_o_f_f are appropriate. _3._3. _I_n_v_o_c_a_t_i_o_n _a_n_d _t_h_e _a_r_g_v _v_a_r_i_a_b_l_e A _c_s_h command script may be interpreted by saying % csh script ... where _s_c_r_i_p_t is the name of the file containing a group of _c_s_h commands and `...' is replaced by a sequence of argu- ments. The shell places these arguments in the variable _a_r_g_v and then begins to read commands from the script. These parameters are then available through the same mechan- isms which are used to reference any other shell variables. If you make the file `script' executable by doing chmod 755 script and place a shell comment at the beginning of the shell script (i.e. begin the file with a `#' character) then a `/bin/csh' will automatically be invoked to execute `script' when you type script March 11, 1984 - 2 - If the file does not begin with a `#' then the standard shell `/bin/sh' will be used to execute it. This allows you to convert your older shell scripts to use _c_s_h at your con- venience. _3._4. _V_a_r_i_a_b_l_e _s_u_b_s_t_i_t_u_t_i_o_n After each input line is broken into words and history substitutions are done on it, the input line is parsed into distinct commands. Before each command is executed a mechanism know as _v_a_r_i_a_b_l_e _s_u_b_s_t_i_t_u_t_i_o_n is done on these words. Keyed by the character `$' this substitution replaces the names of variables by their values. Thus echo $argv when placed in a command script would cause the current value of the variable _a_r_g_v to be echoed to the output of the shell script. It is an error for _a_r_g_v to be unset at this point. A number of notations are provided for accessing com- ponents and attributes of variables. The notation $?name expands to `1' if name is _s_e_t or to `0' if name is not _s_e_t. It is the fundamental mechanism used for checking whether particular variables have been assigned values. All other forms of reference to undefined variables cause errors. The notation $#name expands to the number of elements in the variable _n_a_m_e. Thus % set argv=(a b c) % echo $?argv 1 % echo $#argv 3 % unset argv % echo $?argv 0 % echo $argv Undefined variable: argv. % It is also possible to access the components of a vari- able which has several values. Thus March 11, 1984 - 3 - $argv[1] gives the first component of _a_r_g_v or in the example above `a'. Similarly $argv[$#argv] would give `c', and $argv[1-2] would give `a b'. Other notations useful in shell scripts are $_n where _n is an integer as a shorthand for $argv[_n] the _n_t_h parameter and $* which is a shorthand for $argv The form $$ expands to the process number of the current shell. Since this process number is unique in the system it can be used in generation of unique temporary file names. The form $< is quite special and is replaced by the next line of input read from the shell's standard input (not the script it is reading). This is useful for writing shell scripts that are interactive, reading commands from the terminal, or even writing a shell script that acts as a filter, reading lines from its input file. Thus the sequence echo 'yes or no?\c' set a=($<) would write out the prompt `yes or no?' without a newline and then read the answer into the variable `a'. In this case `$#a' would be `0' if either a blank line or end-of- file (|^D) was typed. One minor difference between `$_n' and `$argv[_n]' should March 11, 1984 - 4 - be noted here. The form `$argv[_n]' will yield an error if _n is not in the range `1-$#argv' while `$n' will never yield an out of range subscript error. This is for compatibility with the way older shells handled parameters. Another important point is that it is never an error to give a subrange of the form `n-'; if there are less than _n components of the given variable then no words are substi- tuted. A range of the form `m-n' likewise returns an empty vector without giving an error when _m exceeds the number of elements of the given variable, provided the subscript _n is in range. _3._5. _E_x_p_r_e_s_s_i_o_n_s In order for interesting shell scripts to be con- structed it must be possible to evaluate expressions in the shell based on the values of variables. In fact, all the arithmetic operations of the language C are available in the shell with the same precedence that they have in C. In par- ticular, the operations `==' and `!=' compare strings and the operators `&&' and `||' implement the boolean and/or operations. The special operators `=~' and `!~' are similar to `==' and `!=' except that the string on the right side can have pattern matching characters (like *, ? or []) and the test is whether the string on the left matches the pat- tern on the right. The shell also allows file enquiries of the form -? filename where `?' is replace by a number of single characters. For instance the expression primitive -e filename tell whether the file `filename' exists. Other primitives test for read, write and execute access to the file, whether it is a directory, or has non-zero length. It is possible to test whether a command terminates normally, by a primitive of the form `{ command }' which returns true, i.e. `1' if the command succeeds exiting nor- mally with exit status 0, or `0' if the command terminates abnormally or with exit status non-zero. If more detailed information about the execution status of a command is required, it can be executed and the variable `$status' examined in the next command. Since `$status' is set by every command, it is very transient. It can be saved if it is inconvenient to use it only in the single immediately following command. For a full list of expression components available see March 11, 1984 - 5 - the manual section for the shell. _3._6. _S_a_m_p_l_e _s_h_e_l_l _s_c_r_i_p_t A sample shell script which makes use of the expression mechanism of the shell and some of its control structure follows: % cat copyc # # Copyc copies those C programs in the specified list # to the directory ~/backup if they differ from the files # already in ~/backup # set noglob foreach i ($argv) if ($i !~ *.c) continue # not a .c file so do nothing if (! -r ~/backup/$i:t) then echo $i:t not in backup... not cp\'ed continue endif cmp -s $i ~/backup/$i:t # to set $status if ($status != 0) then echo new backup of $i cp $i ~/backup/$i:t endif end This script makes use of the _f_o_r_e_a_c_h command, which causes the shell to execute the commands between the _f_o_r_e_a_c_h and the matching _e_n_d for each of the values given between `(' and `)' with the named variable, in this case `i' set to successive values in the list. Within this loop we may use the command _b_r_e_a_k to stop executing the loop and _c_o_n_t_i_n_u_e to prematurely terminate one iteration and begin the next. After the _f_o_r_e_a_c_h loop the iteration variable (_i in this case) has the value at the last iteration. We set the variable _n_o_g_l_o_b here to prevent filename expansion of the members of _a_r_g_v. This is a good idea, in general, if the arguments to a shell script are filenames which have already been expanded or if the arguments may contain filename expansion metacharacters. It is also pos- sible to quote each use of a `$' variable expansion, but this is harder and less reliable. The other control construct used here is a statement of the form March 11, 1984 - 6 - if ( expression ) then command ... endif The placement of the keywords here is _n_o_t flexible due to the current implementation of the shell. The shell does have another form of the if statement of the form if ( expression ) command which can be written if ( expression ) \ command Here we have escaped the newline for the sake of appearance. The command must not involve `|', `&' or `;' and must not be another control command. The second form requires the final `\' to _i_m_m_e_d_i_a_t_e_l_y precede the end-of-line. The more general _i_f statements above also admit a sequence of _e_l_s_e-_i_f pairs followed by a single _e_l_s_e and an _e_n_d_i_f, e.g.: if ( expression ) then commands else if (expression ) then commands ... else commands endif Another important mechanism used in shell scripts is the `:' modifier. We can use the modifier `:r' here to __________________________ The following two formats are not currently acceptable to the shell: if ( expression ) # Won't work! then command ... endif and if ( expression ) then command endif # Won't work March 11, 1984 - 7 - extract a root of a filename or `:e' to extract the _e_x_t_e_n_- _s_i_o_n. Thus if the variable _i has the value `/mnt/foo.bar' then % echo $i $i:r $i:e /mnt/foo.bar /mnt/foo bar % shows how the `:r' modifier strips off the trailing `.bar' and the the `:e' modifier leaves only the `bar'. Other modifiers will take off the last component of a pathname leaving the head `:h' or all but the last component of a pathname leaving the tail `:t'. These modifiers are fully described in the _c_s_h manual pages in the programmers manual. It is also possible to use the _c_o_m_m_a_n_d _s_u_b_s_t_i_t_u_t_i_o_n mechan- ism described in the next major section to perform modifica- tions on strings to then reenter the shells environment. Since each usage of this mechanism involves the creation of a new process, it is much more expensive to use than the `:' modification mechanism.# Finally, we note that the character `#' lexically introduces a shell comment in shell scripts (but not from the terminal). All subsequent characters on the input line after a `#' are discarded by the shell. This character can be quoted using `'' or `\' to place it in an argument word. _3._7. _O_t_h_e_r _c_o_n_t_r_o_l _s_t_r_u_c_t_u_r_e_s The shell also has control structures _w_h_i_l_e and _s_w_i_t_c_h similar to those of C. These take the forms while ( expression ) commands end and __________________________ #It is also important to note that the current imple- mentation of the shell limits the number of `:' modif- iers on a `$' substitution to 1. Thus % echo $i $i:h:t /a/b/c /a/b:t % does not do what one would expect. March 11, 1984 - 8 - switch ( word ) case str1: commands breaksw ... case strn: commands breaksw default: commands breaksw endsw For details see the manual section for _c_s_h. C programmers should note that we use _b_r_e_a_k_s_w to exit from a _s_w_i_t_c_h while _b_r_e_a_k exits a _w_h_i_l_e or _f_o_r_e_a_c_h loop. A common mistake to make in _c_s_h scripts is to use _b_r_e_a_k rather than _b_r_e_a_k_s_w in switches. Finally, _c_s_h allows a _g_o_t_o statement, with labels look- ing like they do in C, i.e.: loop: commands goto loop _3._8. _S_u_p_p_l_y_i_n_g _i_n_p_u_t _t_o _c_o_m_m_a_n_d_s Commands run from shell scripts receive by default the standard input of the shell which is running the script. This is different from previous shells running under UNIX. It allows shell scripts to fully participate in pipelines, but mandates extra notation for commands which are to take inline data. Thus we need a metanotation for supplying inline data to commands in shell scripts. As an example, consider this script which runs the editor to delete leading blanks from the lines in each argument file March 11, 1984 - 9 - % cat deblank # deblank -- remove leading blanks foreach i ($argv) ed - $i << 'EOF' 1,$s/|^[ ]*// w q 'EOF' end % The notation `<< 'EOF'' means that the standard input for the _e_d command is to come from the text in the shell script file up to the next line consisting of exactly `'EOF''. The fact that the `EOF' is enclosed in `'' characters, i.e. quoted, causes the shell to not perform variable substitu- tion on the intervening lines. In general, if any part of the word following the `<<' which the shell uses to ter- minate the text to be given to the command is quoted then these substitutions will not be performed. In this case since we used the form `1,$' in our editor script we needed to insure that this `$' was not variable substituted. We could also have insured this by preceding the `$' here with a `\', i.e.: 1,\$s/|^[ ]*// but quoting the `EOF' terminator is a more reliable way of achieving the same thing. _3._9. _C_a_t_c_h_i_n_g _i_n_t_e_r_r_u_p_t_s If our shell script creates temporary files, we may wish to catch interruptions of the shell script so that we can clean up these files. We can then do onintr label where _l_a_b_e_l is a label in our program. If an interrupt is received the shell will do a `goto label' and we can remove the temporary files and then do an _e_x_i_t command (which is built in to the shell) to exit from the shell script. If we wish to exit with a non-zero status we can do exit(1) e.g. to exit with status `1'. _3._1_0. _W_h_a_t _e_l_s_e? There are other features of the shell useful to writers of shell procedures. The _v_e_r_b_o_s_e and _e_c_h_o options and the related -_v and -_x command line options can be used to help trace the actions of the shell. The -_n option causes the March 11, 1984 - 10 - shell only to read commands and not to execute them and may sometimes be of use. One other thing to note is that _c_s_h will not execute shell scripts which do not begin with the character `#', that is shell scripts that do not begin with a comment. Similarly, the `/bin/sh' on your system may well defer to `csh' to interpret shell scripts which begin with `#'. This allows shell scripts for both shells to live in harmony. There is also another quotation mechanism using `"' which allows only some of the expansion mechanisms we have so far discussed to occur on the quoted string and serves to make this string into a single word as `'' does. March 11, 1984 - 11 - _4. _O_t_h_e_r, _l_e_s_s _c_o_m_m_o_n_l_y _u_s_e_d, _s_h_e_l_l _f_e_a_t_u_r_e_s _4._1. _L_o_o_p_s _a_t _t_h_e _t_e_r_m_i_n_a_l; _v_a_r_i_a_b_l_e_s _a_s _v_e_c_t_o_r_s It is occasionally useful to use the _f_o_r_e_a_c_h control structure at the terminal to aid in performing a number of similar commands. For instance, there were at one point three shells in use on the Cory UNIX system at Cory Hall, `/bin/sh', `/bin/nsh', and `/bin/csh'. To count the number of persons using each shell one could have issued the com- mands % grep -c csh$ /etc/passwd 27 % grep -c nsh$ /etc/passwd 128 % grep -c -v sh$ /etc/passwd 430 % Since these commands are very similar we can use _f_o_r_e_a_c_h to do this more easily. % foreach i ('sh$' 'csh$' '-v sh$') ? grep -c $i /etc/passwd ? end 27 128 430 % Note here that the shell prompts for input with `? ' when reading the body of the loop. Very useful with loops are variables which contain lists of filenames or other words. You can, for example, do % set a=(`ls`) % echo $a csh.n csh.rm % ls csh.n csh.rm % echo $#a 2 % The _s_e_t command here gave the variable _a a list of all the filenames in the current directory as value. We can then iterate over these names to perform any chosen function. The output of a command within ``' characters is con- verted by the shell to a list of words. You can also place the ``' quoted string within `"' characters to take each March 11, 1984 - 2 - (non-empty) line as a component of the variable; preventing the lines from being split into words at blanks and tabs. A modifier `:x' exists which can be used later to expand each component of the variable into another variable splitting it into separate words at embedded blanks and tabs. _4._2. _B_r_a_c_e_s { ... } _i_n _a_r_g_u_m_e_n_t _e_x_p_a_n_s_i_o_n Another form of filename expansion, alluded to before involves the characters `{' and `}'. These characters specify that the contained strings, separated by `,' are to be consecutively substituted into the containing characters and the results expanded left to right. Thus A{str1,str2,...strn}B expands to Astr1B Astr2B ... AstrnB This expansion occurs before the other filename expansions, and may be applied recursively (i.e. nested). The results of each expanded string are sorted separately, left to right order being preserved. The resulting filenames are not required to exist if no other expansion mechanisms are used. This means that this mechanism can be used to generate argu- ments which are not filenames, but which have common parts. A typical use of this would be mkdir ~/{hdrs,retrofit,csh} to make subdirectories `hdrs', `retrofit' and `csh' in your home directory. This mechanism is most useful when the com- mon prefix is longer than in this example, i.e. chown root /usr/{ucb/{ex,edit},lib/{ex?.?*,how_ex}} _4._3. _C_o_m_m_a_n_d _s_u_b_s_t_i_t_u_t_i_o_n A command enclosed in ``' characters is replaced, just before filenames are expanded, by the output from that com- mand. Thus it is possible to do set pwd=`pwd` to save the current directory in the variable _p_w_d or to do ex `grep -l TRACE *.c` to run the editor _e_x supplying as arguments those files whose names end in `.c' which have the string `TRACE' in them.* __________________________ *Command expansion also occurs in input redirected with March 11, 1984 - 3 - _4._4. _O_t_h_e_r _d_e_t_a_i_l_s _n_o_t _c_o_v_e_r_e_d _h_e_r_e In particular circumstances it may be necessary to know the exact nature and order of different substitutions per- formed by the shell. The exact meaning of certain combina- tions of quotations is also occasionally important. These are detailed fully in its manual section. The shell has a number of command line option flags mostly of use in writing UNIX programs, and debugging shell scripts. See the shells manual section for a list of these options. __________________________ `<<' and within `"' quotations. Refer to the shell manual section for full details. March 11, 1984  - 4 - _A_p_p_e_n_d_i_x - _S_p_e_c_i_a_l _c_h_a_r_a_c_t_e_r_s The following table lists the special characters of _c_s_h and the UNIX system, giving for each the section(s) in which it is discussed. A number of these characters also have spe- cial meaning in expressions. See the _c_s_h manual section for a complete list. Syntactic metacharacters ; 2.4 separates commands to be executed sequentially | 1.5 separates commands in a pipeline ( ) 2.2,3.6 brackets expressions and variable values & 2.5 follows commands to be executed without waiting for completion Filename metacharacters / 1.6 separates components of a file's pathname ? 1.6 expansion character matching any single character * 1.6 expansion character matching any sequence of characters [ ] 1.6 expansion sequence matching any single character from a set ~ 1.6 used at the beginning of a filename to indicate home directories { } 4.2 used to specify groups of arguments with common parts Quotation metacharacters \ 1.7 prevents meta-meaning of following single character ' 1.7 prevents meta-meaning of a group of characters " 4.3 like ', but allows variable and command expansion Input/output metacharacters < 1.5 indicates redirected input > 1.3 indicates redirected output Expansion/substitution metacharacters $ 3.4 indicates variable substitution ! 2.3 indicates history substitution : 3.6 precedes substitution modifiers |^ 2.3 used in special forms of history substitution ` 4.3 indicates command substitution Other metacharacters # 1.3,3.6 begins scratch file names; indicates shell comments - 1.2 prefixes option (flag) arguments to commands % 2.6 prefixes job