Only the default shell, the C shell, will be discussed here, with minor references to the differences between it and the other commonly used shell, the Bourne shell. The default prompt for the C shell is %; for the Bourne shell it is $. The command logout will terminate a C shell; to terminate the Bourne shell, use ^D (CTRL-D).
The UNIX commands described earlier (ls, cat, cd, pwd, history, ...) can all be entered at the C shell prompt, but the shell does more than just pass on acceptable commands to the system for execution. It also provides services described earlier such as expanding wildcards and redirecting standard input and output.
This chapter will describe three other functions the C shell provides: a history mechanism for manipulating commands; a set of process control facilities; and a C-like programming capability. Section 7.5 briefly mentions some topics that are useful but deserve more discussion than we can afford here. More detailed information on the C shell can be found in the online manual pages for csh.
Just as you can recall the last command via !!, you can recall any recent command by issuing !n, where n is the history number of the command to be recalled. Another convenient method of recalling a command is to use ! followed immediately by a string of characters that begins the desired command. You need to enter only enough characters to distinguish it in the history list.
For example, suppose history lists your recent commands as:
1 vi five.chickens
2 cat five.chickens | more
3 ls
4 mv five.chickens test.c
5 cc test.c
6 cflow test.c
At this point the two characters "!l" (exclamation point and the letter "l") or "!3" (exclamation point and the number three) will repeat the ls command, and "!v" will restart the most recent vi editor command. The shell searches backward from the most recent command and stops at the first match, so "!c" will repeat the cflow command. Two ways of repeating the cat command are to enter "!2" or "!ca".
Command lines are broken by the shell into individual words--the command proper and an argument list. For example, ls -l * contains the command ls and two arguments, -l and *.
You can see how the shell interprets a command without actually executing it using echo, which types back its argument list showing the shell expansion. The command echo ls -l * types back the first two arguments literally and shows the shell expansion of * into the appropriate file list.
% echo ls -l *
ls -l Mail five.chickens printit.doc test.c vincent.lat
Errors made while entering commands can be corrected on the next line by use of a shell string replacement feature. The characters to be changed and the characters replacing them are entered surrounded by substitution markers, the caret (^). Most typing errors are easily fixed:
% cay five.chickens | more
cay: Command not found
% ^cay^cat^
Strings to be replaced can occur anywhere in the original command. The replacement string does not have to be the same length. Neither string may contain spaces. Replacement is not limited to correcting errors. Here is how to edit and print a file on the default printer with a minimum of work.
% vi five.chickens
% ^vi^lpr^
Variable Name    Value
--------------   -------------------------------
argv             ()
cwd              /home/user
history          20
home             /home/user
interactive
noclobber
path             (. /home/user/bindec /usr/vincent /usr/local/bin /usr/new /usr/bin/X11 /usr/ucb /bin /usr/bin)
prompt           %
shell            /bin/csh
status           0
term             dec--vt100
user             user
--------------   -------------------------------
Table 7.1: Shell Variables
Alternate forms of the set command are used to set the values of variables, the form depending on the number of values assigned to the variable. If the variable holds no value, the form is simply set variablename. This form was used to set the existence of noclobber and interactive.
Single values are assigned by set variablename=value. To change the number of commands stored on the history list to 25, enter set history=25.
Array variables have values assigned to specific positions in the array. The initial position in the array is numbered "1". Look at the "path" variable shown in the list. This is an array variable whose values are all directory names. (A path is a list of directories that the shell searches through to find and execute the file corresponding to the command you have entered.) The initial position of this variable, path[1], has the value ".", meaning the current directory. To set the values of an array variable, separate the values by spaces and enclose the list in parentheses.
set trail=(. 1 /yellow/road)
Care must be taken not to put spaces on either side of the = in the set command. Depending on which side(s) the spaces occur, you could get an error message, have the variable exist with no value, or take on the correct value. It is best to avoid potential trouble by never putting extra spaces in a set command.
Sometimes within a shell, there are processing difficulties because of ambiguity as to where a variable name ends. In these situations, surround the name with braces and the problem disappears.
% set food=cheese
% echo $foodcake
foodcake: Undefined variable.
% echo ${food}cake
cheesecake
To remove a variable, use the unset command (as in unset variablename).
The individual words on the command line are each stored in variables that you can reference. There are two separate referencing conventions. In the first, the command itself is named $0, and each argument is numbered sequentially from there, $1, $2, $3, and so on. $# holds the total number of arguments. This convention has a limit in that names stop at $9. There is a shift command that can be used to work around this limitation but we will not discuss it here. In the command:
ls -l five.*
$#=2, $0=ls, $1=-l, $2=five.*, and $3 through $9 have no value.
The other referencing convention uses the shell array variable argv. This variable was designed to provide more flexibility in referring to commands and arguments than the simple numbering scheme shown above. It is not available in the Bourne shell. See the manual pages for more information on argv.
A last note about variables that reference command line arguments: you can refer to the last word of the last command with the history reference !$. This is useful in situations where the last word of one command is a name that the next command will manipulate.
% cp /tmp/dummy/local/example.tex example.tex
% vi !$
% lpr !$
Each of the three types of quotation marks has a specific function defined in the C shell.
% date
Sat July 21 1:36:01 CDT 1990
% set now="`date`"
% echo "It is $now"
It is Sat July 21 1:36:31 CDT 1990
% echo 'What time is it $now ?'
What time is it $now ?
The previous section hints at a powerful technique in UNIX, command substitution. The general idea is to set a variable to the output of one command and substitute that variable in other commands. Suppose you want to copy all the files from a recent robotics project to a different directory and print them. You had the foresight to include the string robot in the text of all these files. That allows grep -l, which lists the names of the files containing a match, to locate the files of interest. By storing the output of grep in a variable, you can accomplish both tasks without ever typing a filename.
% set filelist=(`grep -l 'robot' *`)
% cp $filelist /projects/robotics
% lpr $filelist
Many characters which are neither letters nor digits have special significance to either the shell or to UNIX. These characters are called metacharacters and must be quoted when used in commands in order to use them without their special meaning. For example, ";" is used by the shell to separate two commands on one input line. You can override the usual meaning of these special characters by prefixing them with a backslash, "\".
% ls
the.good the.bad and.the;ugly
% mv and.the\;ugly and.the.ugly
% ls
the.good the.bad and.the.ugly
UNIX executes processes. You can learn about the processes that are being executed by issuing the process status command ps. Two pieces of information that will be shown are the command name and the process identification number, or PID. Other information is available by including options (such as -uax on DEC ULTRIX or -aef on HP-UX, DEC UNIX and Solaris) to get full information on all processes.
One process may generate another. The old process is called the parent and the new process is called the child. Your login process is a child of init, a process that executes continuously. Each command you enter starts a new process.
The system performs several initiation steps whenever it starts a new shell. One of these steps is to always execute the commands in .cshrc and, if the process is logging in a new user, to execute the commands in .login. The role played by these files is covered in Section 7.4.
Processes or jobs are normally running in the foreground of your interactive session. You may, however, start processes in the background by adding the & character to the end of the command line. You could then continue entering interactive commands. For example, the command cc test.c & will start compiling a C program named test.c in the background. If you try this with your own C program, you will find that you get the prompt back and can continue working while the compilation progresses. Messages, however, are sent to the terminal and can interfere with what you are doing. To prevent this, redirect the output of background processes into files. Try cc test.c >& out.tmp & to see the difference.
The jobs command will show the status and job number of jobs either running in the background or suspended. You can suspend a job running in the background by entering stop %jobnumber (with no space after the %). Or you can bring a job back into the foreground by entering fg %jobnumber.
Processes may be started in the foreground, stopped, and placed in the background. Pressing CTRL-Z suspends the foreground job. Entering bg %jobnumber starts the suspended job executing in the background.
Logging out does not kill jobs that are stopped or running in the background. Jobs and processes are deleted from the active list by using the kill command.
To kill a job, enter:
% kill %jobnumber
If that doesn't work, try:
% kill -9 %jobnumber
To kill a process, enter:
% kill PID
If that doesn't work, try:
% kill -9 PID
At times it is inconvenient to give interactive commands to the shell. This is especially true when you have a set of commands that you repeat periodically or that you find difficult to key in accurately. It would be a timesaving device to key them in just once. The C shell has the ability to read and execute a series of commands from a file. These files of shell commands are known as shell files, shell procedures, or shell scripts.
This section will discuss the basic facts about script execution, the programming features available inside scripts, and how environment settings are passed down to scripts.
Scripts can be executed by three distinct methods:
1. csh scriptname
2. source scriptname
3. scriptname
The first method starts another copy of the C shell in which to execute your script. Changes the script makes to the environment, such as resetting the current working directory, will not be maintained when the script finishes.
The second method executes the script inside the current shell. All changes are still in effect when the script finishes just as if the commands had been entered interactively.
The third method requires using the chmod command to mark the shell script as an executable file. Then the file name may be entered just as if it were a shell command. Since commands are executed as separate programs, this has the same benefit as method one in protecting the current environment from some unwanted side effects.
To mark the script as executable by the owner, enter: chmod u+x scriptname.
# 'This script shows the effects of the methods described above.'
# 'Execute it from your home directory by the csh method. Check'
# 'pwd. Compare the pwd value that was printed by the script.'
# 'If necessary, use cd to return to your home directory,'
# 'and remove the dummy directory by rmdir ~/dummy.'
# 'Then repeat with the other two methods.'
mkdir ~/dummy
cd ~/dummy
pwd
Using the -v or -x option on the csh command causes the shell to print each line of the script as it is executed. This is a quick way to trace the execution of a script and is a great aid to debugging.
You can also include either or both of the following lines at the beginning of your shell script:
set verbose
set echo
Some scripts have CTRL-c disabled, so if you really wish to stop such a process, you can issue a "sure kill": the command kill -9 PID, and the process will die.
Yet there are situations where a newly-created shell must inherit information about its environment. For example vi must know if TERM is set in order to function properly as a full-screen editor. The setenv command is used in place of set to create variables that are not local to the shell that creates them. Variables created by setenv, sometimes referred to as export variables, are available to all descendants of the creating shell. The syntax of setenv varies from set. Here are some examples:
set                                   setenv
-----------------------------------   -----------------------------------
set path = (dir1 dir2 dir3)           setenv PATH dir1:dir2:dir3
set term = xterms                     setenv TERM xterms
set                                   printenv
-----------------------------------   -----------------------------------
"set" vs "setenv" syntax
You can print all the environment variables created by setenv, and some of the variables exported by the C shell, with the command printenv.
The C shell derives its name from the fact that it looks so much like the C programming language. Here is a brief overview of some of the programming-like commands available. These are normal C shell commands but are rarely seen outside scripts.
There are two ways of providing data inside scripts. The first is to read it directly from the standard input: set variablename = $<. This is commonly used to allow the script user to provide simple data values interactively.
The second is to provide data lines inside the script. This is normally done to provide a block of data larger than that needed by a simple variable. This method uses a marker, 'EOF', to signify that the end of data has been reached. The 'EOF' ending marker must be alone on the termination line.
Suppose you have developed a program, dcyc, for your poultry research that takes as input the number of eggs laid and projects the number of hatchlings. As part of fully documenting this information, you also input some auxiliary information for the final report. Since you do this many times a day you have developed a script file, bth, that feeds data into the program. By using the script file and recalling history commands, you can quickly run the program with the correct data.
# 'bth: A script file to provide data to dcyc'
dcyc << 'EOF'
Cyclone Hatchery
Report date: July 26, 1990
Eggs laid: 731
'EOF'
Note that the script begins with the first character #, and the rest of this comment line is protected from replacement by apostrophes. The input for dcyc is to be read directly from the input file and includes all the lines up to the next 'EOF'.
More complicated program flow control structures are possible inside scripts than you would normally use interactively. The C shell supports the familiar goto and the lesser known && and ||, used to execute commands if the previous command succeeds or fails. It also has if/then, if/then/else and switch/endsw constructs. Each of these structures is described in this section.
The simplest control structure you can use in the C shell is to use labels and goto statements. Any string of identifying characters that begins a command line is treated as a label when followed by a colon (:). The syntax of the goto statement is demonstrated in the following example. Note that the colon identifies error as a label but the colon is not part of the label.
    .
error:
    .
goto error
    .
It is helpful to know about the status variable before discussing more complicated structures. Commands submitted to the shell either succeed or fail. If they fail, you receive some error message; if they succeed, they perform the requested operation and perhaps you receive some output on the screen. Internally the shell keeps track of success and failure in a variable named status. When a command succeeds, the shell sets the value of status to 0. Any other value indicates some type of failure.
The symbols && and || are used to join two commands. The shell executes the first command and stores its status. The second command is then executed conditionally. If the first command succeeds, && says to execute the second. If the first command fails, || says to execute the second. With any other combination, the second command is ignored.
Suppose the bth script described above always produces an output file named report. If everything performs as expected, this report should be printed. If the script fails for any reason, the report should be deleted. This can be done with one command:
(bth && lpr report) || rm report
The parentheses in this example are used to group the script and printing commands into one unit so that both && and || are joining only two commands. We will trace the execution of these commands, first assuming the script succeeds, then assuming the script fails.
If the bth script succeeds, status is set to 0, and the report is printed. This leaves the status 0, so the remove command (rm) is ignored.
On the other hand, if the script fails, status is not 0, and the print request is ignored. This leaves the status unchanged, so the remove command (rm) is executed.
In more complicated situations it may be necessary to execute a group of commands based on some condition. This can be done using an if/then control structure. The syntax is:
if <condition> then
    <list of commands>
endif
If the condition determines a choice between alternate groups of commands then you would use an if/then/else instead.
if <condition> then
    <list of commands>
else
    <list of commands>
endif
Both constructs require that then appear on the same line as if. In the Bourne shell then appears on the first line of commands to execute, and endif is replaced by fi.
It is possible to nest if/then/else statements. One good way to organize nested if/then statements is:
if <condition1> then
    <list of commands1>
else if <condition2> then
    <list of commands2>
else if <condition3> then
    <list of commands3>
    .
    .
    .
endif
The condition described above can be a numeric test, a string test, or a file test. Tests can be combined with the && or || operators, with the order of evaluation dictated by grouping the tests in parentheses. A condition can also be negated by writing !<condition>. If the condition normally evaluates to true, the negation evaluates to false. If the condition normally evaluates to false, the negation evaluates to true.
When num1 and num2 are numbers or numeric variables, the numeric tests and the logical results are:
num1 == num2    True if num1 is equal to num2.
num1 != num2    True if num1 is not equal to num2.
num1 < num2     True if num1 is less than num2.
num1 <= num2    True if num1 is less than or equal to num2.
num1 > num2     True if num1 is greater than num2.
num1 >= num2    True if num1 is greater than or equal to num2.
When string1 and string2 are quoted strings or string variables, the string tests and the logical results are:
string1 == string2    True if string1 is equal to string2.
string1 != string2    True if string1 is not equal to string2.
string1 =~ string2    True if string1 matches the metacharacter pattern specified by string2.
string1 !~ string2    True if string1 does not match the metacharacter pattern specified by string2.
When filename is a file name, the legitimate tests and the logical results are:
-e filename    True if the file exists.
-r filename    True if the file is readable.
-w filename    True if the file is writable.
-x filename    True if the file is executable.
-o filename    True if the user owns the file.
-z filename    True if the file is empty.
-d filename    True if the file is a directory.
-f filename    True if the file is not a directory.
Returning to the chicken and egg example discussed before, suppose you wanted to automate the report printing step as well as design the script to cover nasty possibilities that would destroy existing reports or leave behind unnecessary files. With the tools just described it is possible to design a new improved version of the bth script that will not execute dcyc if the report file exists. When dcyc creates a bad (empty) report, it is automatically removed. Only good reports are printed, and then the report file is deleted to conserve space.
# 'bth2: An improved script file to provide data to dcyc,'
# 'and ensure that the report files are handled cleanly.'
if ((-e report && -d report) || (-e report && -f report)) then
    echo "There is a directory or file named report!"
else
    dcyc << 'EOF'
Cyclone Hatchery
Report date: July 26, 1990
Eggs laid: 731
'EOF'
    if (-e report && -z report) then
        echo "An empty report was generated! I will remove it."
        rm report
    else
        (lpr report && rm report)
    endif
endif
Since any existing object named report would normally be a file or directory, the condition in the first if statement could be condensed to (-e report). You may or may not decide to simplify scripts once they are written. Often trying to simplify one section causes unforeseen problems in another. The important thing is to make them as easy to understand as possible.
Comments are one means of making scripts easy to follow. Indentation is another. Note the nested if/then/else above. The indentation clearly shows that it is inside the else portion of the outer if/then/else structure. It also allows you to quickly match an endif to the appropriate if. The shell will always pair an endif with the last unpaired if, leading to unpredictable results if one has been inadvertently omitted.
The final flow control construction covered here is the switch/endsw statement. It tests the value of a variable against several strings to find a match. The comparison may involve wildcards, * and ?, and character classes, [-]. One case is selected from a list of cases and the group of commands related to that case is executed.
The group of commands to be executed is terminated by a breaksw command. Omitting the ending breaksw allows execution to fall through to the next case. This is a common bug in writing switch statements, and some programmers use it as a feature. The default group of commands is executed if the switchvar does not match any of the test strings. The syntax of this command is:
switch (<$switchvar>)
    case string1:
        <list of commands>
        breaksw
    case string2:
        <list of commands>
        breaksw
    .
    .
    .
    default:
        <list of commands>
        breaksw
endsw
The C shell also provides two looping mechanisms. The first mechanism is the conditional loop, implemented by the while statement, which repeatedly executes a list of commands while some condition is true. The condition is constructed following the same rules as for conditions in an if/then statement. The syntax is:
while <condition>
    <list of commands>
end
Care must be taken to ensure that one of the commands in the list eventually makes the condition false, providing an exit for the conditional loop. In case of infinite looping you can regain control by entering ^C (CTRL-c).
The second looping mechanism is the iterative loop, implemented by the foreach statement, which executes a list of commands while some control variable takes on each value from a specific list of values. The syntax is:
foreach <controlvariable> (control-list)
    <list of commands>
end
The control list can be a list of numeric or string values, a file list or even the command's argument list represented as $argv[*]. There is little danger of infinite looping with this construct since the number of times the list of commands can be executed is limited by the number of items in the control list.
Shells can perform simple integer arithmetic. The operations are limited to addition, subtraction, multiplication, integer division and a remainder function. The operands must be numeric constants or numeric variables. All mathematical operations must be signaled to the shell by prefixing the command with an at sign (@). Table 7.2 shows the syntax of some of the various operations.
Operation         Syntax
---------------   -----------------------
addition          @ num1=num2 + num3
subtraction       @ num1=num2 - num3
multiplication    @ num1=num2 * num3
division          @ num1=num2 / num3
remainder         @ num1=num2 % num3
---------------   -----------------------
Table 7.2: Arithmetic Operations
There is a standard method for naming these temporary files that ensures your files will have unique names inside /tmp. The name is obtained by concatenating the name of your script file and your PID. The name of your script file is stored in the variable ${0}, and the PID is in $$, so ${0}$$ translates into the name we are after. The full specification for the file is /tmp/${0}$$. Remember this is the name of a temporary file to be used inside a script.
If you wish to throw away output without seeing it, for example, during the initial development phase of some project when only garbage is being produced, you can redirect it into the special file /dev/null.
Standard UNIX documentation suggests customizing your environment by modifying your .login and .cshrc files.
These files contain shell commands but are not scripts to be executed. They don't need to be made executable. You need to be very careful when modifying these files. If you make any mistakes modifying your .cshrc or .login files, you may not be able to log back in to your account.
One of the most popular customizations is aliasing, a technique that allows users to define their own names for commands. The syntax of the command is:
alias aliasname <any command and option list>
For example,
alias rm "rm -i"
Each window maintains its own shell process, so to make your aliases available in all of them, your .cshrc must contain your alias commands.
If you are an experienced user on some other system, you might alias command names you are more familiar with to their UNIX equivalents. Type alias to see all your current aliases.
You can change your default shell by entering the command chsh followed by the location of the shell you wish to use. You can pick the Bourne shell by entering: chsh /bin/sh. The C shell is in /bin/csh; the System V shell is in /bin/sh5.
There are many other features available that require more space to discuss than we can devote here. Some of these features are standard parts of UNIX. Others, like make, are enhancements that have become so prevalent that they are now regarded as standard. Several of these additional features are mentioned here; these are more fully documented in man or in printed documentation.
There is a utility called make that manages a group of files that together produce a master file. The files may be source code files that must be compiled and linked together to form an executable file, or they may be sections of a large document that you periodically update. You describe the dependencies in a file called a makefile. If the files were source code, the makefile would contain the commands to create the executable version. After modifying one or more of the files you issue the command make (or make -f followed by the file name, if your makefile has a different name), and make will perform the minimal work required to produce the required master file.
There is also a touch command that makes no changes to a file other than updating its last modification time. This is often used in conjunction with make to force the updating of certain files that make ordinarily would not update.
You can control how and if metacharacters are expanded with a variable named noglob.
Several utilities are available to assist in writing documents. Among these are spell, style, diction and explain. The spell utility checks documents for misspelled words and is the most useful of the group. One use for multiple windows is to let spell check a document in one window while you make corrections in another.
The style utility rates the readability of your document; diction finds words and phrases that are often misused or overused; explain suggests alternatives for the words and phrases diction finds. A natural use for three windows is one each for an editor, diction, and explain.
If you set "noclobber" in your shell environment, then in theory you should not be able to overwrite existing files without being warned. For example, if file2 already existed, and you executed the following command:
% cp file1 file2
You should be prompted with:
overwrite file2?
However, you should not rely on noclobber to protect existing files from being overwritten.