Shells, Or How I Learned to Stop Worrying and Get Work Done
------------------------------------------------------------

What is a Shell?
================

In Unix, a shell is a command line interface that takes input in the form of commands from you, the user, and transforms it into a form that ``Unix'' can understand, and likewise takes the output that Unix returns and turns it into something reasonably understandable to the user. A good description of a shell comes from zsh(1):

    Zsh is a Unix command interpreter (shell) usable as an interactive
    login shell and as a shell script command processor ...

In the Unix world, there are several shells that are generally available on any given system. These are the Bourne Shell, /bin/sh, the Berkeley C Shell, /bin/csh, and the Korn Shell, /bin/ksh. There are also other shells that, while not always part of the standard OS distribution, are freely available and readily installed; these include the Tenex C Shell, tcsh, the Bourne Again Shell, bash, and the Z Shell, zsh.

Shells have a number of features and characteristics that may make each well suited to some purposes and less well suited to others. All feature some sort of built-in programming syntax similar to that seen in high-level programming languages, command aliasing, and filename ``globbing''. Other shells provide command and filename completion, job control, command line editing, command history, and other ``generally useful thingies''.

Shells based on /bin/sh tend to provide a strong and powerful programming syntax with many of the features that are considered necessary to produce structured programs. These shells especially have very strong input and output handling capabilities, something lacking in shells based on csh. However, they have earned a reputation -- especially /bin/sh itself -- for providing a very weak interactive environment with few concessions to the user at the command line. Shells such as /bin/sh, ksh, and bash fall into this category.

Shells based on /bin/csh provide a very powerful interactive environment that tends to reduce the amount of typing a person needs to do to accomplish a certain amount of work. In addition, the built-in programming syntax is much more like the C programming language than that of /bin/sh. But at the same time, the programming syntax -- while easy to learn -- is not as powerful as that of /bin/sh based shells, especially in the area of input and output handling. This weakness will be examined later on. Shells such as /bin/csh and tcsh fall into this category.

As time goes on, the difference between /bin/sh based shells and /bin/csh based shells gets smaller as features are borrowed and integrated. Bash, while a Bourne compatible shell, borrows some features from csh, and zsh, while also a Bourne based shell, incorporates features from csh as well.

===============
C Shell Basics:
===============

Commands
========

At the most basic level, all shells are alike for the most part. You type one command, press enter, wait, and you get something back or some result. And you do this again and again until you're done. In most shells, you can also combine commands. So instead of something like this:

% date
% cal 10 1996
% finger root@csua

you could join commands with semicolons to put it all on one line:

% date; cal 10 1996; finger root@csua

In csh you can also try stuff like this:

% (cd work/helper/inter; ls; head handout.txt)

This has the effect of cd'ing into the directory work/helper/inter, running the rest of the command line, and then returning you to the directory you started from. The use of parentheses also has other effects on how command output is handled. More on that later.

File "Globbing"
===============

Sometimes, you will have a lot of files that you want to do something to all at the same time, like this:

% grep X11 ansihead.mak bc.mak bcflags.mak bclib.mak bcwin.mak ...

That is a lot of typing to do and you might end up making some typing mistakes. You could instead try this:

% grep X11 *.mak

The "*" character, called "star", can be used in the shell to represent any number of characters, including zero characters. The "?" does something similar, except it can only be used to represent exactly one non-null character. Brackets "[]" can be used to represent ranges or lists of characters. Here are some examples and other ways to "glob" filenames:

% ls .??*
% ls book-[a-z]
% ls [Cc][Hh][Aa][Pp][Tt][Ee][Rr]*
% mkdir lab.{1,2,3,4,5,6}

That last one is actually a little different. You can use the braces to specify filenames that do not exist yet. If you tried a similar command with brackets instead, and the files did not exist yet, you would get an error message:

% mkdir lab.[1-6]
mkdir: No match.
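To make the difference between "*", "?", and the brackets a bit more concrete, here is a small made-up example. Suppose a directory held the files chapter1.tex, chapter2.tex, chapter10.tex, and notes.txt (invented names, just for illustration):

% ls chapter?.tex
chapter1.tex    chapter2.tex
% ls chapter*.tex
chapter1.tex    chapter10.tex   chapter2.tex
% ls *.t*
chapter1.tex    chapter10.tex   chapter2.tex    notes.txt

The "?" matches exactly one character, so chapter10.tex is left out of the first listing; the "*" matches any number of characters, so the second listing picks up all three chapters; and the last pattern matches anything with a ".t" in it, which sweeps in notes.txt as well.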
Output Redirection:
==================

Now, as you can see, when you execute certain commands, you sometimes get some output from the command. For instance, the output of the ls command is the listing of all the files in the directory that you specify. Sometimes, you find that you want to save the results of your work into a file. You can do that with output redirection:

% grep csh /etc/passwd > csh-users

This will look for lines containing the text pattern csh in the file /etc/passwd and save the results to the file csh-users. There is also a way to append to a file that already exists:

% wg jon >> slander.logs

This will take the output from the command wg jon, a small command that I wrote to search for mention of my name (or rather the pattern jon) in a log file, and append it to the end of the file slander.logs.

Now, output redirection can be a dangerous thing. Here is an example:

% grep .\*:.\*:.\*:.\*:.\*:.\*/bin/tcsh /etc/passwd > csh-users

This looks for the pattern tcsh in the file /etc/passwd and saves the output to csh-users. But I meant to type tcsh-users, not csh-users. OOPS. I have now overwritten the contents of csh-users with the output of that command up there.

Here is another example of output redirection gone amok:

% cat book book.1 book.2 book.3 > book

This is a little more subtle. As csh processes the command, it sees that you are about to write to the file called book, so it clears it out to prepare it to be written to. So the file book is emptied out before the command cat ever gets to look at its contents, and all cat sees is an empty file. This is not usually something to be desired.

Now look at this one:

% wg jon >> slander.slogs

I meant to type slander.logs instead. But now, instead of appending the output from wg to slander.logs, it is deposited in the once non-existent file slander.slogs. If I don't catch my typing mistake, I might not notice that I saved that output there and might never see the wonderful things that people might say about me.
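Going back to the ``cat ... > book'' example for a second: if what you really wanted was the other three files tacked onto the end of book, one common way around the trap is to write the combined output somewhere new and only move it into place afterwards, or to skip the problem entirely by appending. A quick sketch, with the same made-up filenames:

% cat book book.1 book.2 book.3 > book.new
% mv book.new book

or just

% cat book.1 book.2 book.3 >> book

Either way, the shell never has to empty out a file that cat is still trying to read.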
In order to prevent this mayhem, csh and tcsh (and I assume other shells) provide a means by which you can protect yourself from your own mistakes:

% set noclobber

This will prevent you from ``clobbering'' your own files as well as keep you from accidentally creating unintended files. Noclobber is what is called a shell variable. Shell variables are variables that can be set, sometimes to contain a value, and that only have meaning to the shell. There are many shell variables as well as environment variables; I will talk more about them later. If you ever need to override noclobber, you can try any one of the following:

% unset noclobber; some_command > existing_file; set noclobber

or

% some_command >! existing_file

or

% (unset noclobber; some_command > existing_file)

These work for file-append as well. Here's something to try:

% (date; cal 10 1996; finger root@csua) > root-and-calendar

Input
=====

In Unix, in addition to the concept of output, there is also input. You might do the following:

% Mail jon
Subject: An experience in multiple personalities
Hi
I think I will try seeing if a second personality will respond to this if I get this
....

If it is a short message, you can just type it in and end the mail with a ctrl-d. And even if it is long, you can just invoke an editor from most mailers. But what if you just happened to have a message written up already, maybe one you wrote on your PC or Mac at home and sent to your Unix account via modem or whatever? Then you could do something like this:

% Mail -s "An experience in multiple personalities" jon < message.upload

You can even combine output and input redirection in one command. Look at this one, and take a moment to think about what it does:

% compress < file.ps > file.ps.Z

Piping
======

Okay, say you do this:

% locate xpm
[LOTS of output]
[LOTS of output]
[LOTS of output]
[LOTS of output]

Okay, that may not have been too helpful; all that output was unexpected (maybe) and probably scrolled by way too fast to be read. So you save it to a file:

% locate xpm > xpm.found
[wait]

and read it with a pager:

% less xpm.found

Well, in many shells, certainly Bourne and C shell at least, you can do this instead:

% locate xpm | less

That bar there is called a ``pipe'', using the pipe this way is called ``piping'', and this whole command is called a ``pipeline''. The output of one command is used as the input of the next. You can create long pipelines consisting of multiple pipes:

% gzcat gcc.1.gz | nroff -man | sed -e 's,/usr/local/,/opt/local,' | less

Note that you can also pipe the output of a ``command group'' -- a string of commands enclosed in parentheses -- into another command, just as you can redirect the output of such a command group to a file:

% (date; uptime; make CC=gcc -j4) | tee MAKE.out | less

Here, I used a command called ``tee'' that sends its output to two places, in this case the file MAKE.out, and to standard output for piping to less.
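Pipelines like these start turning up everywhere once you get used to them. As one more illustration -- nothing here is specific to this handout, just stock commands -- this one counts how many accounts in /etc/passwd use each shell:

% cut -d: -f7 /etc/passwd | sort | uniq -c | sort -rn

cut pulls out the seventh colon-separated field (the login shell), sort groups identical lines together, uniq -c collapses each group into a single line with a count, and the final sort -rn puts the most popular shells at the top.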
Standard Output
===============

In Unix, there are at least a couple of flavors of output. The two main varieties are standard output and standard error. Standard output is what you would normally expect from a command. Standard error is best described as diagnostic output. For example:

% nslookup franklin.cs.berkeley.edu > franklin.DNS

and in that file, franklin.DNS, you will find the DNS info for franklin.cs. But if we try this next one, the result is unexpected:

% nslookup bogus.address.net > bogus.DNS
*** vangogh.CS.Berkeley.EDU can't find bogus: Non-existent domain

The error message is not redirected like the output we have seen. Unix separates the two, and if you do not say where to put each, they will both go to your terminal. Bourne shell allows you to treat each one independently, allowing output and error to be sent to two different places without too much trouble. C Shell, when told to redirect standard error, also redirects standard output to the same place. So while you can do this in Bourne shell:

$ nslookup franklin.cs 2> /dev/null 1> franklin.DNS

you would be challenged to do the same in C shell, as redirecting standard error would send the standard output to the same place:

% nslookup franklin.cs >& /dev/null

That would send EVERYTHING to /dev/null. There are ways around it, usually cumbersome. Some consider this part (and others) of C shell to be a bug, others a design flaw, and yet others an intended feature, depending on who you ask. If you have any truly bad flames, send them to Bill Joy, via anyone in the CS Dept Faculty.

Job Control and Processes
=========================

Background processes:

Remember this command?

% locate xpm > xpm.found
[wait wait wait wait wait]

What if you didn't want to wait for that to finish? Well, if you know that something you are about to do is going to take a long time, you can try this:

% locate xpm > xpm.found &
[1] 12872

The ampersand ``&'' will put the ``job'' in the background. That way, after you hit enter, the shell will run the job in the background and return you to the prompt without making you wait for the job to finish. You have to be careful of one thing though; make sure you know where the output is going to go:

% make &
[1] 13442
% start_typing_command
[LOTS of output from make]
< realization that something is amiss >

So be sure to direct any output to someplace where it won't get in the way:

% make >& MAKE.out &
%

Job Control:

But say you start a job and only realize after the fact that it's going to take a long time. You could interrupt it with ctrl-c and restart it in the background from the beginning, but then you would lose the work that had already been done on that job. When the C Shell first came out from Berkeley, one of its main features was something called ``job control,'' which allows you, the user, to have better control over your jobs. An example is in order:

% swwlocate .tex | grep -v text > tex.files
[ hmm, this is taking far longer than I care to wait ]
< ctrl-z >
Suspended
% bg
[1] swwlocate .tex | grep -v text > tex.files &
% [do some more stuff]
[1]    Done          swwlocate .tex | grep -v text > tex.files

The ^Z character will stop, or suspend, the job that you are currently running in the foreground and give you back the prompt. The ``bg'' command will then take the most recently stopped job and put it in the background, just as if -- almost -- it had been started in the background from the beginning. The ``fg'' command will foreground the most recently backgrounded or stopped job. Here is one way to use it:

% trn
[reading news ...]

Message from Talk_Daemon@godzilla.EECS.Berkeley.EDU at 2:58 ...
talk: connection requested by jon@godzilla-134.EECS.Berkeley.EDU.
talk: respond with:  talk jon@godzilla-134.EECS.Berkeley.EDU

^Z
Suspended
% talk jon@godzilla-134.EECS.Berkeley.EDU
[talk for a while ...]
% fg
[continue reading news where you left off]
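The same trick works for just about anything that takes over your terminal. A very common pattern -- sketched here with an invented filename -- is to pop out of an editor for a moment, run a quick command, and pop back in:

% vi chapter1.tex
[editing away ...]
^Z
Suspended
% spell chapter1.tex | head
[a short list of questionable words ...]
% fg
[back in vi, right where you left off]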
Now, you can have many jobs suspended or running in the background -- up to certain limits of course -- and job control can help you manage all those jobs. Let's say I started a lot of jobs and wanted to see what I had going on behind my back:

% jobs
[1]    Suspended     ftp ftp.cs
[2]  - Suspended     vi
[3]  + Suspended     Mail jon@csua
[4]    Running       make CC=gcc272 >& MAKE.out
[5]    Suspended     folders -rec | less
[6]    Suspended     nroff -man ./man/man1/cool.1 | less

Here, you see I have six jobs going, most of them suspended. The number in the left hand column is the job number, by which you can refer to the job when using any of the job control commands like bg, fg, kill, and stop. Kill does just that: it kills the job you tell it to. Stop will stop a job that is running in the background. Here are some examples:

% jobs
[1]    Suspended     ftp ftp.cs
[2]  - Suspended     vi
[3]  + Suspended     Mail jon@csua
[4]    Running       make CC=gcc272 >& MAKE.out
[5]    Suspended     folders -rec | less
[6]    Suspended     nroff -man ./man/man1/cool.1 | less
% kill %5
[5]    Terminated    folders -rec | less
% stop %4
[4]  + Suspended (signal)   make CC=gcc272 >& MAKE.out
% bg %1
[1]    ftp ftp.cs &
% fg %3
[continue writing mail to myself ...]

Also, instead of using %1 or %3, you can use %ftp or %folders or even %?man to refer to jobs. Those three refer to jobs 1, 5, and 6 respectively. And % or %% refers to the current job. There are even shortcuts for the bg and fg commands, by which you can use just the job specifier instead of all that typing; the second line of each pair below is the shorthand for the first:

    fg %      fg %3      fg %ftp      fg %?man
    %         %3         %ftp         %?man

    bg %      bg %3      bg %ftp      bg %?man
    %&        %3&        %ftp&        %?man&

Now, if you have stopped jobs (as opposed to jobs running in the background) and you try to logout, you will get something like the following:

% logout
There are stopped jobs
%

You have a few options. If you ask to logout again right away, the shell will exit and kill any stopped or suspended jobs still there. Pressing ctrl-d twice in a row will do the same thing, unless you do something to prevent ctrl-d from logging you out; more on that later. But that could kill editors, mail, and all sorts of things you might not want to kill just like that. You could instead deal with each stopped job one by one, and you can do this a number of ways. You can fg each stopped job and finish it up -- save files, send mail, post articles, quit ftp clients, ... -- and then exit. Or you can push certain jobs into the background if appropriate. In most shells, jobs that are running in the background can be left alone to finish even if you logout. So there is no need to wait at your workstation while your huge project builds over the length of an hour or more. So long as you can expect it to build nicely, you can theoretically logout, get dinner, and come back and log in again to see how things went.

And sometimes things go very amok. You might have started something -- whether by accident or by surprise is no matter -- that has run away from you and that you want to stop. You don't always have the luxury of being able to use job control, but you can use kill even without it. Some shells, like /bin/sh, don't have job control, yet the need to be able to kill things still exists. For that purpose, /bin/kill exists. It is a command that kills processes, which are slightly more specific things than jobs. And to kill a process, you need what is known as its process id, or PID. To get that, you use the ``ps'' command:

% ps -eaf | grep crazy_run_away_process
csua-lib 17666 17628  2 04:00:34 ttyt2    0:00 crazy_run_away_process
csua-lib 18423 17628  2 04:00:34 ttyt2    0:00 grep crazy_run_away_process
% kill 17666

Here, we used ps piped through grep to find the offending process and its PID, and then used ``kill'' to kill it. Most other shells have a kill built into the shell that has the same functionality. The standalone command /bin/kill exists mainly for shells, like /bin/sh, that do not have a builtin kill command.

If you read the man page for kill, you will see that you can kill processes with certain ``signals'' such as TERM, KILL, HUP, INT, and a whole bunch more. The kill command, by default, sends the TERM signal. But some processes will ``trap'' that signal and ignore it; shells often do this. You can kill those processes with the KILL signal. Very few processes should be able to ignore the KILL signal, and when one does, usually something else far more wrong is happening. You can use the same signals with the shell's builtin kill, if it has one, as with /bin/kill.
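As a sketch of how those signals get used -- reusing the made-up PID from above -- you might escalate like this:

% kill 17666
[sends TERM, the polite request to terminate]
% kill -HUP 17666
[hangs up on it instead]
% kill -KILL 17666
[the last resort; kill -9 means the same thing]

The shell's builtin kill understands job specifiers too, so something like ``kill -KILL %2'' works on job 2 without your having to look up its PID.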
Command History:
===============

Okay, ever find that you end up typing the same commands over and over again? Especially the long, hard-to-type, easy-to-make-a-typing-mistake-while-typing ones? C shell and others, like the Korn Shell, have something called command history, where the shell keeps track of what you have done and lets you reuse those commands. You can see that list with the ``history'' command:

% history | tail -10
   103   8:14    telnet pasteur.eecs 25
   104   8:14    telnet langmuir.eecs 25
   105   8:15    comp
   106   8:15    history
   107   8:15    finger root@godzilla.eecs
   108   8:15    rsh meeko.eeecs
   109   8:15    rsh meeko.eecs
   110   8:15    resh meeko.eecs
   111   8:16    ls
   112   8:16    history

Here are the last 10 commands that I issued, including all my typing mistakes. Now, by default, the shell doesn't keep track of this many commands, just the most recent one. But you can change that with the history shell variable:

% set history=N

and from then on, the shell will keep track of the last N commands.

So how do you use this? Well, you can repeat commands using a type of shorthand. You can repeat whole commands like so:

% !107
finger root@godzilla.eecs
[godzilla.eecs.Berkeley.EDU]
finger: connect: Connection refused
% !?meeko
resh meeko.eecs
Password:
[kill that rlogin ...]

But even if history is not set, you always have access to the previous command and can repeat it with ``!!''. You can also reuse parts of commands over and over:

% last jon
jon       ttyAG/ADGy   Thu Oct 17 02:40   still logged in
^C
% w !$
w jon
USER     TTY      WHERE                   LOGIN@    IDLE   WHAT
jon      AG/ADGy  burger.CSUA.Berkeley.   02:40AM     0s   nwrite psb
% Mail !$
Mail jon
....

There is also !* and !^. The use of these will be clear as I use them in the Help Session; otherwise refer to the little table here and to the short example right after it:

    !!      repeat the previous command
    !n      repeat command n
    !-n     repeat the nth-to-last command
    !str    repeat the last command starting with str
    !?str   repeat the last command containing str
    !$      last argument of the previous command
    !^      first argument of the previous command
    !*      all arguments of the previous command
    !!:n    nth argument of the previous command

(Note that the word or argument list starts at zero (0).)
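Here is a quick taste of a couple of those, with invented filenames. (Remember that csh echoes each expanded command before running it, which is why each command appears twice.)

% cp handout.txt handout.bak
% echo !*
echo handout.txt handout.bak
handout.txt handout.bak
% ls -l !^
ls -l handout.txt
[a long listing of just handout.txt]

The !* picked up both arguments of the cp, and the !^ picked up the first argument of the command before it -- which, by then, was the expanded echo.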
You can also edit your commands. C Shell provides a rudimentary way of editing commands that can actually be deceptively powerful, in that it may be all you ever need. But tcsh and ksh go beyond that and provide a command line editor. First, let's look at how C Shell does it:

% telnet pastuer.eecs 25
pastuer.eecs: unknown host
% ^ue^eu
telnet pasteur.eecs 25
Trying...
Connected to pasteur.eecs.Berkeley.EDU.
[Blah Blah Blah ...]
% echo !nroff
nroff -man procmail.1 | more
% !nroff:s/more/less
nroff -man procmail.1 | less

And you can do some really cryptic stuff that you probably would never think of doing unless you read the man page letter for letter.

The command line editor in tcsh and ksh allows you to treat your history as if it were a file, with one command per line. I won't get into it too much as there is so much to it, but at least in tcsh, you can start by seeing what the ^N, ^P, ^B, ^F, ^A, and ^E keys do. I will use them in the session and explain what they do, but there is simply too much to write about to do it justice here without making this handout much, much longer. You can also learn more about the tcsh command line editor by looking at the output of the bindkey command and by reading the man page, especially where it talks about emacs mode and vi mode.

Helpful Goodies:
===============

Filename and command completion and other file shortcuts:

C Shell has something called filename and command completion, which means that if you give csh (or tcsh) a partial file or command name, you can have the shell complete the rest of the name so long as it is not ambiguous. Csh requires that you set the filec shell variable; tcsh does this automatically. Csh uses the ESC key to complete filenames and commands, while tcsh uses TAB.

You can refer to home directories as ~username.

Command aliases:

C shell (and a lot of other shells) has something called command aliases. You can take a long and/or often typed command and assign it a name; if you then type that name as a command, the shell executes the long command instead. I used an example of that with the "wg" command. The "wg" command is actually an alias for the following:

% which wg
wg:      aliased to agrep -d '^Bored' !^ /usr/local/csua/wall_log

I set that up with the alias command:

% alias wg "agrep -d '^Bored' \!^ /usr/local/csua/wall_log"

The !^ refers to the first argument of whatever I typed on the command line. The backslash is there so that the shell doesn't immediately evaluate it as a history specifier.

Moving around and other shortcuts:

    pushd and popd
    cd -  (tcsh only)
    the cdpath shell variable (csh and tcsh only, so far as I know)
    command substitution with the backtick or `

Fun stuff:

    Shell variables: ignoreeof, path, watch, prompt
    Initialization files: .cshrc, .login
    stty
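There is no room to do these justice here, but a couple of quick teasers so the names above mean something (directories and filenames invented for the example, starting from your home directory):

% pushd /usr/local/csua
/usr/local/csua ~
% popd
~
% vi `grep -l jon *.tex`
[vi opens every .tex file in the directory that mentions jon]

pushd and popd keep a stack of directories you can hop between, and anything inside backticks is run first, with its output pasted back onto the command line -- here, the list of matching filenames handed to vi.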