College of Engineering
EECS Instructional & Electronics Support Group
/usr/pub/Instructional_Plans
/usr/pub/reports/manager/Fall_1998
January 31, 1999
EECS Instructional Computing - Review and Plans
-----------------------------------------------
Fall 1998
References about Common IESG Services:
http://inst.eecs.berkeley.edu/
Instructional WEB server, links to class home pages, student home pages,
information about Instructional UNIX accounts, modem access, cardkey
access, computers and labs, software.
http://www-iesg.eecs.berkeley.edu/
Electronics Support WEB server, links to information about electronics
labs, AV services and Windows NT services.
Plans:
Prof Sequin plans to reclaim some Instructional labs in Soda for grad
student offices and upgrade other Instructional labs for upper division
labs such as multimedia authoring and 3D animation. The plan would be
implemented over a number of years. We are starting to search for
alternate labs on campus for CS Lower Division classes.
Profs Neureuther and Lee have requested that room 105 Cory be converted
to an electronics lab for use by a new course, EECS20, in Spring 1999.
Software Changes this summer:
- upgraded UNIX password service to NIS+ for improved performance and
security.
- upgraded all UNIX PCs to Solaris x86 v 2.6 with NFS3.
- upgraded HP-UX and DEC Alpha systems to NFS3.
- converted the Pasteur mail server from an old DEC system to a new
dual-processor PC running Solaris v 2.6.
- added IMAP and POP3 capability to Pasteur mail server.
- adapted Prof Hilfinger's versions of "submit" and "grade" for use by
any class on UNIX.
- reviewed and enhanced UNIX network security (o/s patches and upgrades)
Lab Changes this summer:
347 Soda: Instruction has vacated; formerly shared predominantly by
CS184 and CS152. The 14 HPs from 347 Soda, which are high-end graphics
systems, have been moved to 273 Soda. CS184 will continue to use 349 Soda;
CS152 may use the HPs in 273 Soda or new NT systems in 119 Cory.
310 Davis: 9 other HP workstations have been moved from the second
floor of Soda to 310 Davis, the CS61A lab, for a total of 38 systems.
105 Cory: Older HP workstations have been removed and the lab is
being converted into an electronics lab for EECS20. EECS20 is a new course
this Fall, taught by Prof Lee, and will require computing for 300 students
in Spring 1999. 27 new HP Kayak PCs running NT are being installed. The
HP PCs were donated via a grant won by Prof Zakhor.
117 Cory: 16 of the HP UNIX workstations formerly in 105 Cory will be
moved to 117 Cory. 7 other HP UNIX workstations are in 199 Cory. These
UNIX systems will continue to meet the needs of EE classes that run SUPREM
and other UNIX applications.
353 Cory was renovated and dedicated in May 1998. Generous donations
from Rockwell International provided for new flooring, lab furniture,
electrical and network conduit and LCD overhead projection. The lab
will be used by several EE classes.
140 Cory and 143 Cory are being upgraded and repainted.
New Cardkey System In Cory:
New cardkey readers will be installed in Cory Hall in 1998-99. Alex Para
is the project manager. The third floor of Cory will be converted in the
Fall, and the first floor is scheduled to be converted in January 1999.
These will require a different type of cardkey than is used now, which
means that users who have access to both Soda and Cory Halls will need
2 different cardkeys. Users will be charged $10 ($5 refundable) for the
new Cory cardkeys.
For additional information, please see http://inst.eecs.berkeley.edu
or contact us.
Notable events this semester:
-----------------------------------------------------------------------
(Dec 22) workstations off; SERVERS remain on
We have shut down most workstations for the holiday, until
about Jan 11. These UNIX servers will remain up and available
for remote access ("telnet", "rlogin", "ssh"):
cory.eecs.berkeley.edu (DEC UNIX)
parker.eecs.berkeley.edu (HP-UX)
torus.cs.berkeley.edu (SOLARIS x86)
-----------------------------------------------------------------------
(Nov 25) Saidar down 4:15 - 5:20pm
symptom - home directories not accessible for named accounts,
ee class accounts
cause - saidar stopped exporting and we rebooted it
-----------------------------------------------------------------------
(Nov 25) - Franklin down midnight - 8:00a.m.
symptom - home directories not accessible for some named accounts,
CS162/CS164 class accounts
cause - franklin stopped exporting and we rebooted it
-----------------------------------------------------------------------
(Nov 20) - Saidar down 11:00 - 11:30
symptom - home directories not accessible for named accounts,
ee class accounts
cause - saidar stopped exporting and we rebooted it
-----------------------------------------------------------------------
(Nov 17) - Zip Drives on Solaris PCs are once again functional
Please read /usr/pub/Solaris.help
for more info on how to use the drives.
-----------------------------------------------------------------------
(Nov 15-16) - franklin down - 11:45 PM - 1:30 AM
symptom - unable to log in for upper-div. CS class & named accounts
cause - /home/ll failed to time out (i.e., same deal as usual)
-----------------------------------------------------------------------
(Nov 13) - cochise down 12:45 - 2:15 AM
symptom - unable to log in for class (cs61abc) accounts
cause - repeated kernel panics on cochise probably from disk
usage being more than what cochise can handle
-----------------------------------------------------------------------
(Nov 5) - Web Server downtime 9:00am-9:30am on Fri 11/6
Moving contents of web server to a new disk so that
we will avoid any further disk crunches.
-----------------------------------------------------------------------
(Nov 2) - Saidar crash, 11:30am-12:30pm
Symptom: home directories inaccessible, can't login.
Affected: named accounts, ee class accounts, DEC Alpha workstations
in 199 Cory
Cause: faulty memory, probably; we are working with the vendor to
identify the problem.
Duration: about 11:30am-12:30pm
-----------------------------------------------------------------------
(Oct 26) - network failure in 117 Cory and 2 HP workstations in 199 Cory
Symptom: existing login sessions freeze up; new logins denied
Affected: login attempts to all HP workstations in 117 Cory and
2 HP workstations in 199 Cory
Cause: Failure of a network hub
Duration: about 1:30pm-3pm
-----------------------------------------------------------------------
(Oct 30) - modems will be down for maintenance on Mon Nov 2
The EECS Instructional modems (642-0070 and 642-6679) will be
unavailable during the day on Mon, Nov 2 while they are tested and
repaired. We have received many reports of problems in the last month.
Faulty modems in the modem pool have failed to answer; passwords have
not worked; connections have been lost soon after the modem answers.
Apparently the modem bank lost power altogether for several hours on
Wednesday Oct 28. We hope these problems will be corrected on Monday.
-----------------------------------------------------------------------
(Oct 27) - Pasteur (mail server) unresponsiveness
Symptom: can't check mail on any Instructional workstation
or via POP or IMAP; if you check mail on login, then
you can't login (unless you cancel the mail check).
some workstations refused logins completely as well
(pasteur is a backup NIS+ server.)
Affected: all EECS Instructional users
Cause: pasteur started hundreds of mail processes to attempt
to deliver the mail that was failing to get to saidar.
Each new process reduced the CPU cycles available to
the others, and eventually the process table filled up.
Pasteur eventually ran out of memory entirely as well.
Duration: 10:00PM - 11:00PM
-----------------------------------------------------------------------
(Oct 27) - Saidar crash from memory failure
Symptom: home directories inaccessible, can't login.
Affected: named accounts, ee class accounts, DEC Alpha
workstations in 199 Cory
Cause: saidar ran out of memory .... again
Duration: 7:45 to 10:45 PM
-----------------------------------------------------------------------
(Oct 25) - franklin.cs was down twice on Sun Oct 25
Symptom: "no home directory"
Affected: CS162, CS164, users with home directories on /home/{hh,ii,jj,
kk,ll,mm,nn,pp}
Cause: Recurrent freezing of a disk on the differential SCSI bus.
Duration: 2:30am-12:30pm, 5:10pm-5:35pm
Response: We will replace Franklin with a new file server soon;
we have set up a PC for it and have ordered a new disk tower.
-----------------------------------------------------------------------
(Oct 21) - loss of power in Cory labs, loss of access to servers (3-4:15pm)
Symptoms:
all power died in 1st floor Cory labs
loss of access to MAIL server (pasteur.eecs)
loss of access to WEB servers (http://www-inst.eecs, http://iesg.eecs)
loss of access to UNIX servers (cory.eecs, parker.eecs, po.eecs,
saidar.eecs)
loss of access to NT servers (\\fischer, \\ntsww)
no UNIX and NT home directories for many users
Affected: most Instructional UNIX and NT accounts
Cause: power failure in Cory Hall first floor
-----------------------------------------------------------------------
(Oct 11) - cochise.cs crashed (4-4:30pm)
Symptoms:
no home directories, active logins froze up
loss of access to /usr/sww for Instructional HP UNIX systems in Soda
and Davis Halls
Affected: CS61A, CS61B and others
Cause: Cochise.eecs file server crashed, perhaps due to excess NFS
activity.
Response: We are working to replace the old NFS servers with a new
server using a RAID disk array by Jan 1999.
-----------------------------------------------------------------------
(Oct 11) - cochise.cs was down Sun Oct. 11 2:15 - 4:15 PM
Symptom: no home directory (class accounts unavailable)
can't run emacs/java/netscape on HPs (/usr/sww on HPs unavailable)
Affected: CS61A,B,C, CS184, CS186, other users on
/home/{aa,bb,cc,dd,ee,ff,gg,qq,rr}
Cause: Same as franklin's recurrent disk bug, this time
affecting /home/bb
A bunch of HP workstations could not access /usr/sww on cochise
after the left-hand disk tower was power-cycled; after these
workstations were rebooted, everything seemed fine again.
--brg@cory.eecs
-----------------------------------------------------------------------
(Sep 26) - franklin.cs was down Sat Sep 26, 10:15-10:45 am, 11am-noon
(Sep 24) - franklin.cs was down Thu Sep 24, 12:15pm-1:30pm.
Symptom: "no home directory"
Affected: CS162, CS164, users with home directories on /home/{hh,ii,jj,
kk,ll,mm,nn,pp}
Cause: Recurrent freezing of a disk on the differential SCSI bus.
History: It's not always the same disk, but it causes the entire
bus to freeze. It started happening last spring and has increased in
frequency. It is probably load-related (didn't happen over the summer).
Last spring, we asked HP tech support to look at it. They did not
diagnose it; the only course seems to be to replace parts until it
stops. We replaced the SCSI controller (the entire motherboard in
fact) on Sep 15. Later, we suspected one disk and took it out of
service. Nevertheless, the problem keeps happening.
Next, we may focus our attention on the 2 disk expansion towers and
their power supplies. In parallel, we will work to replace the
entire file server. (kevinm@eecs)
-----------------------------------------------------------------------
(Sep 22) - Saidar.eecs was down Mon Sep 21, 12:15pm-4:30pm.
Symptom: "no home directory" when you login; "command not found" errors
Affected: users with home directories on /home/{b,c,d,e,f};
computers that use /share/b from Saidar (UNIX computers in 199 & 117 Cory,
cory.eecs, parker.eecs)
Downtime: Mon Sep 21, 12:15pm-4:30pm
Cause: The problem was the apparent failure on saidar.eecs
of all 5 disks on one channel of the RAID controller. First we
investigated possible hardware failures. In fact, the cause seems
to have been a transient software problem. It seems that the RAID
controller detected a problem on the channel and shut off all the
disks. We were able to restore the disks by resetting the controller
and running parity checks on the RAID logical drives (each is a
collection of disks that look like one big disk to the operating
system). The RAID parity checks and subsequent UNIX file checks took
about 45 minutes for each of the 3 RAID logical drives.
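As an illustration of what a parity check does, here is a minimal Python
sketch. It assumes a RAID level that stores XOR parity (RAID 5 style);
this report does not say which RAID level saidar uses, and the block
contents and function names below are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_ok(data_blocks, parity_block):
    """Return True if the stored parity matches parity recomputed from the data."""
    return xor_blocks(data_blocks) == parity_block


# One hypothetical stripe: three data blocks and the parity block stored with them.
stripe = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(stripe)

# A parity check recomputes the XOR for each stripe and compares it with
# the stored parity block.
stripe_is_consistent = parity_ok(stripe, parity)

# The same XOR rebuilds one missing block from the surviving blocks.
rebuilt_second_block = xor_blocks([stripe[0], stripe[2], parity])
```

A real controller repeats this comparison for every stripe on a logical
drive, which is one reason a full check of a large drive takes a long time.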
-----------------------------------------------------------------------
(Sep 22) - How to access the EECS modems and home IP
1) Get a password for the modems by using the on-line service
'telnet home-ip.berkeley.edu'
2) Students enrolled in EECS classes automatically have dialin
access to 642-6679, 642-0070, 643-9600
3) See http://inst.eecs.berkeley.edu/modems.html for more info.
-----------------------------------------------------------------------
(Sep 22) - UNIX email and restrictions on "pasteur" POP server
In June 1998, new security restrictions were implemented on the
pasteur.eecs mail server. Here are some restrictions and usage
tips when using POP or IMAP:
What's a mail server?
"pasteur.eecs.berkeley.edu" is the EECS Instructional email
server. On the Instructional UNIX systems, programs such as
"pine" and "mailx" access Pasteur directly. On NT and Mac
systems, programs such as "Eudora" can read your email from
Pasteur using the POP or IMAP protocols. In Eudora, you
enter "pasteur.eecs.berkeley.edu" as your mail host and
enter your UNIX account name as user name. You can also
enter anything you want in Eudora as your own computer name,
which is where replies to your outgoing mail are sent. See
below for restrictions on that.
Setting your computer name:
Pasteur requires that you configure your own computer name to
be an EECS or CS computer, such as "cory.eecs.berkeley.edu".
You do this in your POP client (Eudora, etc), so that the
"From" line in your outgoing mail says that you are using an
EECS or CS computer.
Otherwise, our POP server will reject your requests to connect
to it and download your mail. This is an added security
feature.
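To check a computer name before entering it in your mail client, a
minimal Python sketch follows. It assumes that "an EECS or CS computer"
means a host name ending in .eecs.berkeley.edu or .cs.berkeley.edu;
the actual test that pasteur applies is not described here, and the
function name is hypothetical.

```python
# Assumption: department hosts are recognized by their domain suffix.
# The report does not say how pasteur actually decides.
ALLOWED_SUFFIXES = (".eecs.berkeley.edu", ".cs.berkeley.edu")


def looks_like_department_host(name):
    """Return True if name ends in one of the assumed department domains."""
    host = name.strip().lower().rstrip(".")
    return host.endswith(ALLOWED_SUFFIXES)
```

For example, looks_like_department_host("cory.eecs.berkeley.edu") is
true, while a name outside both domains is not.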
Your .forward file:
...may not work like it used to. Specific problems:
1. The following filter programs are the only ones
supported: filter, procmail, slocal and vacation. If
there is something you need to run, contact root@cory
2. .forward files may not be symbolic links
3. If you use procmail and your .forward file looked
like:
"|IFS=' '&&exec /usr/sww/bin/procmail -f-||exit 75 #login"
it needs to be changed to:
"|/usr/sww/bin/procmail -f- #login"
4. '||', '$' and '&&' are no longer valid in .forward files
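The four rules above can be checked mechanically. The following Python
sketch is one way to do it; it is not a tool installed on our systems,
the function names are hypothetical, and it only approximates what the
mail server itself enforces.

```python
import os

# Filter programs listed as supported in rule 1.
SUPPORTED_FILTERS = {"filter", "procmail", "slocal", "vacation"}
# Shell constructs listed as no longer valid in rule 4.
FORBIDDEN_TOKENS = ("||", "&&", "$")


def check_forward_text(text):
    """Return a list of problems found in the contents of a .forward file."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        entry = line.strip()
        if not entry:
            continue
        for token in FORBIDDEN_TOKENS:
            if token in entry:
                problems.append(f"line {number}: contains {token!r}")
        # A pipe entry looks like "|/path/to/program args"; quotes are optional.
        command = entry.strip('"')
        if command.startswith("|"):
            words = command[1:].split()
            program = os.path.basename(words[0]) if words else ""
            if program not in SUPPORTED_FILTERS:
                problems.append(f"line {number}: unsupported program {program!r}")
    return problems


def check_forward_file(path):
    """Check a .forward file on disk, including the symbolic-link rule (rule 2)."""
    problems = []
    if os.path.islink(path):
        problems.append("file is a symbolic link")
    with open(path) as handle:
        problems.extend(check_forward_text(handle.read()))
    return problems
```

Run against the old procmail line from rule 3, check_forward_text
reports the '||' and '&&' constructs; the replacement line passes with
no problems reported.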
For more information, please see /usr/pub/email.help,
send email to root@cory.eecs.berkeley.edu
or visit 384/386 Cory or 333 Soda.
-----------------------------------------------------------------------
(Sep 16) - Franklin.cs filesystems were down, until about 9:45am
(Sep 16) - Pasteur.eecs (mail server) was down, until about 9:30am
The recurrent problem on franklin.cs also affected pasteur.eecs this
morning, while pasteur waited for disk access to franklin.
We will be replacing the disk controller and motherboard on franklin
this week.
-----------------------------------------------------------------------
(Sep 12) - Franklin.cs filesystems were down, 5:00pm-6:45pm (Saturday)
(Sep 06) - Franklin.cs filesystems were down, 4:15pm-9:45pm (Sunday)
The server franklin.cs stopped exporting its filesystems; this has
been a recurring problem that has been impossible to diagnose. We
will replace disk controllers soon in an effort to resolve it.
-----------------------------------------------------------------------
(Sep 05) - Cory.eecs was down, 1:10am-1:45am
09/05 CORY.EECS (includes http://inst.eecs) was down from about
1:10am-1:45am for installation of a larger swap disk.
-----------------------------------------------------------------------
(Aug 27) - Franklin.cs was down, 10:45pm-11pm
Franklin.cs was rebooted to clear a problem that prevented
its file systems from being exported.
-----------------------------------------------------------------------
(Aug 03) - Cochise downtime 4:30-5:30pm Tue Aug 4
Cochise will be down from 4:30 until about 5:30 on Tuesday, Aug. 4
to replace a bad disk drive.
-----------------------------------------------------------------------
(Jul 21) - Cory.eecs changes into an OSF1/Alpha system on Mon July 27
On Monday July 27 at about noon, the computer name "cory.eecs" will
be changed from the current Ultrix operating system running on a
computer with a MIPS processor to a DEC UNIX operating system (also
called "OSF1") running on a computer with a DEC Alpha processor.
"Cory.eecs" will become the same computer as "saidin.eecs".
Note that Ultrix binaries do not run on OSF/Alpha systems, so this
will affect any programs you have compiled for Ultrix only. We will
continue to support the Ultrix operating system on "volga.eecs".
Please email root@cory.eecs.berkeley.edu if this creates any problems.
-----------------------------------------------------------------------
(Jul 15) - Po.eecs downtime and password service changes July 16/17
PO.EECS: new computer
---------------------
Po.eecs will become a new computer at about noon on Fri Jul 17.
Po.eecs has the master password file, and you have been told to
login there to change your password. Po.eecs is now a DEC Ultrix
system; it will become a Solaris X86 system. The benefits are:
faster computer, new password server software.
Your current 'login' password will still be valid after this change.
On Fri July 17, Po.eecs may be down (off the net) at times.
From Thu July 16 - Mon July 20, we may prevent any password changes.
The New Password Service
------------------------
The new password software is called "NIS+". It will allow users at
the Solaris X86 PCs to change their passwords without logging in to
po.eecs. Users on the HP, DEC and SGI systems will still have to
login to po.eecs. But in all cases, you won't get "password file is
busy" messages any more, and the changes will take effect within 5-10
minutes instead of 40-60 minutes. NIS+ has better efficiency and
security features than our current password service.
For more technical info about NIS+, please type "man nis+" on any of
our Solaris systems (for lists, please see
http://inst.eecs.berkeley.edu/clients).
YOU'LL HAVE A SECOND PASSWORD:
-----------------------------
In addition to the 'login' password that you now use, NIS+ uses a
second 'secure RPC' (also called 'secret key' and 'NIS+ credential')
password. The default 'secure RPC' password for all users is "nisplus".
When logging into one of our Solaris X86 computers, you may see a
warning message such as
This password differs from your secure RPC password.
or
Password does not decrypt secret key for unix.291@Inst.nisplus.
This is not a problem, but it will be an advantage to you to make your
'secure RPC' password be the same as your 'login' password. You can
do that by logging into "po.eecs" or "torus.cs" and running the
Solaris X86 command "chkey". For example:
% chkey -p
Updating nisplus publickey database.
Generating new key for 'unix.3232@Inst.nisplus'.
Please enter the Secure-RPC password for jdoe: nisplus
Please enter the login password for jdoe: {jdoe's password}
This sets the 'secure RPC' password to match the 'login' password,
so you won't have to type it when you change your password, shell or
'finger' information on a Solaris X86 system. We plan to install
"wrapper" programs for the UNIX 'passwd', 'chsh' and 'chfn' programs
to automate the entry of the 'secure RPC' password, but you may still
see messages about it when you use these programs. Please notify
'root@cory.eecs' if you have difficulty using these new programs.
Use 'ssh' for better security:
-----------------------------
Users on other UNIX computers will still need to run those commands
on po.eecs. For added security, we recommend that you log in to
po.eecs using the "ssh" program rather than rlogin or telnet.
"Ssh" is available on our UNIX computers and is used like "rlogin":
ssh po.eecs -l {your_login}
"Ssh" is commercial software that unfortunately is not available for
free. It can be purchased for PCs and Macs: please see
http://inst.eecs.berkeley.edu/usr/pub/ssh.help for details.
-----------------------------------------------------------------------
(Jul 15) - /home/tmp on po.eecs will be unavailable Thu/Fri, July 16/17
/home/tmp on po.eecs will be unavailable from about 1pm Thu Jul 16
until 1pm on Fri Jul 17, while we move it to the new po.eecs.
-----------------------------------------------------------------------
(July 01) - Saidar /home/d will be inaccessible from 8am-9:30am on July 1
/home/d will undergo full dumps between 8AM and 9:30AM on July 1.
During this time the filesystem will not be mounted by any of our
client systems. This was work that was not completed last Friday
as originally scheduled. We apologize for the abrupt notice.
-----------------------------------------------------------------------
Kevin Mullally, Manager Ferenc Kovac, Associate Manager
EECS Instructional & Electronics EECS Instructional & Electronics
378 Cory Hall, (510) 643-6141 377 Cory Hall, (510) 642-6952
kevinm@eecs.berkeley.edu ferenc@eecs.berkeley.edu
source: /usr/pub/reports/manager/Fall.1998 - revised January 31, 1999