EECS Instructional Support, University of California at Berkeley

                        College of Engineering
             EECS Instructional & Electronics Support Group

January 31, 1999

              EECS Instructional Computing - Review and Plans
		               Fall 1998

References about Common IESG Services:
  Instructional WEB server, links to class home pages, student home pages, 
  information about Instructional UNIX accounts, modem access, cardkey 
  access, computers and labs, software.
  Electronics Support WEB server, links to information about electronics
  labs, AV services and Windows NT services.

  Prof Sequin plans to reclaim some Instructional labs in Soda for grad 
  student offices and upgrade other Instructional labs for upper division 
  labs such as multimedia authoring and 3D animation.  The plan would be 
  implemented over a number of years.   We are starting to search for
  alternate labs on campus for CS Lower Division classes.
  Profs Neureuther and Lee have requested that room 105 Cory be converted
  to an electronics lab for use by a new course, EECS20, in Spring 1999.
Software Changes this summer:
    - upgraded UNIX password service to NIS+ for improved performance and 
      security.
    - upgraded all UNIX PCs to Solaris x86 v 2.6 with NFS3.
    - upgraded HP-UX and DEC Alpha systems to NFS3.
    - converted the Pasteur mail server from an old DEC system to a new 
      dual-processor PC running Solaris v 2.6.
    - added IMAP and POP3 capability to Pasteur mail server. 
    - adapted Prof Hilfinger's versions of "submit" and "grade" for use by 
      any class on UNIX.
    - reviewed and enhanced UNIX network security (o/s patches and upgrades)
Lab Changes this summer:

  347 Soda: Instruction has vacated; formerly shared predominantly by 
  CS184 and CS152.  The 14 HPs from 347 Soda, which are high-end graphics
  systems, have been moved to 273 Soda.  CS184 will continue to use 349 Soda; 
  CS152 may use the HPs in 273 Soda or new NT systems in 119 Cory.
  310 Davis: 9 other HP workstations have been moved from the second 
  floor of Soda to 310 Davis, the CS61A lab, for a total of 38 systems.
  105 Cory:  Older HP workstations have been removed and the lab is
  being converted into an electronics lab for EECS20.  EECS20 is a new course 
  this Fall, taught by Prof Lee, and will require computing for 300 students 
  in Spring 1999.   27 new HP Kayak PCs running NT are being installed.  The 
  HP PCs were donated via a grant won by Prof Zakhor. 
  117 Cory: 16 of the HP UNIX workstations formerly in 105 Cory will be 
  moved to 117 Cory.   7 other HP UNIX workstations are in 199 Cory.  These
  UNIX systems will continue to meet the needs of EE classes that run SUPREM 
  and other UNIX applications.
  353 Cory was renovated and dedicated in May 1998.  Generous donations 
  from Rockwell International provided for new flooring, lab furniture,
  electrical and network conduit and LCD overhead projection.  The lab 
  will be used by several EE classes.
  140 Cory and 143 Cory are being upgraded and repainted.
New Cardkey System In Cory:
  New cardkey readers will be installed in Cory Hall in 1998-99.  Alex Para
  is the project manager.  The third floor of Cory will be converted in the 
  Fall, and the first floor is scheduled to be converted in January 1999.
  These will require a different type of cardkey than is used now, which 
  means that users who have access to both Soda and Cory Halls will need
  2 different cardkeys.   Users will be charged $10 ($5 refundable) for the 
  new Cory cardkeys.
For additional information, please see
  or contact us.

Notable events this semester:
  (Dec 22) workstations off; SERVERS remain on

	We have shut down most workstations for the holiday, until 
	about Jan 11. These UNIX servers will remain up and available 
	for remote access ("telnet", "rlogin", "ssh"):   (DEC UNIX) (HP-UX)  (SOLARIS x86)
  (Nov 25) Saidar down 4:15 - 5:20pm

	symptom - home directories not accessible for named accounts, 
		  ee class accounts
        cause - saidar stopped exporting and we rebooted it
  (Nov 25) - Franklin down midnight - 8:00a.m.

	symptom - home directories not accessible for some named accounts, 
		  CS162/CS164 class accounts
        cause - franklin stopped exporting and we rebooted it
  (Nov 20) - Saidar down 11:00 - 11:30

	symptom - home directories not accessible for named accounts, 
		  ee class accounts
        cause - saidar stopped exporting and we rebooted it

  (Nov 17) - Zip Drives on Solaris PCs are once again functional

	Please read /share/b/pub/
	for more info on how to use the drives.
  (Nov 15-16) - franklin down - 11:45 PM - 1:30 AM

	symptom - unable to log in for upper-div. CS class & named accounts
	cause - /home/ll failed to time out (i.e., same deal as usual)
  (Nov 13) - cochise down 12:45 - 2:15 AM

        symptom - unable to log in for class (cs61abc) accounts
        cause - repeated kernel panics on cochise probably from disk
                usage being more than what cochise can handle
  (Nov 5) - Web Server downtime 9:00am-9:30am on Fri 11/6

	Moving contents of web server to a new disk so that
        we will avoid any further disk crunches.
  (Nov 2) - Saidar crash, 11:30am-12:30pm

	Symptom: home directories inaccessible, can't login.
	Affected: named accounts, ee class accounts, DEC Alpha workstations 
	in 199 Cory
	Cause: faulty memory, probably; we are working with the vendor to 
	identify the problem.
	Duration: about 11:30am-12:30pm
  (Oct 26) - network failure in 117 Cory and 2 HP workstations in 199 Cory

	Symptom: existing login sessions freeze up; new logins denied
        Affected: login attempts to all HP workstations in 117 Cory and 
	2 HP workstations in 199 Cory
	Cause: Failure of a network hub 
	Duration: about 1:30pm-3pm
  (Oct 30) - modems will be down for maintenance on Mon Nov 2

	The EECS Instructional modems (642-0070 and 642-6679) will be unavailable 
	during the day on Mon, Nov 2 while they are tested and repaired.  We 
	have received many reports of problems in the last month.  

	Faulty modems in the modem pool have failed to answer; passwords have
	not worked; connections have been lost soon after the modem answers.
	Apparently the modem bank lost power altogether for several hours on 
	Wednesday Oct 28.  We hope these problems will be corrected on Monday.
  (Oct 27) - Pasteur (mail server) unresponsiveness

	Symptom: can't check mail on any Instructional workstation
			or via POP or IMAP; if you check mail on login, then
			you can't login (unless you cancel the mail check).
			some workstations refused logins completely as well
			(pasteur is a backup NIS+ server.)
	Affected: all EECS Instructional users
	Cause: pasteur started hundreds of mail processes to attempt
	       to deliver the mail that was failing to get to saidar.
	       Each new process reduced the CPU time available to the
	       others, and eventually the process table filled up.
	       Eventually pasteur also ran out of memory entirely.
	Duration: 10:00PM - 11:00PM
  (Oct 27) - Saidar crash from memory failure

	Symptom: home directories inaccessible, can't login.
	Affected: named accounts, ee class accounts, DEC Alpha
			 workstations in 199 Cory
	Cause: saidar ran out of memory .... again
	Duration: 7:45 to 10:45 PM
  (Oct 25) - franklin.cs was down twice on Sun Oct 25

	Symptom: "no home directory" 
        Affected:  CS162, CS164, users with home directories on /home/{hh,ii,jj,
	Cause:  Recurrent freezing of a disk on the differential SCSI bus.
	Duration: 2:30am-12:30pm, 5:10pm-5:35pm
	Response: We will replace Franklin with a new file server soon;
	we have set up a PC for it and have ordered a new disk tower. 
  (Oct 21) - loss of power in Cory labs, loss of access to servers (3-4:15pm)

	all power died in 1st floor Cory labs
	loss of access to MAIL server (pasteur.eecs)
	loss of access to WEB servers (http://www-inst.eecs, http://iesg.eecs)
	loss of access to UNIX servers (cory.eecs, parker.eecs, po.eecs, 
	loss of access to NT servers (\\fischer, \\ntsww)
	no UNIX and NT home directories for many users
        Affected: most Instructional UNIX and NT accounts
        Cause: power failure in Cory Hall first floor
  (Oct 11) - cochise.cs crashed (4-4:30pm)

	no home directories, active logins froze up 
	loss of access to /usr/sww for Instructional HP UNIX systems in Soda 
	and Davis Halls
        Affected: CS61A, CS61B and others
        Cause: Cochise.eecs file server crashed, perhaps due to excess NFS
	Response: We are working to replace the old NFS servers with a new 
	server using a RAID disk array by Jan 1999.
  (Oct 11) - cochise.cs was down Sun Oct. 11 2:15 - 4:15 PM

        Symptom: no home directory (class accounts unavailable)
	can't run emacs/java/netscape on HPs (/usr/sww on HPs unavailable)
        Affected: CS61A,B,C, CS184, CS186, other users on
        Cause: Same as franklin's recurrent disk bug, this time
	affecting /home/bb

	A bunch of HP workstations could not access /usr/sww on cochise
	after the left-hand disk tower was power-cycled; after these
	workstations were rebooted, everything seemed fine again.
  (Sep 26) - franklin.cs was down Sat Sep 26, 10:15-10:45 am, 11am-noon
  (Sep 24) - franklin.cs was down Thu Sep 24, 12:15pm-1:30pm.

	Symptom: "no home directory" 
        Affected:  CS162, CS164, users with home directories on /home/{hh,ii,jj,
	Cause:  Recurrent freezing of a disk on the differential SCSI bus.
	History: It's not always the same disk, but it causes the entire 
	bus to freeze.  It started happening last spring and has increased in 
	frequency.  It is probably load-related (didn't happen over the summer).  
	Last spring, we asked HP tech support to look at it. They did not 
	diagnose it; the only course seems to be to replace parts until it 
	stops.  We replaced the SCSI controller (the entire motherboard in 
	fact) on Sep 15.   Later, we suspected one disk and took it out of 
	service.  Nevertheless, the problem keeps happening.

	Next, we may focus our attention on the 2 disk expansion towers and 
	their power supplies.   In parallel, we will work to replace the
	entire file server.  (kevinm@eecs)
  (Sep 22) - Saidar.eecs was down Mon Sep 21, 12:15pm-4:30pm.

	Symptom: "no home directory" when you login; "command not found" errors
	Affected: users with home directories on /home/{b,c,d,e,f}; 
	computers that use /share/b from Saidar (UNIX computers in 199 & 117 Cory, 
	cory.eecs, parker.eecs)
	Downtime: Mon Sep 21, 12:15pm-4:30pm
	Cause: The problem was the apparent failure on saidar.eecs 
	of all 5 disks on one channel of the RAID controller.  First we 
	investigated possible hardware failures.  In fact, the cause seems 
	to have been a transient software problem.  It seems that the RAID 
	controller detected a problem on the channel and shut off all the 
	disks.  We were able to restore the disks by resetting the controller 
	and running parity checks on the RAID logical drives (each is a 
	collection of disks that look like one big disk to the operating 
	system).  The RAID parity checks and subsequent UNIX file checks took 
	about 45 minutes for each of the 3 RAID logical drives.
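
	The parity checks described above can be illustrated with a minimal
	sketch.  The report does not say which RAID level the controller on
	saidar uses, so this assumes simple XOR parity (as in RAID 5).  The
	function names are hypothetical, and the real check runs inside the
	RAID controller, not in Python.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks, parity_block):
    """A parity check: the stored parity must equal the XOR of the data."""
    return xor_blocks(data_blocks) == parity_block


def rebuild_missing(surviving_blocks, parity_block):
    """Recover one lost data block from the survivors plus the parity."""
    return xor_blocks(list(surviving_blocks) + [parity_block])


# Assumed example stripe: three data blocks and their computed parity.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)
```

	If the three checks ran one after another, 3 x 45 minutes accounts
	for about 2 hours 15 minutes of the 4 hour 15 minute outage.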
  (Sep 22)  - How to access the EECS modems and home IP

	1) Get a password for the modems by using the on-line service.

	2) Students enrolled in EECS classes automatically have dialin
	access to 642-6679, 642-0070, 643-9600.

3) See <A HREF=""></A> for more info.
  (Sep 22) - UNIX email and restrictions on "pasteur" POP server

	In June 1998, new security restrictions were implemented on the 
	pasteur.eecs mail server.  Here are some restrictions and usage 
	tips when using POP or IMAP:

	What's a mail server?

		"" is the EECS Instructional email 
		server.  On the Instructional UNIX systems, programs such as 
		"pine" and "mailx" access Pasteur directly.   On NT and Mac 
		systems, programs such as "Eudora" can read your email from 
		Pasteur using the POP or IMAP protocols.  In Eudora, you 
		enter "" as your mail host and 
		enter your UNIX account name as user name.  You can also 
		enter anything you want in Eudora as your own computer name,
		which is where replies to your outgoing mail are sent.  See 
		below for restrictions on that.

	Setting your computer name:

		Pasteur requires that you configure your own computer name to 
		be an EECS or CS computer, such as "".
		You do this in your POP client (Eudora, etc), so that the 
		"From" line in your outgoing mail says that you are using an 
		EECS or CS computer.

		Otherwise, our POP server will reject your requests to connect
		to it and download your mail.  This is an added security 
		measure.

	Your .forward file:

		...may not work like it used to.  Specific problems:

		1.  The following filter programs are the only ones 
		supported: filter, procmail, slocal and vacation.  If 
		there is something you need to run, contact root@cory

		2. .forward files may not be symbolic links

		3.  If you use procmail and your .forward file looked like:

		"|IFS=' '&&exec /usr/sww/bin/procmail -f-||exit 75 #login"

		it needs to be changed to:

		"|/usr/sww/bin/procmail -f- #login"

		4.  '||', '$' and '&&' are no longer valid in .forward files
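
	The rules above can be checked mechanically before a .forward file
	is installed.  The following is a minimal sketch, not the check that
	pasteur itself performs.  The function names are hypothetical, and it
	assumes that a pipe entry names its filter program as the first word
	after the '|'.

```python
import os

# Filter programs the report lists as supported.
SUPPORTED_FILTERS = {"filter", "procmail", "slocal", "vacation"}
# Shell constructs the report says are no longer valid in .forward files.
FORBIDDEN_TOKENS = ("||", "&&", "$")


def check_forward_line(line):
    """Return a list of problems found in one .forward entry (empty if none)."""
    problems = []
    entry = line.strip().strip('"')
    for token in FORBIDDEN_TOKENS:
        if token in entry:
            problems.append("contains forbidden token %r" % token)
    if entry.startswith("|"):
        # A pipe entry runs a program; take the base name of its first word.
        words = entry[1:].split()
        program = os.path.basename(words[0]) if words else ""
        if program not in SUPPORTED_FILTERS:
            problems.append("unsupported filter program %r" % program)
    return problems


def check_forward_file(path):
    """Check a whole .forward file, including the no-symlink rule."""
    problems = []
    if os.path.islink(path):
        problems.append("file is a symbolic link")
    with open(path) as handle:
        for number, line in enumerate(handle, 1):
            for problem in check_forward_line(line):
                problems.append("line %d: %s" % (number, problem))
    return problems
```

	Run on the two procmail lines quoted in item 3, check_forward_line
	reports problems for the old form and none for the new form.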

	For more information, please see /share/b/pub/,
	send email to <A HREF=""><i></i></a>
	or visit 384/386 Cory or 333 Soda.
  (Sep 16) - Franklin.cs filesystems were down, until about 9:45am 
  (Sep 16) - Pasteur.eecs (mail server) was  down, until about 9:30am

        The recurrent problem on franklin.cs also affected pasteur.eecs this
	morning, while pasteur waited for disk access to franklin.   

	We will be replacing the disk controller and motherboard on franklin
	this week.
  (Sep 12) - Franklin.cs filesystems were down, 5:00pm-6:45pm (Saturday)
  (Sep 06) - Franklin.cs filesystems were down, 4:15pm-9:45pm (Sunday)

        The server franklin.cs stopped exporting its filesystems; this has
	been a recurring problem that has been impossible to diagnose.  We
	will replace disk controllers soon in an effort to resolve it.
  (Sep 05) -  Cory.eecs was down, 1:10am-1:45am

	Cory.eecs (includes http://inst.eecs) was down from about
	1:10am-1:45am for installation of a larger swap disk.
  (Aug 27) -  Franklin.cs was down, 10:45pm-11pm

	Franklin.cs was rebooted to clear a problem that prevented
	its file systems from being exported.
  (Aug 03) - Cochise downtime 4:30-5:30pm Tue Aug 4

	Cochise will be down from 4:30 until about 5:30 on Tuesday, Aug. 4
	to replace a bad disk drive.
  (Jul 21) - Cory.eecs changes into an OSF1/Alpha system on Mon July 27

	On Monday July 27 at about noon, the computer name "cory.eecs" will 
        be changed from the current Ultrix operating system running on a
	computer with a MIPS processor to a DEC UNIX operating system (also 
	called "OSF1") running on a computer with a DEC Alpha processor.
        "Cory.eecs" will become the same computer as "saidin.eecs".

        Note that Ultrix binaries do not run on OSF/Alpha systems, so this
        will affect any programs you have compiled for Ultrix only.  We will
        continue to support the Ultrix operating system on "volga.eecs".

        Please email if this creates any problems.
  (Jul 15) - Po.eecs downtime and password service changes July 16/17

  PO.EECS: new computer
	Po.eecs will become a new computer at about noon on Fri Jul 17.
	Po.eecs has the master password file, and you have been told to
	login there to change your password.  Po.eecs is now a DEC Ultrix 
	system; it will become a Solaris X86 system.  The benefits are:  
	faster computer, new password server software.

	Your current 'login' password will still be valid after this change.

	On Fri July 17, Po.eecs may be down (off the net) at times.
	From Thu July 16 - Mon July 20, we may prevent any password changes.

  The New Password Service
	The new password software is called "NIS+".  It will allow users at
	the Solaris X86 PCs to change their passwords without logging in to 
	po.eecs.  Users on the HP, DEC and SGI systems will still have to 
	login to po.eecs.  But in all cases, you won't get "password file is 
	busy" messages any more, and the changes will take effect within 5-10 
	minutes instead of 40-60 minutes.  NIS+ has better efficiency and 
	security features than our current password service.

	For more technical info about NIS+, please type "man nis+"  on any of 
	our Solaris systems (for lists, please see
	<a href=""></a>).

	In addition to the 'login' password that you now use, NIS+ uses a 
	second 'secure RPC' (also called 'secret key' and 'NIS+ credential')
	password.  The default 'secure RPC' password for all users is "nisplus".  

	When logging into one of our Solaris X86 computers, you may see a 
	warning message such as 

	    This password differs from your secure RPC password.
	    Password does not decrypt secret key for unix.291@Inst.nisplus.

	This is not a problem, but it will be an advantage to you to make your
	'secure RPC' password be the same as your 'login' password.  You can
	do that by logging into "po.eecs" or "torus.cs" and running the
	Solaris X86 command "chkey".  For example:

	    % chkey -p
	    Updating nisplus publickey database.
	    Generating new key for 'unix.3232@Inst.nisplus'.
	    Please enter the Secure-RPC password for jdoe: nisplus
	    Please enter the login password for jdoe: {jdoe's password}

        This sets the 'secure RPC' password to match the 'login' password,
	so you won't have to type it when you change your password, shell or
	'finger' information on a Solaris X86 system.  We plan to install 
	"wrapper" programs for the UNIX 'passwd', 'chsh' and 'chfn' programs 
	to automate the entry of the 'secure RPC' password, but you may still
	see messages about it when you use these programs.  Please notify 
	'root@cory.eecs' 
	if you have difficulty using these new programs.

  Use 'ssh' for better security:
	Users on other UNIX computers will still need to run those commands 
	on po.eecs.  For added security, we recommend that you log in to 
	po.eecs using the "ssh" program rather than rlogin or telnet.  
	"Ssh" is available on our UNIX computers and is used like "rlogin":

	  ssh po.eecs -l {your_login}

	"Ssh" is commercial software that unfortunately is not available for 
	free.  It can be purchased for PCs and Macs: please see 
	<a href=""></a> for details.
  (Jul 15) - /home/tmp on po.eecs will be unavailable Thu/Fri, July 16/17

        /home/tmp on po.eecs will be unavailable from about 1pm Thu Jul 16
        until 1pm on Fri Jul 17, while we move it to the new po.eecs.
  (July 01) - Saidar /home/d will be inaccessible from 8am-9:30am on July 1

	/home/d will undergo full dumps between 8AM and 9:30AM on July 1.
	During this time the filesystem will not be mounted by any of our
  	client systems.  This was work that was not completed last Friday
	as originally scheduled.  We apologize for the abrupt notice.
  Kevin Mullally, Manager                   Ferenc Kovac, Associate Manager
  EECS Instructional & Electronics          EECS Instructional & Electronics
  378 Cory Hall, (510) 643-6141             377 Cory Hall, (510) 642-6952        
 source: /share/b/pub/reports/manager/Fall.1998 - revised January 31, 1999