University of California at Berkeley
Department of Electrical Engineering & Computer Science
Instructional Systems Support Group

/usr/pub/reports/manager/Spring_1996

Report on EECS Instructional Computing Facilities
-------------------------------------------------
Spring Semester 1996

by: Kevin Mullally, Manager of EECS Instructional Systems
    Ferenc Kovac, Manager of EECS Electronic Support

For a description of the current status of Instructional UNIX labs and
file servers, please read /usr/pub/EECS.facilities (also accessible via
Mosaic and gopher).

For descriptions of additions and changes to software availability,
please read /usr/pub/Instructors.Guide, /usr/pub/software.help and other
specific files in the /usr/pub directory (also accessible via Mosaic and
gopher).

Our WWW home page is: http://inst.eecs.berkeley.edu

Improvements in Fall 1995
-------------------------

'due_dates' is a UNIX utility for instructors to share information about
due dates and deadlines. Instructions and current entries can be read
with the command "/share/b/bin/due-dates" and via
http://www-inst.eecs.berkeley.edu/#instructors.

Fourteen 28.8-kbps modems were added to the existing thirty-two 14.4-kbps
modems in the Instructional pool. Dialup information and statistics are
available on-line in /usr/pub/dialups and /usr/pub/reports/modem_use
(also accessible via http://inst.eecs.berkeley.edu/usr/pub/).

Multimedia lab: Four PowerMacs with 32 MB of memory were purchased and
installed in 111 Cory for multimedia and animation design classes (CS39A,
CS294-6). We hope to add multimedia PCs this spring.

New SG Indys: Carlo Sequin received a donation of 10 SG Indys in August
and allocated 7 of them to the Instructional lab in 347 Soda for use by
graphics classes using GL (CS184, CS1285).

January 1996
------------

New cardkey procedures: We now automatically activate cardkeys for
students who are (1) pre-enrolled in a class via TeleBears and (2)
already have a cardkey.
Other students still have to go to 391 Cory, and they may be told to get
instructors' approval before access will be granted.

On the Instructional UNIX systems, we have started updating our Soda Hall
copy of the /usr/sww directories nightly. Previously, we updated our copy
only at the end of each semester.

We have also cleaned up the software in the /usr/local directories on the
individual computers, so that all computers are consistent. The software
in /usr/local will remain unchanged during the semester. These are the
versions of commonly used programs that we have installed in /usr/local:

software       version    location
---------------------------------------------------
gcc & g++      2.6.3      /usr/local/{bin,lib,man}
emacs          19.28.1    /usr/local/{bin,lib,man}
gdb            4.13       /usr/local/{bin,lib,man}
elisp files    19.28      /usr/local/src
jove           4.14.10    /usr/local/{bin,lib,man}
scm            4e1        /usr/local/{bin,lib,man}
---------------------------------------------------

Scheduled server maintenance: We'd like to schedule regular, preemptive
down time on the UNIX file servers. When possible, we will defer
maintenance until these times. We'll post the schedule on-line and in the
labs. Here is the schedule:

File server    when (max down time)             software affected
--------------------------------------------------------------------
Cory.EECS      1st Friday of month (4pm-6pm)    gopher server
Po.EECS        2nd Friday of month (4pm-6pm)    /home/tmp
Parker.EECS    1st Friday of month (4pm-6pm)    HTTP server, TMA,
                                                Xilinx, Mathematica
Cochise.CS     3rd Friday of month (4pm-6pm)    /usr/sww
Franklin.CS    4th Friday of month (4pm-6pm)    /home/tmp2, Powerview
--------------------------------------------------------------------

Please notify Kevin Mullally if this would interfere with your plans.

Intel has also officially approved our joint proposal for 84-90 PCI bus
PCs for instructional use. Delivery is expected in late January. Here is
our plan for those new machines:

  1. Populate a room in Soda, under the direction of John Canny.
  2. Upgrade PCs in 204b and 123 Cory.
  3. Install PCs in the new multimedia lab in 111 Cory.
  4. Provide PCs for the new lab, EECS20, in 117 Cory.

We are planning to offer a limited number of HSPICE/DRSPICE (currently
undergoing evaluation) and Xilinx/Workview licenses to students while
they are enrolled in EECS classes that use the software. Loans will be
coordinated from 377 Cory. We will need cooperation from the GSIs and the
faculty to make sure these keys get returned.

PC versions of Matlab and Simulink, along with various toolboxes and
blocksets, will be installed on many Cory Hall PCs in January.

01/17/96: Pasteur.eecs, the Instructional email server, stopped
processing email at about 5pm on 1/17; service was restored by 9am on
1/18.

February 1996
-------------

02/26/96: Powerview libraries were installed on the local disks in 273
Soda. Larger disks were installed there recently to make room for them.

02/27/96: Pasteur.eecs, the Instructional email server, stopped receiving
email at about noon on 2/27; service was restored by 1:30pm.

02/27/96: Ara.eecs, the server for the DEC workstations in 119 Cory,
crashed due to a corrupted disk and was out of service for about 24 hours
(noon on 02/27/96 through noon on 02/28/96). Eleven of the DEC
workstations in 119 Cory were unaffected.

Cochise.cs, one of two Instructional servers in Cory Hall, suffered
failures with increasing frequency during February. These were due to
overload from the combination of NFS traffic and local login processes.
The server is an HP 9000/755 with 128 MB of RAM and 16 GB of disk space.
There are over 2330 user accounts on those disks, with the /home/bb and
/home/gg filesystems taking the most hits in NFS-write traffic. After
consulting with the CS Lower Division faculty, we disabled direct logins
to Cochise.cs on Feb 29. This has kept the load down, and NFS performance
has been acceptable and dependable.
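
As an illustration of the kind of load check that sits behind a decision
like the one on Cochise.cs, here is a minimal Python sketch. It is not
the procedure we used. It reads the system load averages with
os.getloadavg() (available on UNIX systems) and classifies the 1-minute
figure. The function name load_status and both thresholds are
hypothetical values chosen for illustration; suitable limits depend on
the server and its workload.

```python
import os


def load_status(load_1min, cpu_count=1, warn_per_cpu=2.0, crit_per_cpu=8.0):
    """Classify a 1-minute load average as 'ok', 'warn', or 'critical'.

    The thresholds are illustrative assumptions, not measured limits.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    # Normalize by CPU count so the same thresholds apply to any machine.
    per_cpu = load_1min / cpu_count
    if per_cpu >= crit_per_cpu:
        return "critical"
    if per_cpu >= warn_per_cpu:
        return "warn"
    return "ok"


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5-, and 15-minute load averages.
    one_min, five_min, fifteen_min = os.getloadavg()
    cpus = os.cpu_count() or 1
    print("load averages:", one_min, five_min, fifteen_min)
    print("status:", load_status(one_min, cpu_count=cpus))
```

Run periodically on a file server, a check of this kind could flag rising
load before NFS service degrades. Whether to act on it, for example by
restricting logins, remains a judgment call.
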

March 1996
----------

03/07: The /home/jj disk on Franklin.cs failed in the mid-afternoon,
causing Franklin to crash and be down for a couple of hours. The data on
the disk will be restored but will not be available until 03/08.

03/15: Cochise.cs became very slow from about 1am to 3am. We determined
that the nightly dumps, which run at that time, had found some bad
sectors on one of the disks and were generating the high system load
while reading that disk. We will reformat or replace the disk as soon as
possible.

03/15: Pasteur.eecs, the Instructional email server, stopped receiving
email at about 2am; service was restored by 4am.

April 1996
----------

04/10: Franklin.CS and Parker.CS were rebooted to clear high load
averages that were caused by numerous users and zombie processes. The
zombie processes are most often attributed to processes locked in device
wait, which in turn is most often caused by network latency and delayed
responses from other file servers.

04/15: Cory.EECS froze up sometime on the afternoon of Sunday 4/14;
service was restored by about 9am Monday morning.

Parker.EECS was very slow on Monday morning. It had a load average of 514
(yes!) caused by many runaway "http" processes (Parker.EECS is the
Instructional WWW server). The WWW server was disabled for most of the
day while we resolved the problem.

The Ara cluster of DEC workstations in 119 Cory was down until about
5:30pm while a software problem on Ara.EECS was repaired.

04/18: From 9am to noon, 3 Cory Hall subnets (133, 134, 138) were
unexpectedly disconnected for repairs to routers. The work was done by
campus DCNS staff, and apparently there was miscommunication between EECS
and DCNS staff about the scheduling. The repairs disconnected the Cory
Hall Instructional labs and Instructional *.EECS systems from the rest of
the world (including Soda Hall). Instructional email and many home
directories were inaccessible, and email deliveries were delayed, during
that time.

04/24: Po.eecs crashed at about 2am with a disk failure; we had a new
disk in place by 10am and began to restore the data. Po was down until
3pm, then down again from 5pm to 6pm as we fixed this. The /home/g
filesystem was unavailable until about 9pm.

May 1996
--------

05/16: The /home/pp disk failed without warning overnight. At 8am, we
began file restoration from tape to another disk. The CS186 accounts were
located on that disk, and it was awful luck that CS186 had an exam at 5pm
that day. Unfortunately, the disk contained 900 MB of data and
restoration of files was not completed until about 6pm.

(end of document)