Yesterday (Saturday) around 11pm, the disk for our UNIX home directories, /home/tmp and /home/submit filled up. We added space at about 9am on Sunday morning, which fixed the problem. Please see "Symptoms when UNIX home directories are missing" below, which also describes the bad side effects of a full home directory. File content that could not be saved may, unfortunately, be lost. Previous versions of files may have been backed up in the hidden .snapshot subdirectory of your home directory. You can search it with a UNIX command such as: find ~/.snapshot -name my-lost-file.py
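As a minimal sketch of that recovery step, assuming the snapshots are readable under ~/.snapshot as described above, the search and a cautious copy-back could look like this. The function names find_snapshot_copies and restore_copy are hypothetical helpers written for this example, not installed commands, and the destination file name is only an illustration.

```shell
# Sketch only: both helper functions are hypothetical, not installed commands.

# List every snapshot copy of a file, searching by name.
#   $1 = snapshot directory (assumed to be "$HOME/.snapshot")
#   $2 = file name to look for
find_snapshot_copies() {
    find "$1" -type f -name "$2" 2>/dev/null
}

# Copy one snapshot version back under a new name.
# Refuses to overwrite an existing file, so nothing current is lost.
#   $1 = path of the snapshot copy, $2 = destination path
restore_copy() {
    src=$1
    dest=$2
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    cp -p "$src" "$dest"
}

# Example use (paths are illustrative):
#   find_snapshot_copies "$HOME/.snapshot" my-lost-file.py
#   restore_copy <one of the paths printed above> "$HOME/my-lost-file.recovered.py"
```

Copying to a new name rather than over the original leaves you free to compare the recovered version with whatever is currently in your home directory.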
The instructional license server (known as License-srv, Scotland and 188.8.131.52) crashed due to a hard drive failure on Sunday night, March 26. It serves these products:
- Synopsys (Sentaurus, etc) - restarted Monday @3:30pm
- Xilinx - restarted Monday @3:30pm
- Mentor (ModelSim, Calibre) - restarted Monday @3:40pm
- ADS - restarted Monday @4pm
- Keil - restarted Monday @4pm
- NI (LabVIEW) - restarted Monday @4pm
- Maya - restarted Monday @8pm
- Renderman - restarted Monday @8:20pm
- Nvidia (MentalRay) - (new, still in progress)
The license server went down again at 3am on Tuesday, due to a normal automated reboot for updates that required manual intervention to complete. We restarted it at about 10am and corrected that problem. There was additional intermittent downtime until about 2pm while we installed patches and tested some firewall settings.
Updated Feb 24, 10:20pm: The department server for instructional UNIX home directories crashed at about 6pm (Fri Feb 24) and was restored at about 8:30pm. Course WEB sites on http://inst.eecs.berkeley.edu were also down. During that time, you could not login, or access to your files was frozen if you were already logged in. See the EECS department notice for current information about the failure. Please see "Symptoms when UNIX home directories are missing" below for how this may affect you.
After software upgrades to this server in January, WEB sites using CGI (programs to generate the WEB pages) in user accounts may need modifications. We are working to resolve some of the administrator-owned CGIs (Jan 2 2017). See below for coding tips.

Also, the syntax of the .htaccess files has changed. .htaccess files can be used to implement password-controlled access to the WEB contents of a UNIX directory under your public_html directory. The changes are to comment out (#) several directives if you have them, and replace them as shown below:

    #AuthzLDAPAuthoritative off
    #AuthType AuthCAS
    AuthType CAS
    #AuthLDAPURL ldap://ildap3.EECS.Berkeley.EDU/ou=people,dc=EECS,dc=Berkeley,dc=EDU?uid TLS
    AuthLDAPURL ldaps://ildap3.EECS.Berkeley.EDU/ou=people,dc=EECS,dc=Berkeley,dc=EDU?uid

Without those changes the content of the directory is inaccessible via the WEB server (https://inst.eecs.berkeley.edu). However, owners of the WEB sites (ie, instructors of classes, etc) can access the files directly by logging into the account on an Instructional UNIX computer.

Here are some notable differences between the old and new servers that affect CGI scripts:
- old: SolarisX86, new: Linux Ubuntu
- any compiled modules may need to be replaced
- change /usr/sww/bin/git to /usr/bin/git
- change /usr/sww/bin/ps2pdf to /usr/bin/ps2pdf
- change /usr/sww/bin/groff to /usr/bin/groff
- change /usr/local/samba/bin/smbclient to /usr/bin/smbclient
- change /bin/sort to /usr/bin/sort
- change /bin/uniq to /usr/bin/uniq
- change 'sort +6' to 'sort -k 7', for example (the old +N form counts fields from zero, while -k counts from one)

The version of Perl has changed from version 5.8 to version 5.18.

Here are sym links we have added for backwards compatibility:
    /bin/ssh -> /usr/bin/ssh
    /usr/local/bin/perl -> /usr/bin/perl
    /pool/www/cgi-bin -> /pool/cgi-bin
    /var/www/cgi-bin -> /pool/cgi-bin
    /export/www/cgi-bin -> /pool/cgi-bin/
    /usr/lib/cgi-bin -> /pool/cgi-bin
    /usr/local/httpd/cgi-bin -> /pool/cgi-bin/

Please ask firstname.lastname@example.org for help if needed.
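To find CGI scripts that still depend on the old Solaris locations listed above, a recursive grep can help. The following is a rough sketch, not a complete audit: the function name find_old_paths is hypothetical, the pattern covers only the paths named in this notice plus the obsolete 'sort +N' form, and it assumes a grep that supports -r (GNU and BSD grep both do).

```shell
# Sketch only: find_old_paths is a hypothetical helper for this example.
# Prints file:line:text for each line that still uses an old Solaris path
# or the obsolete "sort +N" syntax.
#   $1 = directory holding your CGI scripts
find_old_paths() {
    cgi_dir=$1
    # The /bin/(sort|uniq) branch requires that the preceding character is
    # not a letter, digit or slash, so /usr/bin/sort is not reported.
    grep -rnE \
        '/usr/sww/bin/|/usr/local/samba/bin/smbclient|(^|[^/[:alnum:]])/bin/(sort|uniq)|sort[[:space:]]+\+[0-9]' \
        "$cgi_dir"
}

# Example use (the directory is illustrative):
#   find_old_paths "$HOME/public_html"
```

The function only reports matches and changes nothing, so each hit can be reviewed by hand before editing. grep exits with status 1 when nothing matches, which here means no old paths were found.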
The well-known instructional login server called cory.eecs.berkeley.edu will be down until further notice while we convert it from Solaris to Linux. (Jan 17) You can use one of our other general-purpose UNIX login servers:
- ashby.cs.berkeley.edu
- cedar.cs.berkeley.edu
- derby.cs.berkeley.edu
- gilman.cs.berkeley.edu
- hearst.cs.berkeley.edu
- oxford.cs.berkeley.edu
- solano.cs.berkeley.edu
- ward.cs.berkeley.edu
5:30pm: There have been intermittent network outages today. Wired and wireless networks are down in Sutardja Dai Hall (SDH). (CS10 relocated to 277 Soda for the day.) Login shell connections between computers in Soda, Cory and SDH have failed repeatedly after 4pm. Department WEB servers generally have stayed up and the login connections seem to be stable again. We'll post more information here when we know it. We are watching for updates on https://iris.eecs.berkeley.edu/.
The imail.eecs.berkeley.edu mail server and the related WEB mail client will be down for as long as 30 minutes starting at 3pm today. This is to add memory and disk capacity to the new server. The imail.eecs.berkeley.edu mail server and the related WEB mail client were upgraded on Wednesday April 13. There is now a new WEB mail interface: it is Roundcube instead of Squirrelmail. You may need to re-select your folders in Roundcube. To do that, select "SETTINGS" from the top right corner (or "Manage Folders" from the bottom left gear icon), click on the Folders setting and click in the boxes for the folders you want to add. Those folders are stored in the Mail subdirectory of your UNIX home directory, just as they were for Squirrelmail. You may also need to import your old INBOX, depending on your procmail (~/.procmailrc) filtering. In .forward files, the old setup tolerated a forward slash to specify delivery to a local address, but the new system (legitimately) wants the backslash, for example: \cs199-zzz, email@example.com Please ask firstname.lastname@example.org for help if needed.
Linux Ubuntu servers ashby.cs, derby.cs, hearst.cs, and gilman.cs are denying logins now, apparently due to loss of their NFS connections to the departmental home directory server. Our other computers are not affected. Students can find lists of the other computers at https://inst.eecs.berkeley.edu/connecting.html#labs. We expect to have this fixed on Monday morning.
The instructional email server (imail.eecs.berkeley.edu) is up. It is receiving incoming messages and relaying outgoing messages for our UNIX computers as normal. However, our WEB mail client (http://imail.eecs.berkeley.edu) will be down for maintenance until Wednesday afternoon. You can still read your email using an IMAP client such as Thunderbird or pine. We expect to have it running again by 5pm Wednesday (March 16). Please see https://inst.eecs.berkeley.edu/connecting.html#email for more information about our email server.
It lasted less than 2 minutes, and no computers needed to be rebooted. The posting was: Save your files often while working on Friday night. On Friday March 11 at 10PM the EECS department network staff will upgrade the switch that provides connectivity to our file servers. All access to your home directories on UNIX will be unavailable for 5-10 minutes. The IT staff may need to reboot some computers after that to re-connect to the file servers. For details, please see https://iris.eecs.berkeley.edu/news/15374-scheduled-storage-outage-fri-31116
Update Jan 8 3pm: This is complete; all UNIX logins are re-enabled (some of our computers will remain down until classes start on Jan 19). -------------------------------------------------------------------- All logins to the Instructional UNIX computers will be blocked from 6pm Thursday - 2pm Friday while we migrate the home directories to a new department file server. Existing users will be logged out. The old home directories will become read-only at about 10am Friday. We will need to log everyone out of the UNIX systems and reboot some of them. When we allow logins again, it will be using the new home directories, with all of the contents of the old home directories. The instructional email server (http://imail.eecs.berkeley.edu) will be turned off from about 10am-2pm on Friday. Email sent to it during that time will be queued by the sending mail server and delivered when imail.eecs comes back up. Questions to email@example.com or 510-643-6141.
The department will be closed from noon on Dec 23 through Jan 4. During the winter break, we will monitor inst@eecs periodically. Non-critical requests will be deferred until Jan 4. Instructional labs will be locked, but several login servers will remain accessible over the net. For details, please see http://inst.eecs.berkeley.edu/End-of-Semester.
After a planned network outage (https://iris.eecs.berkeley.edu/news/14334-network-maintenance-outage-tuesday-128), we had to reboot cory.eecs and inst.eecs (this WEB server).
Dec 7, 11:25am: The license server has been restored to service. Its RAID controller had crashed.
Dec 6: The instructional license server scotland.eecs (aka license-srv.eecs) went down earlier today (Sunday). The server runs these licenses:
- LabVIEW (for campus)
- Synopsys, including Sentaurus, SWB, HSPICE (for EECS)
- Mentor Graphics, Xilinx, Renderman (for EECS)
Instructional staff will be on site to look into it starting at 9am Monday.
The network in Cory Hall will be down tonight from about 9-9:10pm. This will disconnect the lab computers and servers from your home directories, so PLEASE SAVE ALL YOUR WORK before that. This includes the wireless network. For details, please see https://iris.eecs.berkeley.edu/news/14174-cory-hall-network-maintenance-outage
The root drive of this WEB server (or more specifically, the root drive of its host server) filled up at about 10pm last night and stalled the WEB server. Service was restored at 11am today.
Please see http://inst.eecs.berkeley.edu/cgi-bin/pub.cgi?file=lab-safety.help for information about seeking help while you are in one of our labs.
On Sep 18 7pm, the campus network started having intermittent outages. Calnet was restored the morning of Sep 19. Some campus network sites are still inaccessible. For more information, please see http://systemstatus.berkeley.edu/. CS and EECS systems have remained accessible to one another.
The campus IT group (IST) has announced scheduled downtime for all CalNet logins on Sunday Apr 12 (6-8am) and Sunday May 17 (time TBA). For more information, please see http://systemstatus.berkeley.edu/.
If you find that you can't login at the graphics consoles in the Soda Hall labs (rooms 271, 273, 275, 277, 330, 349 Soda), that may be caused by an old .profile file in your home directory. The new Ubuntu 14 operating system on those computers is intolerant of some bash commands in older versions of that file. You can safely rename that file to archive it (ie "mv .profile .profile-old"). Users with this condition are able to login to the same computers using 'ssh' from another computer and using the text-based console (type ctrl+alt+F1 to kill the graphics console).
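If you would rather keep a dated copy than reuse the name .profile-old, the rename can be wrapped as below. This is only a sketch: archive_dotfile is a hypothetical helper for this example, and the timestamp suffix is an assumption about how you might want to name the archive.

```shell
# Sketch only: archive_dotfile is a hypothetical helper for this example.
# Renames a login file to a dated name instead of deleting it, and prints
# the new name so you can find it later.
#   $1 = file to archive (for example "$HOME/.profile")
archive_dotfile() {
    file=$1
    if [ ! -e "$file" ]; then
        echo "no such file: $file" >&2
        return 1
    fi
    backup="$file-old-$(date +%Y%m%d-%H%M%S)"
    mv -- "$file" "$backup" && printf '%s\n' "$backup"
}

# Example use, from an ssh session or the text-based console:
#   archive_dotfile "$HOME/.profile"
```

Because the old file is renamed rather than removed, any settings you still need can be copied back into a fresh .profile one line at a time.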
Rooms 330 and 349 Soda have been renovated and will be re-opened on January 25, 2015. These rooms are intended primarily for EECS upper division classes and student collaboration. This work was funded by the Dept of EECS along with the renovations to 337 Soda, which was recently converted to a collaboration space with cafe-style seating for upper division classes. The renovations in 330 and 349 Soda include:
* new paint, new whiteboards
* new carpet, network, tables, overhead projector (330 Soda only)
* new Dell Optiplex 9020 PCs, with quad-core HT 3.4GHz i7 cpu, 32GB RAM (4x8GB), 500GB SATA drive, GeForce GT 740 1 GB DDR3 PCI Express 3.0 x16 Cuda GPU, ASUS 24" LCD with integrated speakers
This WEB server and the EECS Instructional computers were inaccessible from Jan 3-7 during a network upgrade that took longer than expected. For status updates, please see http://iris.eecs.berkeley.edu/, http://status.eecs.berkeley.edu/. Email sent to imail.eecs was rejected from Jan 3-7; the sending computer will queue the email and resend it when the mail server comes back up (at about 2pm on Jan 7).
Between Dec 22 and Jan 19, the instructional labs will be locked and most of the workstations will be turned off. For a list of servers that will be available, please see http://inst.eecs.berkeley.edu/share/b/pub/html/End-of-Semester.html. On Saturday Jan 3, most of our computers will be inaccessible during network maintenance. This includes the Imail mail server and the Inst WEB server.
The Instructional email/SquirrelMail server (https://imail.eecs) and the SVN server (https://isvn.eecs) were down from about 8am-11:30am. Email that is sent to the mail server during that time is queued by the sender for later delivery. Some workstations in the labs also denied logins. These problems were on systems that had not been fully decoupled from our LDAP service, which went down again this morning. We are working to snuff out the remaining dependencies on LDAP.
(10:30am) This work has been completed. "inst.eecs" has been moved to a new server. We plan to shut this WEB server down for maintenance for a few minutes between 10am and 10:30am tomorrow (Tuesday Nov 18). This is to swap the disks to another server, in response to several recent crashes of the existing server. We suspect from the kernel logs that the crashes are caused by bad memory, or possibly a bad motherboard. The server is a Dell 1850 circa 2007. The server rebooted itself 3 times on Nov 17 (9:30am, 1pm, 4pm) and once last Friday (6pm), with downtimes of 10-30 minutes each. This was unrelated to the LDAP problem (below), which has not reoccurred since Nov 12.
Nov 14: For an analysis of the severe downtime events this semester, please see "Analysis of the Repeated Downtime Events" in https://inst.eecs.berkeley.edu/~inst/reports/?file=Fall_2014.pdf. Please post any questions to firstname.lastname@example.org. Thank you for your patience.
Nov 13: We still have a delay in changing passwords, so we recommend that you keep using the one you have for another week or so (we'll update that here.) Nov 12: (12:30pm) The EECS Instructional UNIX systems are stable again, after several weeks of periodic downtime. If you were unable to login to any of our UNIX systems recently, please try it again. If it still fails please tell us (email@example.com) which computer you are trying. Our password server has been down a lot lately, and that confuses people into thinking they have forgotten theirs. We are updating local password files on our computers for the time being. We are testing a new LDAP service (retiring SUN LDAP, implementing OpenLDAP) and will order new servers for it. The current servers are circa 2002, which has contributed to their instability. See below for the symptoms and history of the problem.
(10:45am) An ongoing problem with NFS is preventing the course WEB pages from being accessed through this WEB server. (12:30pm) This problem has been fixed.
The http://inst.eecs WEB server was down from about 9-10:30pm tonight because the inst.eecs server rebooted itself. This seems to be unrelated to the LDAP problem (below), which has not reoccurred since 10:30am today.
(Tue Nov 11) The LDAP server went down again this morning and, although we thought we'd eliminated the dependence of our computers on it, that still (unexpectedly) broke the NFS link to the home dirs. We'll get that fixed tomorrow. In the meantime, we'll try to keep LDAP running. (Mon Nov 10) The LDAP service was up and down this afternoon. We have installed local, static files on all of our UNIX systems so that we can take the load off of the LDAP server. While this will impose delays in any password changes, we hope it will keep things stable while we diagnose or replace it. The Imail mail server and SquirrelMail (http://imail.eecs) hang up when LDAP or NFS are down. Email that is sent to the mail server during that time is queued by the sender for later delivery.
The Instructional UNIX systems lost LDAP and NFS (user identification and home dirs) again on Sunday Nov 9 from about 5pm - midnight. The Instructional UNIX computers (Linux, Solaris, MacOSX) and WEB servers (Inst, ISVN, SquirrelMail) were also down. Please see below for symptoms and explanation. The server has been failing repeatedly; LDAP just stops answering, and it takes 30-60 minutes to restart it. We don't know why it got so bad this semester. We have tried tuning the timeouts and monitoring the client connections. We are testing a new version of the LDAP server software, on a newer computer with more RAM.
The Instructional UNIX systems lost both LDAP and NFS services (passwords and home directories) at about 3:45pm on Wednesday and were unstable until about midnight. The effect on our users was frozen UNIX login sessions or the inability to login, inaccessible home directories and inaccessible WEB sites on http://inst.eecs. The next day, we implemented work-arounds while we debug it. We regret the negative impact that this has had on our students.
The Instructional UNIX systems lost both LDAP and NFS services (passwords and home directories) from about 11am - 10:45pm today. The effect on our users was frozen login sessions or the inability to login. We had to restart a jammed LDAP server, which can take an hour as it rebuilds its database. This has occurred previously this semester, and we are trying to debug it.
The Instructional UNIX systems lost their LDAP password service at about noon today. The service was restored by 2pm (changed from 1pm...), as the redundant LDAP servers rebuilt their databases. The effect on our users was frozen login sessions or the inability to login. It also caused loss of access to some WEB pages on http://inst.eecs.berkeley.edu and delays in email delivery through imail.eecs.berkeley.edu.
The Instructional UNIX systems lost both LDAP and NFS services (passwords and home directories) from about 10am-10:20am today. The effect on our users was frozen login sessions or the inability to login. This was caused by a loss of connection to one of our LDAP servers and the time delay for the NFS server to automatically cut over to our redundant LDAP server.
(July 29) There was a network problem from about 9:50am to 10:55am today that prevented our users from accessing their UNIX homedirs and Airbears. For more information: https://iris.eecs.berkeley.edu/news/11953-unplanned-outage-wired-and-wireless (July 14) EECS network staff are performing load testing today to help prevent additional incidents as below. There will be intermittent moments of poor network performance at the EECS border as this occurs. For more information: https://iris.eecs.berkeley.edu/news/11893-intermittent-network-slowness-today (July 9) The EECS Instructional systems experienced intermittent lost connections (for periods of a minute or so every few hours) to the UNIX home directories between July 3 and July 8. The symptom was that it would be slow to login while waiting for initial access to the home directories, then you might get 'command not found' errors if the "dot" files in your home directory had failed to run and set your path. The server support staff corrected this at about 3pm on July 8. For more information: https://iris.eecs.berkeley.edu/news/11813-degraded-performance-for-some-project (June 25) EECS computers experienced intermittent network interruptions of up to several minutes between June 23 and June 25. There were dropped connections between the EECS network and the outside world (including the rest of campus and users on Airbears who are connected to EECS computers). This affected communication in both directions. The EECS network group posts updates at https://iris.eecs.berkeley.edu/news/11673-packet-loss-at-eecs-network
A major security risk has been identified in Microsoft Internet Explorer. Please use Firefox or another browser until a patch has been released. More information: https://technet.microsoft.com/en-us/library/security/2963983.aspx http://blogs.technet.com/b/srd/archive/2014/04/26/more-details-about-security-advisory-2963983-ie-0day.aspx http://www.fireeye.com/blog/uncategorized/2014/04/new-zero-day-exploit-targeting-internet-explorer-versions-9-through-11-identified-in-targeted-attacks.html http://www.usatoday.com/story/tech/2014/04/28/internet-explorer-bug-homeland-security-clandestine-fox/8409857/
On Tuesday April 15, 199 Cory will reopen as the newly renovated SanDisk Computing Lab. All students in EECS classes are welcome to use this comfortable and collaborative space, which includes 8 new PCs (Windows, 16GB RAM), seating for groups and laptop users and a large LCD display that you can use with your own portable device. Please also join us for the Opening Ceremony with SanDisk on Friday April 18 at 11am.
The EECS networks were restored to service at about 8:45pm. WEB pages on http://inst.eecs were accessible again at 9pm. star.cs.berkeley.edu was rebooted at 12:30am (Sunday) to reset NFS. The original announcement (March 27): All EECS computers will be inaccessible on Saturday March 29 from about 10AM - 6PM during scheduled maintenance to the EECS networks. The EECS network will be down for maintenance on Saturday, so our computers will be inaccessible from the network and from one another. Any users on our systems would experience interruptions and possible loss of data. This includes our email server (imail.eecs) and WEB server (inst.eecs). Email that is sent to our server during that time will be queued by the sender for later delivery. For more information about the EECS network maintenance, please see http://iris.eecs.berkeley.edu. Here's the sign for the labs.
All EECS instructional computers will be offline from Friday Jan 10 at about 5pm through Monday Jan 13 at about 10am. Exceptions: Our email server (imail.eecs) and WEB server (inst.eecs) will be down only on Saturday Jan 11 from 10am-6pm. Email that is sent to the server during that time will be queued by the sender for later delivery. The EECS network will be down for maintenance on Saturday, so our computers will be inaccessible from the network and from one another. Any users on our systems would experience interruptions and possible loss of data. For more information about the EECS network maintenance, please see http://iris.eecs.berkeley.edu.
Starting the morning of Dec 19, some of our UNIX systems have denied logins or been missing the home directories. This is caused by a failure in LDAP authentication caused by some new certificates. This was fixed by 1pm today; please notify "firstname.lastname@example.org" (510-643-6141) if you are still unable to login to our systems. You can list our login servers at http://inst.eecs.berkeley.edu/cgi-bin/clients.cgi?choice=servers.
Starting the morning of Dec 5, logins at some workstations in 200 SDH (Macs) and 2xx Soda (Linux) experienced delays and timeouts. It is most noticeable with Firefox and Chrome WEB browsers. If the WEB site you want times out, you can usually get it after clicking "Reload" a few times. The browsers may refuse to close when you try to logout. The symptoms are more severe on the Macs than on the Linux systems. We are trying to diagnose this and will get help from the dept network staff on Friday. We'll post updates here.
The EECS network has been restored to service. Thanks to the EECS network staff for discovering the cause. The network problem occurred from 1:30am - 3:30pm today. It caused delays and timeouts for logins and for access to WEB servers and email servers on EECS computers. AirBears was down. For updates from the EECS network staff, please see https://iris.eecs.berkeley.edu/news/10633-eecs-network-in-a-degraded
If you have received this email, please DELETE it without clicking on the link: ===================================================================== From: UC Berkeley EECS Subject: validate and upgrade to our new Mail hub system. This Email Is from the UC Berkeley EECS Support. We Will Be Making Some Vital E-Mail Account Maintenance Today 12th of November 2013. To avoid your e-mail account been terminated during this upgrade, Kindly Click Here and follow the instructions to validate and upgrade to our new Mail hub system. ===================================================================== This is a phishing scam and is NOT from EECS administrators. You may have found that obvious because of the addresses in the full email header: From email@example.com Wed Nov 13 16:59:23 2013 To: Recipients <firstname.lastname@example.org> From: UC Berkeley EECS <email@example.com>
Printers that are spooled via the "iprint.eecs" server were down from about 8am Saturday (Nov 2) through 10:30am Monday (Nov 4). This includes the printers:
- lw199@iprint (lw199a@iprint, lw199b@iprint) in 199 Cory
- lw119@iprint in 119 Cory
- lwh30@iprint in 200 Sutardja Dai
- lw274@iprint (print274a@iprint, print274b@iprint) in 274 Soda
- lw330@iprint in 330 Soda
- lw349@iprint in 349 Soda
All print jobs that were queued during that time have been canceled, and will not count against the user's print quota.
We are transitioning to new Synopsys licenses during October. The Tecplot program is replaced by Sentaurus Visual (Svisual), and some functions of the INSPECT scripting language will no longer work. For details please see http://inst.eecs.berkeley.edu/cgi-bin/pub.cgi?file=synopsys.help. To give GSIs, researchers and students time to test the new Svisual features, we'll run the new licenses from about noon-6pm each weekday until Oct 25, 2013. After that, Tecplot will become obsolete. Please report questions or problems to firstname.lastname@example.org.
The Instructional email server refused incoming mail from 8pm on Thursday until 10am on Friday. All backlogged mail seems to be delivered now (10:30am Friday). Email that is refused is cached on the senders' computers and retried periodically until it gets through, so emails were delayed but not lost.
There was a campus power failure yesterday. (http://newscenter.berkeley.edu) EECS Instructional services were restored by noon today (Oct 1). Please report any problems to email@example.com
At about 8:15am, a transformer near Evans Hall broke and caused a power outage in Cory, Soda, Etcheverry, Sutardja Dai and other buildings. Soda Hall was closed until about noon. The computers, servers and networks for the instructional labs were restored to service by 2pm. There is a notice about the power failure on the campus IST Service Status page at http://ucbsystems.org/. This power outage was a result of an explosion in a manhole outside of Evans Hall which damaged several high-power electrical cables. Campus reported that up to 15 buildings lost power completely and up to 30 more lost partial power.
Please see http://inst.eecs.berkeley.edu/End-of-Semester.
The National Instruments bus came to campus for the day and was visited by about 75 faculty, students and staff.
Yahoo donated 24 servers to instructional computing and bought pizza lunches for EECS students!
Please see http://inst.eecs.berkeley.edu/cgi-bin/pub.cgi?file=lab-safety.help for information about seeking help while you are in one of our labs.
Symptoms when UNIX home directories are missing:
- when you try to login the screen freezes
- you see the error message "home directory is /"
- session hangs up if you try to 'ssh' into an Instructional computer
- unable to read WEB pages from http://inst.eecs.berkeley.edu
- lots of annoying "NFS timeout" error messages on your screen
While the server is down, you may not be able to logout in our labs because you can't type any commands. On a SunRay, even turning it off doesn't log you out. The support staff check the labs after events like this to be sure everyone gets logged out. We also post information about the problem at http://inst.eecs.berkeley.edu to help students find out when the problem has been fixed. So all you can really do in this case is wait until the problem is fixed, then go back to the lab (or login to the SunRay server for that lab) and log yourself out, or let us log you out. We disable email receipt and relaying through imail.eecs when the home directory server is down. No mail is lost. Computers that send mail queue messages that are not accepted by a remote server, and they resend the messages periodically until they are received.