How to Set up a Home Page
on the EECS Instructional Computers

Topics:
  • What is a "home page"?
  • How to Set up a Class Home Page
  • WEB services for Courses
  • HTML Code Sample: a Simple Homepage
  • HTML Code Sample: index.html for a Class WEB Page
  • HTML Code Sample: archives.html for a Class WEB Page
  • How to "Publish" using an HTML Editor
  • Debugging Tips
  • Updating your .htaccess files
  • Features:   SSI, CGI, PHP, GD, sym links, redirect
  • Restricting Access to your WEB Site
  • Usage Policies for Information Servers
  • General References about WWW utilities
  • Apache HTTP Server Project Home Page
  • Apache Manual for this WWW Server


    What is a "home page"?

    A "home page" is a text document that may also contain references ("links") to other home pages and to text and graphics documents. It is generally a single page or the top level of a hierarchy of pages on a specific subject (such as you). The source text file for it may be browsed by selecting the "View Document Source..." option of your
    Web browser. The special features of a WWW home page are defined by simple HTML commands that are typed into the source text file.


    How to Set up a Class Home Page (for instructors)

    There are several WEB services for courses.   Our recommendation to EECS instructors is:

    1. create a course site on the EECS Inst server
    2. use the EECS Inst site for posting public content
    3. use the EECS Inst site for dynamic content (CGIs)

    In addition, if you wish to communicate on-line with the students and post password-protected content:

    1. create a course site on the campus course management system: bCourses
    2. make a publicly viewable information page on bCourses, perhaps by displaying the Syllabus or a link to your class site on the EECS Inst WEB server
    3. advise students to login there and join your course site

    For a summary of the course WEB services at UCB, see WEB services for courses below.

    The EECS Instructional WEB server:

    EECS Instructional Support manages computer accounts for EECS course instructors. These are known as instructor or master accounts. The instructor accounts on our UNIX computers store the files that are displayed via the EECS Instructional WEB server (http://inst.eecs.berkeley.edu).

    To edit the WEB pages, instructors log on to an Instructional UNIX login server such as cory.eecs.berkeley.edu using SSH or PuTTY, or at a workstation in an Instructional UNIX lab.   (inst.eecs.berkeley.edu does not allow direct logins.)

    We can use SSH keys to let a person log in to the course account, and reach the related WEB pages on UNIX, with a private password. This is a useful feature for GSIs, because it allows the GSI to log in or copy files there. The GSI can run the command ~kevinm/bin/sshkey-maker on login.eecs (research UNIX server) or ~inst/bin/sshkey-maker on cory.eecs (instructional UNIX server). It emails the SSH public key to us, and we add the public key to the ~/.ssh/authorized_keys file in the course account.

    Instructors and TAs may create a home page for a class, and we will add a reference to it from the list of courses on the Instructional WEB site. The site files can be edited by logging into the associated account on the Instructional UNIX computers (cory.eecs.berkeley.edu, etc) or by other publishing techniques. We encourage instructors to store their class WEB sites in the instructor's UNIX account. That makes the WEB site accessible via http://inst.eecs.berkeley.edu, and it means we'll retain the data for the future.

    To login to the instructor account:

    There are 3 possible ways to access the files in the instructor's UNIX account. Each uses a different password:

    SSH password
      To set or change it:
      • generate an SSH key and send us the public key (see the steps below)
      Allows you to:
      • login to an Instructional server using ssh, PuTTY, scp or WinSCP
      • use UNIX command-line editors (vi, emacs) or transfer files

    UNIX "LDAP" password
      To set or change it:
      • ask inst@eecs to set it
      • login to update.eecs.berkeley.edu (UNIX) to change it
      Allows you to:
      • login at Instructional UNIX workstations
      • login to the course WEB site (https://inst.eecs.berkeley.edu)

    Windows password
      To set or change it:
      • ask inst@eecs to set it
      • login and press CTRL-ALT-DEL to change it
      Allows you to:
      • login at Instructional Windows workstations
      • "share" the UNIX home directory from another account on an EECS Windows system by mapping a drive to the folder "\\napinst\{account-name}" (such as "\\napinst\cs123").   Select "Connect using a different user name" and log on using the Windows instructor account name (ie "cs123") and password.   Note that this has to be the only user that is connected to "napinst" from your current Windows session.

    We will reset the UNIX and Windows passwords for instructors so they can login to the account at a workstation in our labs as well as over the network (using 'ssh' from UNIX or 'PuTTY' from Windows).   You need this password to login to a course WEB site using the "Login..." buttons (located at https://inst.eecs.berkeley.edu/~COURSE/login, where COURSE is the course account name, such as "cs123").

    Instructors and TAs can also use their own "SSH" passwords to login to the instructor account.   (This password does not allow you to login at a workstation; you need the UNIX "LDAP" password for that.)   You can generate an SSH public key and send it to us so we can install your SSH public key there.

    1. To enable "SSH" logins from UNIX, you can use login.eecs (the UNIX server for EECS research accounts):

      login to login.eecs.berkeley.edu
      type the command ~kevinm/bin/sshkey-maker

    2. To enable "SSH" logins from Windows:

      open the Putty program
      generate a public key (see putty.pdf)
      email the public key to inst@eecs.berkeley.edu

    We will install it in the instructor account (using /share/b/adm/bin/sshkey-installer) and respond to you.

    Please ask inst@eecs.berkeley.edu if you need help logging in.

    Course WEB sites must meet several requirements:

    1. Unattended course WEB sites from previous semesters should be deprecated so that they do not appear to be current information.
    2. Instructors of other courses need to see the contents of the WEB site from previous semesters, to verify what the current students should have learned.
    3. Homework and test solutions from previous semesters may need to be blocked by the current instructor.
    4. The EECS CNIL faculty committee has requested that EECS Instruction support these requirements.

    Course WEB sites have a specific UNIX directory structure:

    To support these requirements, authors of course WEB sites should follow these practices:

    • The WEB pages for each semester should be installed in separate directories, ie
                the "Spring 2005" site goes in ~cs123/public_html/sp05
                the "Fall 2005"   site goes in ~cs123/public_html/fa05

      The WEB files for each semester should be stored under public_html in a subdirectory that has a standard 4-char lowercase name such as fa04, su04 or sp05.   (The definition is (fa|sp|su)[0-9][0-9].)   This is required for the automated maintenance of these sites.

    • The goal is for each semester site to be a self-contained, relocatable package that still works if it is tarred up and moved. So links to local files should be relative (ie './lec1.pdf') rather than absolute (ie 'http://inst.eecs/~cs123/lec1.pdf').

    • public_html/index.html redirects to the current semester directory for the class.   It is modified automatically by ISG.

    • public_html/archives.html is a record of the locations of WEB sites from previous semesters.   It is modified automatically by ISG.

    • For example:
          % cd ~cs164
          % ls -lad public_html public_html/*
          drwxr-xr-x   7 cs164    cs164   4096 Jul 13 16:16 public_html
          -rwxr-xr-x   1 cs164    cs164   1229 Jul  8 11:44 public_html/index.html
          -rwxr-xr-x   1 cs164    cs164   1205 Jul 13 16:07 public_html/archives.html
          drwxr-xr-x  11 cs164    cs164   4096 Jul 13 16:12 public_html/fa04
          drwxr-xr-x  12 cs164    cs164   4096 Jul 13 16:16 public_html/fa02
          drwxr-xr-x  14 cs164    cs164   4096 Jul 13 16:14 public_html/sp04
          drwxr-xr-x   9 cs164    cs164   4096 Jul  8 11:36 public_html/sp05
          drwxr-xr-x   2 cs164    cs164   4096 Jul  8 11:43 public_html/fa05
          -rwxr-xr-x   2 cs164    cs164   4096 Jul  8 11:43 public_html/fa05/index.html

    • If you have any files that are shared by different semesters, it is best to either copy them into each semester directory (if they aren't too big) or install them in a common directory and install a sym link to that from each semester directory, ie
          ~cs123/public_html/common
          ~cs123/public_html/sp05/common -> ../common
          ~cs123/public_html/fa05/common -> ../common
      References would be relative, ie <a href='./common/lec1.pdf'>. That way, it will be clear in the future that these files were outside of the semester package, and the entire 'common' subdirectory could be copied into the semester directory to make it work again elsewhere.

    • A UNIX command for copying the previous semester WEB site:
          cp -rp ~cs123/public_html/sp05 ~cs123/public_html/fa05
      and this UNIX command edits the file that redirects to it:
          (echo ':g/sp05/s//fa05/'; echo ':x') | edit ~cs123/public_html/index.html

    • If you don't want a previous semester site to be readable, just block access with 'chmod 700 sp05', etc.   We can also set up password-controlled access to the previous sites.   Please ask inst@eecs.berkeley.edu for help if needed.

    • If your class WEB site is really somewhere else (not in the instructor's account on http://inst.eecs.berkeley.edu), you should still create a subdirectory for the semester, then install an index.html file in it that redirects to your alternate site. For example:
          ~cs123/public_html/fa05/index.html
      could simply contain:
          <META HTTP-EQUIV='Refresh' CONTENT='0;URL=http://myserver.berkeley.edu/~mysite'>
      This index.html will remain unchanged on the Instructional server in future semesters, so we will always have a reference to where the WEB site was located.
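
    The practices above (standard semester names, copying the previous semester, a relative sym link to shared files, and updating the top-level redirect) can be sketched together as follows. This is only an illustration that runs in a scratch directory: the helper is_semester_dir is hypothetical and not an ISG tool, the stand-in index.html contains just the redirect line, and 'sed' is used here as a substitute for the 'edit' command shown above.

```shell
#!/bin/sh
# Sketch in a scratch directory; for real use, replace "$site" with
# ~cs123/public_html (cs123 is this document's example account).
site=$(mktemp -d)/public_html
mkdir -p "$site/common" "$site/sp05"
echo "<h1>Spring 2005</h1>" > "$site/sp05/index.html"
echo '<META HTTP-EQUIV="Refresh" CONTENT="0;URL=./sp05">' > "$site/index.html"

# Hypothetical helper (not an ISG tool): succeed only for names such as
# fa04, su04 or sp05.
is_semester_dir() {
    case "$1" in
        fa[0-9][0-9]|sp[0-9][0-9]|su[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

old=sp05
new=fa05
is_semester_dir "$new" || { echo "bad semester name: $new" >&2; exit 1; }

# Copy last semester's site, preserving permissions and timestamps.
cp -rp "$site/$old" "$site/$new"

# Share files through a relative sym link in each semester directory.
for sem in "$old" "$new"; do
    ln -s ../common "$site/$sem/common"
done

# Point the top-level redirect at the new semester. Writing to a temporary
# file first means a failed sed cannot truncate index.html.
sed "s/$old/$new/g" "$site/index.html" > "$site/index.html.new" &&
    mv "$site/index.html.new" "$site/index.html"
chmod 755 "$site/index.html"

cat "$site/index.html"
```

    Because the link target is the relative path "../common", the links keep working if the whole public_html tree is moved elsewhere.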

    Permissions to class WEB sites:

    You typically set the permissions with these UNIX commands, for example:

          % chmod 711 ~/				# your top level home directory
          % chmod 711 ~/public_html 
          % chmod 711 ~/public_html/sp05
          % chmod 755 ~/public_html/index.html 
          % chmod 755 ~/public_html/sp05/index.html
    This allows everyone in the world to read those files, including people who are using a WEB browser and those who are simply logged into an Instructional computer.   If you want users to be able to list the files in a directory:
          - run "chmod 755 directory-name" to set the read bit
          - do not put an "index.html" file in the directory

    For directories, the "1"s in "711" set the execute ("x") bit but not the read ("r") bit.   The "x" bit on a directory allows access to a specific file within the directory, but it does not allow a listing of all the files.   By default, the WEB server looks for an "index.html" file and can read that under "711", but it can't list a directory that has permissions "711".
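
    To see what those modes look like, here is a small sketch that applies them in a scratch directory (a stand-in for a real home directory) and lists the results. The directory and file names are placeholders.

```shell
#!/bin/sh
# Scratch directory standing in for a home directory.
home=$(mktemp -d)
mkdir -p "$home/public_html/sp05"
echo "<h1>hello</h1>" > "$home/public_html/sp05/index.html"

chmod 711 "$home" "$home/public_html" "$home/public_html/sp05"
chmod 755 "$home/public_html/sp05/index.html"

# Show the resulting modes: the directories have x but not r for group and
# others (drwx--x--x), while the file is readable by everyone (-rwxr-xr-x).
ls -ld "$home/public_html" "$home/public_html/sp05" \
       "$home/public_html/sp05/index.html"
```

    Running "chmod 755" on one of the directories instead would change its listing to drwxr-xr-x, which is the mode that permits directory listings.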

    For ways to add security, please see Restricting Access to your WEB Site below.

    Note that old WEB sites may contain homework solutions or other information that the current instructor may not wish to reveal.   In that case, we recommend that the current instructor block access to the old WEB site with a UNIX command such as

          chmod 500 ~cs123/public_html/fa04

    Adding links to Class newsgroups:

    You can include a WEB link to the class newsgroup with this HTML code (using "cs152" as an example):

      <A HREF="news://news.berkeley.edu/ucb.class.cs152">CS152 Newsgroup</A>

    This displays as a link labeled "CS152 Newsgroup".   Access is restricted to on-campus computers, however.   Please see http://inst.eecs.berkeley.edu/connecting.html#news for information about on-campus and off-campus access to the campus news server.

    Basic Class WEB page:

    /share/b/pub/sample.class.html is an HTML file that may be used as a template for a new home page.   Please notify the Instructional Group (inst@eecs.berkeley.edu) if you have a new class home page that you would like us to install.


    WEB services for courses:

    There are several UCB WEB servers that EECS instructors can use to post course materials.   Here is a summary.

    EECS Instructional WEB sites (http://inst.eecs.berkeley.edu/classes-eecs.html)
    All EECS courses have a default WEB site on the EECS instructional WEB server.   The files are stored within the instructor's UNIX account. The contents are authored and maintained by the profs and TAs.   CGIs and directory-level access restrictions can be used.   Files are backed up to tape and archived each semester.   Tech support is from the EECS Instructional IT staff (inst@eecs.berkeley.edu).   See above for details.   Here is an example of a course WEB page (for a non-existent class): inst.eecs.berkeley.edu/~cs123.   We recommend that instructors:

    • list any external WEB sites used by the course on this WEB site, so that there is a permanent record of where the content is for that semester.
    • use this WEB site for content that should be public, permanent and archived.
    • use bCourses (below) for on-line communication with the current group of students.

    EECS Scheduling WEB sites (https://eecs.berkeley.edu/academics/courses/)
    Lists EE and CS courses with links to the course descriptions.   These sites are automatically generated on the EECS department WEB server from campus sources.   They include current and next semesters, and seminar classes (*94, *98).   Instructors can edit department notes for courses on these pages at the My EECS Info link on the EECS department WEB server.   Instructors will see a Courses section with courses that they are currently teaching. Click the pencil icon next to the course name to edit its description. For help about that, please email acg@eecs.berkeley.edu.
    CalCentral   (http://calcentral.berkeley.edu)
    CalCentral is the UCB portal to an increasing suite of integrated Calnet-enabled tools for students and staff, including bCourses, bMail, bDrive and course registration.
    bCourses   (http://bcourses.berkeley.edu)
    bCourses is the UCB learning management system.   It integrates features from the former bSpace, CourseWeb, Blackboard, WebCT and Library ERes services.   bCourses is managed by the UCB Educational Technology Services (https://www.ets.berkeley.edu/, bcourseshelp@berkeley.edu).

    All UCB students and staff can login to bCourses using their pre-existing CalNet ID.   (non-UCB students, please see calnet.help for help.)   When they login, instructors are automatically associated with their current classes (by the Registrar) and are authorized to manage sites for their courses.   Students select courses from the pre-assigned "Courses and Groups" list.   They are assigned to a course if they are enrolled in it or if the instructor has added them using the bCourses "People" tool.

    Please see above for ideas about which content is appropriate for bCourses vs the EECS WEB server. Also see "What's the best choice for an online collaboration tool?" for more information.

    Create a world-readable reference on bCourses:
    If you use bCourses, we recommend that you create a simple reference page there that is "publicly viewable", so that it can be viewed without logging into bCourses.   For instructions, please see "How do I customize visibility options for course content?".   The URL to the public site can be found in the address bar of your browser.   (You can't search a list for it, because the Public Course Index feature has been disabled in bCourses.)   Include that URL in your course home page on the EECS WEB server.   This will create a permanent record of your use of bCourses that semester.

    Bearfacts   (http://bearfacts.berkeley.edu/)
    BearFacts is provided by the Registrar and IS&T.   It contains student information for terms prior to Fall 2016.   In Fall 2016 it was replaced by CalCentral.
    CalShare   (https://content.berkeley.edu/)
    This service is provided by IS&T for a fee.   It lets authorized users create, manage and build collaborative web sites and make them available to other users of CalShare.   It is UCB's implementation of Microsoft's SharePoint Technologies.
    Research Hub   (https://content.berkeley.edu/)
    Research Hub provides tools for content management, collaboration, managing research data and sharing documents.
    Pantheon   (https://content.berkeley.edu/)
    Anyone with a CalNet ID can build a WEB site using Drupal (free for test sites, $25/month or more for production sites).
    Berkeley Open Academy   (https://content.berkeley.edu/)
    Site for campus departments to quickly build and maintain a polished academic website, with CalNet ID (CAS) authentication, integration with the UC Berkeley events calendar, and a starter theme developed by University Relations. Based on the Pantheon Drupal environment.
    CalWeb, WebFarm, AppFarm   (http://ist.berkeley.edu/services/catalog/web)
    These services are provided by IS&T for a fee.   There are several levels of WEB hosting services on UNIX and Windows servers that campus users can select.



    HTML Code Sample: index.html for a Class WEB Page

    This is a simple WEB page that can be edited with a text editor such as "vi" or "emacs". This file redirects to the current WEB page for the class.

      <HTML>
      <HEAD>
      <META HTTP-EQUIV="X-instructional-class-redirect" CONTENT="CS123">
      <TITLE>CS123 Home Page</TITLE>
      <META HTTP-EQUIV="Refresh" CONTENT="0;URL=./sp05">
      </HEAD>
      <BODY>
      <CENTER>
      <A HREF="http://www.berkeley.edu/">University of California at Berkeley</A> <BR>
      <A HREF="http://www.eecs.berkeley.edu/">Dept of Electrical Engineering &amp; Computer Sciences</A> <BR>
      <H1>CS123<BR></H1>
      </CENTER>
      <P style="line-height:1.5">
      This page should jump to the current WEB page for this course. &nbsp;
      If not, please visit <a href="./archives.html">the WEB site archive list</a>.
      </P>
      <P style="line-height:1.5">
      For information regarding this course:
      <A HREF="http://schedule.berkeley.edu/">Course Catalog and Schedule of Classes</A>
      </P>
      </BODY>
      </HTML>

    Set the file permissions with these UNIX commands:

      % chmod 711 ~/				# your top level home directory
      % chmod 711 ~/public_html
      % chmod 755 ~/public_html/index.html
    

    The timer before jumping is set by the "0" in this line:

      <meta http-equiv="Refresh" content="0;URL=./sp05">

    The value is the delay in seconds. If you set it to 0, the jump is immediate.


    HTML Code Sample: archives.html for a Class WEB Page

    This file is maintained automatically by a script. It has a list of previous WEB sites for the class.

      <HTML>
      <HEAD>
      <META HTTP-EQUIV="X-instructional-class-archives" CONTENT="CS123">
      <TITLE>CS123 Home Page</TITLE>
      </HEAD>
      <BODY>
      <CENTER>
      <A HREF="http://www.berkeley.edu/">University of California at Berkeley</A> <BR>
      <A HREF="http://www.eecs.berkeley.edu/">Dept of Electrical Engineering &amp; Computer Sciences</A> <BR>
      <H1>CS123<BR></H1>
      </CENTER>
      <P style="line-height:1.5">
      Prior semester archives:
      <!-- A SCRIPT WILL INSERT NEW ENTRIES HERE -->
      [<A HREF="sp05/">Spring 2005</A>]
      </P>
      <P style="line-height:1.5">
      For information regarding this course:
      <A HREF="http://schedule.berkeley.edu/">Course Catalog and Schedule of Classes</A>
      </P>
      </BODY>
      </HTML>

    Set the file permissions with this UNIX command:

      % chmod 755 ~/public_html/archives.html
    


    HTML Code Sample: a Simple Homepage

    You can create a simple homepage by using a UNIX text editor (such as "vi" or "emacs") to enter this HTML code and save it to a file called "index.html" in the "public_html" subdirectory of your home directory:

      <HTML>
      <HEAD>
      <TITLE>My Home Page</TITLE>
      </HEAD>
      <BODY>
      <CENTER> <H1>Welcome to <I>My</I> Home Page</H1> </CENTER>
      <P style="line-height:1.5">
      Here is some text about me.
      </P>
      </BODY>
      </HTML>

    Set the file permissions with these UNIX commands:
      % chmod 711 ~/				# your top level home directory
      % chmod 711 ~/public_html
      % chmod 755 ~/public_html/index.html
    

    The http://inst.eecs.berkeley.edu WEB server will display that file using the URL http://inst.eecs.berkeley.edu/~yourlogin.

    You can see examples of other people's HTML code by selecting the "Page Source" option that is available in most WEB browsers. Many people use a graphical WEB page editor such as Netscape Composer or Microsoft FrontPage (see Publishing, below), but there is no shame in coding it by hand!


    How to "Publish" using an HTML Editor

    Your WEB site files are under the "public_html" directory in your UNIX home directory. Your file called "public_html/index.html" is your default home page on our WWW server (http://inst.eecs.berkeley.edu).

    There are 3 ways to create and update your WEB pages:   edit on UNIX,   edit on Windows   and   file transfer.

    1. Edit on UNIX:
      Login directly to an Instructional UNIX system (such as "cory.eecs.berkeley.edu") and edit the HTML files in public_html with a text editor (such as 'emacs' or 'vi').

      You can view your changes via a WEB browser.


    2. Edit on Windows:
      From a Windows system in the EECS Windows domain, you can connect to your UNIX home directory and edit the files as if they were local files on your Windows system. If your UNIX account is called "cs123" you would:
           1) Disconnect from any drives on Mamba if you have them
           2) Open the Start\Run dialog box and type
      	net use X: "\\mamba\cs123" /persistent:no /user:EECS\cs123
      You'll be asked for the cs123 UNIX password. Once the directory window pops up, you'll be able to access the files there on the X: drive, and the "cs123" WEB page would be X:\public_html\index.html. If you are on Windows, this seems like the easiest way to "publish" to the WEB.

      You do need to know which file server your home directory is on. The only choice for Instructional accounts is "mamba". We are running Samba to emulate the Windows filesystem on this UNIX server.


    3. File transfer:
      Another way is to edit the HTML files on your local computer and save them on your local disk. Then use a secure file transfer program to copy them into the public_html directory in your Instructional UNIX account. You can log on to one of our UNIX computers (such as cory.eecs.berkeley.edu) using:

      • rsync on UNIX and MacOSX. "rsync" is generally included in the operating system. We recommend the rsync options "-azH", for example:
            rsync -azH my-WEB-pages cs123@cory.eecs.berkeley.edu:public_html/sp12
        where
            -a   recursive, copy links as sym links, preserve permissions, preserve times, preserve group
            -z   compress file data during transfer
            -H   recreate hard links
      • scp on UNIX and MacOSX. "scp" is generally included in the operating system.
      • WinSCP on Windows. You can download this program from download-ssh.html.
      • Insecure FTP is not allowed: UNIX and MS Windows systems typically have the traditional "ftp" command, which is insecure because it transmits your password in unencrypted, clear-text format.   The Exceed package on MS Windows has a graphical version of this.   Campus security rules prohibit the use of this ftp.   This insecure version of ftp uses network port 21, and campus computers deny the use of that port.



    Debugging Tips

    If you are getting an error message from a WEB page or CGI program that you are displaying via http://inst.eecs.berkeley.edu, you may find clues about the problem by searching for either your login name, the WEB page name or the program name in the Server Access and Error Logs.

    Here are some common error conditions and solutions:

    "Internal Server Error" error

    This is the generic error message that is usually caused by a CGI problem.
    The first thing to verify is that the CGI program is creating valid HTML output:

    Login to a server such as cedar.cs.berkeley.edu (an Ubuntu Linux system like inst.eecs) and run the CGI program on the UNIX command line.   If you get an error, there may be a bug in your CGI source code.

    If you created or copied your file on a Microsoft Windows system, the file may have newlines or other characters that don't work on UNIX. You can convert the Windows file to UNIX format with the UNIX command (for example):

    dos2unix -n windows-file.cgi unix-file.cgi

    Next, you can redirect the output to a file with commands such as these (">!" is csh/tcsh syntax that overwrites an existing file; in bash, use ">|" instead):

    ./unix-file.cgi >! test.html
    chmod 644 test.html

    and then read the "test.html" file as a URL via http://inst.eecs.   If that file fails, then there is probably a bug in your HTML text output.
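
    The CGI rules later in this document require that the first line of the output starts with 'Content-type:' and that the second line is blank. As a rough illustration, that check can be scripted on the command line. The hello.cgi written below is a hypothetical stand-in for your own program, and it is invoked through "sh" only so the sketch also runs where the scratch directory does not allow direct execution.

```shell
#!/bin/sh
# Scratch directory with a stand-in CGI program (hypothetical example).
dir=$(mktemp -d)
cat > "$dir/hello.cgi" <<'EOF'
#!/bin/sh
echo "Content-type: text/html"
echo ""
echo "<html><body>Hello</body></html>"
EOF
chmod 755 "$dir/hello.cgi"

# Run the CGI from the command line and capture its output.
sh "$dir/hello.cgi" > "$dir/test.html"

# Line 1 must start with Content-type: and line 2 must be empty.
# tr removes any carriage returns before the comparison.
line1=$(sed -n '1p' "$dir/test.html" | tr -d '\r')
line2=$(sed -n '2p' "$dir/test.html" | tr -d '\r')
case "$line1" in
    [Cc]ontent-[Tt]ype:*) header_ok=yes ;;
    *) header_ok=no ;;
esac
[ -z "$line2" ] || header_ok=no
echo "header check: $header_ok"
```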

    Finally, if your CGI is in Perl, you can get the WEB server to pass the real error message to the screen from your CGI program by using the "CGI" Perl module.   Put these CGI lines at the start of your Perl program:

    use CGI;
    use CGI::Carp 'fatalsToBrowser'; # echo fatal error messages to browser

    See http://search.cpan.org/author/JHI/perl-5.8.0/lib/CGI.pm for documentation about the Perl CGI module.

    "Premature end of script headers:" error

    If the CGI output is good, then the problem is usually caused by file permissions or ownership.

    Be sure that the permissions on your CGI program and all directories above it are set with

    chmod go+rx,go-w file_name
    chmod go+x,go-w directory_name

    That is, the file is readable and executable by the group and all other users but writable *only* by the owner.   The restriction that it can't be group or world writable is a security feature of the Apache server.   See Restricting Access to your WEB Site if you would like to set more restrictive permissions.

    Also be sure that the owner of the HTML file or CGI program is the same as the owner of the WEB site.   For example, in the URL http://inst.eecs.berkeley.edu/~jdoe/test.html, the file "test.html" must be owned by user "jdoe".   This restriction is also a security feature of the Apache server.
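
    One way to look for files that break the "not group or world writable" rule is with 'find'. The sketch below builds a scratch site containing one deliberately group-writable file (bad.cgi, a made-up name), lists the offenders and then repairs them. For a real site, you would point it at your own public_html directory.

```shell
#!/bin/sh
# Scratch site with one deliberately group-writable file.
site=$(mktemp -d)/public_html
mkdir -p "$site/sp05"
touch "$site/sp05/good.cgi" "$site/sp05/bad.cgi"
chmod 755 "$site" "$site/sp05" "$site/sp05/good.cgi"
chmod 775 "$site/sp05/bad.cgi"

# List anything writable by group (-perm -020) or by others (-perm -002).
writable=$(find "$site" \( -perm -020 -o -perm -002 \) -print)
echo "group/world-writable: $writable"

# Remove the group and world write bits from whatever was found.
find "$site" \( -perm -020 -o -perm -002 \) -exec chmod go-w {} +
```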


    Updating your .htaccess files

    The inst.eecs.berkeley.edu WEB server was updated in January 2011 with a new version of the "modauth" module, which handles access control via .htaccess files. As a result, you may need to update your .htaccess files.

    The new version of Apache (2.2.17) changes some of the directives that are used in .htaccess files that control access to WEB sites. The changes are that the "AuthBasicProvider" line should be added and the "AuthDBMAuthoritative" line should be removed.

    Here is a typical updated .htaccess file:

    	SSLRequireSSL
    	AuthName "An authorized account is required..."
    	AuthType Basic
    	AuthBasicProvider dbm file		
    	AuthDBMType GDBM
    	AuthDBMUserFile  /pool/www/data/master-access
    	AuthDBMGroupFile /pool/www/data/master-access
    	#AuthDBMAuthoritative off			# this line is obsolete
    	AuthUserFile  /home/ff/cs123/public_html/login/SSL/users
    	AuthGroupFile /home/ff/cs123/public_html/login/SSL/groups
    	Require group allow
    

    The *UserFile and *GroupFile lines define what sources will be searched to authenticate the users.

    The "master-access" DBM file contains the users and groups from the Instructional UNIX systems.

    The files "users" and "groups" are files that you can create with login/password matches of your own invention. You can locate these files anywhere under your own public_html directory. Include the full path to them after the AuthUserFile and AuthGroupFile directives.

    The Require line defines which users within those sources will be accepted. In the Require line:

    	"valid" = all UNIX accounts (taken from the dbm password service)
    	"allow" = a group of users that may be listed in the "groups" file
    

    Please see http://inst.eecs.berkeley.edu/setup.html#restrict for more information about using .htaccess files.

    If you find an error on one of your WEB pages, please send email to inst@eecs.berkeley.edu with the URL of that page and a description of the content that is incorrect. Thank you.


    Features of this WWW Server

    Home pages:

    Any user with an account on the EECS Instructional UNIX systems can display WWW documents through the EECS Instructional WWW server. See "HTML Code Sample: a Simple Homepage" in this document for the basic steps in creating a default home page in HTML.

    Server-side "includes" (SSI):

    The HTML "include" directive causes other files to be read and executed by the WEB server when it reads a *.html file. We have enabled this feature on the EECS Instructional WWW server.

    To enable the "include" directive, the html file must have world-executable permissions. The UNIX command "chmod 755 *.html" will set those permissions on all files ending in "html" in the current directory. The UNIX command "/share/b/bin/fix-html" (on the Instructional systems) will update your entire Instructional WEB site with these permissions.

    For example, if you have the files "index.html", "header.html" and "hello.cgi" in your public_html directory and you wish to include the html code from "header.html" and the output of "hello.cgi" in your "index.html" file, enter these lines in "index.html":

    <!--#include virtual="header.html"-->
    <!--#exec cgi="hello.cgi"-->

    and make "index.html" (and "hello.cgi") executable with the command:

      % chmod 755 ~/public_html/index.html ~/public_html/hello.cgi
      

    PHP:

    PHP is an Open Source general-purpose scripting language that is especially suited for Web development and database access.

    PHP commands can be run in 2 ways through the http://inst.eecs.berkeley.edu server:

    1. Embedded into HTML:   PHP commands embedded in an HTML document use the PHP Apache module and run with the permissions of the WEB server, which is the unprivileged 'nobody' account. In this method, the filename must end in ".php", and HTML "include" directives do not work. The embedded PHP commands only have permission to perform operations that can be done by 'nobody' (such as when reading or writing files).

    2. In a CGI:   CGI programs are run by the WEB server 'suexec' module, which causes them to run with the permission of the owner of the WEB site. In this method, the filename must end in ".cgi", must be world-executable and starts with the line
      	#!/usr/bin/php 
      It will invoke the 'suexec' module, and the commands in the CGI program will have permission to perform any operations that you are allowed to do (such as reading and writing files that are only accessible by you). For an example, run: http://inst.eecs.berkeley.edu/~inst/php-suexec.cgi

      You cannot login directly to the inst.eecs WEB server, but you can test your PHP programs on one of the Instructional CentOS login servers.

      To see the options that are configured into the local PHP program, see /usr/local/lib/php.ini.

    The MySQL Functions (http://www.php.net) are installed on both of these implementations of PHP. These include
    	"MySQL Functions" (includes mysql_connect, mysql_query, etc)
    	"MySQL Functions (PDO_MYSQL)" (for MySQL v4.1.3 and above)
    	"MySQL Improved Extensions" (for MySQL v4.1.3 and above)

    Only the CGI method has permission to write a file into your home directory.

    Note that there is a problem with mixing the .php and .cgi methods indiscriminately. Session variables created by one method cannot be referenced by the other. This is because the /var/tmp/sess_... file created for a session variable in a .php script has a different owner from the one created by a .cgi script.   [thanks to Prof Hilfinger for this]

    GD:

    GD is an ANSI C graphics library for the dynamic creation of images. GD creates PNG and JPEG images, among other formats. GD does not create GIF images. GD allows a program written in C, PHP, Perl, Tcl and other languages to quickly draw images complete with lines, arcs, text, multiple colors, cut and paste from other images, and flood fills, and write out the result as a PNG or JPEG file. The output files can be used in WEB pages.

    For basic instructions, see http://www.boutell.com/gd/manual2.0.11.html#basics
    For more information, see http://www.boutell.com/gd/.

    CGI scripts:

    Users may run their own CGI programs through the inst.eecs.berkeley.edu server. SSI (server-side includes) and "exec cgi" are enabled. Here are the rules that a CGI program must follow on our server:

    1. The program name must end in ".cgi".
    2. It can be located anywhere under your public_html/ directory.
    3. Its user and group ownership must match that of the owner of the WEB site (you).
    4. The permissions on the program and all the directories above the program must be world-executable (for a way around that, see Restricting Access to your WEB Site below).
    5. It must generate proper HTML output, in which the first line starts with 'Content-type:' and the second line is blank.
    6. It must run on the WEB server inst.eecs.berkeley.edu, which runs the Ubuntu Linux operating system. See /share/b/bin/clients for a list of servers that you can login to that have the same operating system.

    For CGI scripts, be sure the command on the first line exists on the WEB server. These are the most likely choices:

    /bin/csh
    /usr/bin/perl
    /usr/bin/python
    /usr/bin/python3

    Scripts in those languages will generally run the same on the different UNIX operating systems, but compiled programs (such as in C++) will not.
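
    The rules above lend themselves to a quick pre-flight check. The following is a minimal sketch in Python3, not a tool provided on the Instructional systems; the function name check_cgi is hypothetical. It tests rule 1, rule 3 (user ownership only) and rule 4 (the file itself, not the directories above it), and it checks that the interpreter named on the first line exists on the computer where the check is run, which is assumed to run the same operating system as the WEB server.

```python
#!/usr/bin/python3
"""Pre-flight check of a CGI file against some of the rules above (a sketch)."""
import os
import stat
import sys


def check_cgi(path):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    # Rule 1: the program name must end in ".cgi".
    if not path.endswith(".cgi"):
        problems.append("name does not end in .cgi")
    info = os.stat(path)
    # Rule 3 (user part only): the file must be owned by the person running the check.
    if info.st_uid != os.getuid():
        problems.append("file is not owned by you")
    # Rule 4 (file part only): the program must be world-executable.
    if not info.st_mode & stat.S_IXOTH:
        problems.append("file is not world-executable")
    # The command on the first line must exist (checked on this computer).
    with open(path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", "replace").strip()
    if first_line.startswith("#!"):
        parts = first_line[2:].split()
        if not parts or not os.path.exists(parts[0]):
            problems.append("interpreter on the first line was not found here")
    return problems


if __name__ == "__main__":
    for name in sys.argv[1:]:
        for problem in check_cgi(name):
            print(name + ": " + problem)
```

    Run it with the names of the CGI files as arguments; no output means that none of the checked rules was violated.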

    CGI example:
    Here are examples of simple CGI scripts called hello.cgi, written in the bash shell, Perl and Python3.   In each case, end the filename with a ".cgi" extension (not .sh, .pl or .py) so the WEB server will know it's CGI:

    bash:

    #!/bin/bash
    echo "Content-type: text/html"
    echo ""
    echo "<HTML>"
    echo "Hello World."
    echo "</HTML>"

    Perl:

    #!/usr/bin/perl
    print "Content-type: text/html\n";
    print "\n";
    print "<HTML>";
    print "Hello World.";
    print "</HTML>";

    Python3:

    #!/usr/bin/python3
    print("Content-type: text/html")
    print("")
    print("<HTML>")
    print("Hello World.")
    print("</HTML>")

    Here are the UNIX commands to enable this script, located in the public_html/ directory of the user "jdoe":

      % cd ~jdoe/public_html
      % chmod 755 hello.cgi
      % chown jdoe hello.cgi
      % ls -al hello.cgi
      -rwxr-xr-x   1 jdoe users   7682 Dec  1 10:10 hello.cgi
    

    The URL to reach this CGI would be: http://inst.eecs.berkeley.edu/~jdoe/hello.cgi

    Also, you can execute that CGI program from within an html file by inserting the line:

    <!--#exec cgi="hello.cgi"-->
    

    Processing forms with CGI scripts:
    You can display a form on your WEB site and pass the user's data to a CGI program.   Here is an example of an HTML form and CGI program.
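
    As an illustration, here is a minimal sketch of a form-processing CGI in Python3. It assumes a form that uses METHOD="GET" and has a single text field called "name"; the field name and the function name build_response are hypothetical. With a GET form, the WEB server passes the submitted data to the CGI program in the QUERY_STRING environment variable.

```python
#!/usr/bin/python3
"""Greet the person named in a submitted form (a sketch)."""
import html
import os
import sys
from urllib.parse import parse_qs


def build_response(query_string):
    """Build the full CGI response: header line, blank line, HTML body."""
    fields = parse_qs(query_string)
    # parse_qs maps each field name to a list of values; use the first one.
    name = fields.get("name", ["stranger"])[0]
    # Escape the user's text so it is displayed rather than interpreted as HTML.
    body = "<HTML>Hello, " + html.escape(name) + ".</HTML>"
    return "Content-type: text/html\n\n" + body + "\n"


if __name__ == "__main__":
    sys.stdout.write(build_response(os.environ.get("QUERY_STRING", "")))
```

    The response is built in a function so that it can be tested on a login server without going through the WEB server.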

    Security with CGI scripts:

  • As a security precaution, Apache denies access to any directories or executable WEB files that are group-writable or world-writable (see http://inst.eecs.berkeley.edu/~inst/php-suexec.cgi).
  • The script will only have access to files that the owner can access, and the owner must take care that there are no security risks in the script.
  • A user's CGI program may not violate any of the rules of usage, such as by allowing other users to run programs or access files on the Instructional computers.
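
    Because group-writable and world-writable files and directories are refused, it can help to scan a WEB directory for them. The following Python3 sketch does that; the function name find_writable is hypothetical, and the check covers only the permission bits mentioned above.

```python
#!/usr/bin/python3
"""List group-writable or world-writable paths under a directory (a sketch)."""
import os
import stat
import sys


def find_writable(top):
    """Return the paths under top (including top) that are group- or world-writable."""
    candidates = [top]
    for dirpath, dirnames, filenames in os.walk(top):
        candidates.extend(os.path.join(dirpath, n) for n in dirnames + filenames)
    flagged = []
    for path in candidates:
        mode = os.lstat(path).st_mode
        if stat.S_ISLNK(mode):
            continue  # permission bits on a symbolic link are not meaningful
        if mode & (stat.S_IWGRP | stat.S_IWOTH):
            flagged.append(path)
    return sorted(flagged)


if __name__ == "__main__" and len(sys.argv) > 1:
    for flagged_path in find_writable(sys.argv[1]):
        print(flagged_path)
```

    Each path that is printed can be corrected with a command such as "chmod go-w" followed by the path.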

    Debugging CGI scripts:
    You cannot login directly to the inst.eecs.berkeley.edu WEB server, so if you need to debug a problem with a CGI program:

  • Look in the error log files (http://inst.eecs.berkeley.edu/logs) for references to your URL.
  • Test the program on a computer that runs the same operating system as the WEB server. inst.eecs.berkeley.edu is running Ubuntu Linux. See /share/b/bin/clients for a list of servers that you can login to that have the same operating system. If your program can function as a stand-alone program, then it should also function through the UNIX WEB server. It has to produce valid HTML output, of course, in order for the WEB server to interpret it.

    If you are writing your scripts in Perl, please use /usr/bin/perl, so it is the same version that you are using where you are testing it.
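
    One way to carry out the stand-alone test is to run the CGI program on a login server and inspect the first two lines of its output. The following Python3 sketch does that; the function names are hypothetical, and it checks only the 'Content-type:' line and the blank line that must follow it, not the HTML itself.

```python
#!/usr/bin/python3
"""Run a CGI program stand-alone and check the start of its output (a sketch)."""
import subprocess
import sys


def check_cgi_output(output):
    """Return a list of problems with the first two lines of CGI output."""
    lines = output.splitlines()
    problems = []
    if not lines or not lines[0].lower().startswith("content-type:"):
        problems.append("first line does not start with 'Content-type:'")
    if len(lines) < 2 or lines[1].strip() != "":
        problems.append("second line is not blank")
    return problems


def run_cgi(path):
    """Run the program at path and return what it wrote to standard output."""
    result = subprocess.run([path], capture_output=True, text=True, timeout=30)
    return result.stdout


if __name__ == "__main__" and len(sys.argv) > 1:
    for problem in check_cgi_output(run_cgi(sys.argv[1])):
        print(problem)
```

    Give the path to the CGI program as the argument, for example ./hello.cgi; no output means that both lines look correct.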

    The CGI program will run with the permissions of the owner of the account through which it is accessed. All the files that the WEB server reads or runs must be world-readable or world-executable, since the WEB server runs as a generic unprivileged user. For a way to prevent local users from reading your WEB files, see Restricting Access to your WEB Site below.

    There are security risks in running CGI scripts. For example, there was a security advisory for a guestbook CGI script about a hole that will allow anyone to run any command in your account as you. (You can prevent that by not allowing people to enter HTML messages, by turning off $allow_html in the script.)
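
    The same precaution applies to CGI programs that you write yourself: text typed by a visitor should be escaped before it is written into a page. Here is a minimal sketch in Python3, using the standard html.escape function; the function safe_entry and the entry layout are hypothetical and are not taken from the guestbook script in the advisory.

```python
import html


def safe_entry(author, message):
    """Format one guestbook entry with all visitor-supplied text escaped."""
    # html.escape turns <, >, & and quotes into harmless character entities.
    return "<P><B>" + html.escape(author) + "</B>: " + html.escape(message) + "</P>"
```

    With this in place, a message such as "<script>" is displayed as text instead of being interpreted by the browser or by the WEB server.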

    Symbolic links:

    This server will only follow symbolic links to UNIX files that are owned by the owner of the sym link.
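
    To check a symbolic link before relying on it, you can compare the owner of the link with the owner of the file it points to. This Python3 sketch is only an approximation of the server's test, and the function name is hypothetical.

```python
import os


def link_would_be_followed(link_path):
    """True if link_path is a symbolic link whose target has the same owner as the link."""
    if not os.path.islink(link_path):
        return False
    try:
        target = os.stat(link_path)  # os.stat follows the link to its target
    except OSError:
        return False  # the link points to nothing
    # os.lstat describes the link itself rather than its target.
    return os.lstat(link_path).st_uid == target.st_uid
```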

    Redirecting WEB pages:

    You can cause your WEB site to automatically redirect to a different site in two ways:

    1. Use a META refresh command in your index.html file, for example:
        <HTML>
        <META HTTP-EQUIV="Refresh" CONTENT="5;url=https://inst.eecs.berkeley.edu/~inst/SSLonly/index.html">
        This site will jump to a new site in 5 seconds.
        </HTML>
      A benefit of this method is to display a timed message, warning people to update their bookmarks, etc.

    2. Use a rewrite rule in your .htaccess file, for example:
        RewriteEngine On
        RewriteBase   /~inst
        RewriteRule   ^(.*)      http://foo.com/~bar/$1      [R=permanent,L]
      This would rewrite any URL such as http://inst.eecs/~inst/somefile.html to be http://foo.com/~bar/somefile.html, regardless of what the "somefile.html" part of the URL is. This means that users can type any URL within your site and get through, which is not true with the META refresh method.

      More info on this is in the Apache docs under mod_rewrite.
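
      To see what the rewrite rule does to a URL, here is a simplified model of it in Python3. It is a sketch of the pattern substitution only, not of mod_rewrite itself, and the function name is hypothetical. It assumes that the rule is matched against the part of the path that follows the RewriteBase.

```python
import re

REWRITE_BASE = "/~inst"
PATTERN = re.compile(r"^(.*)")
TARGET = "http://foo.com/~bar/"


def redirect_target(url_path):
    """Return the URL that a path under REWRITE_BASE is redirected to, or None."""
    prefix = REWRITE_BASE + "/"
    if not url_path.startswith(prefix):
        return None  # the rule applies only to this WEB site
    relative = url_path[len(prefix):]
    match = PATTERN.match(relative)
    # group(1) plays the role of $1 in the RewriteRule substitution.
    return TARGET + match.group(1)
```

      For example, redirect_target("/~inst/somefile.html") returns "http://foo.com/~bar/somefile.html".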


    Restricting Access to your WEB Site

    You can restrict the access to files in your WEB site:
    1. by computer    (list the authorized computers in .htaccess)
    2. by user    (list the users and passwords in .htpasswd)
    3. by file permission    (use a CGI program to access files that are not world-readable)
    4. by SSL    (use the SSLRequireSSL directive in .htaccess; can also use .htpasswd)
    These methods are described below.

    For more information about adding access control to individual subdirectories, please see http://inst.eecs.berkeley.edu/manual/howto/auth.html and
    http://inst.eecs.berkeley.edu/manual/howto/htaccess.html

    Allow access only to certain computers:

    To allow access only to certain computers, create the file .htaccess in the desired directory under your ~/public_html directory. Access to all files in that directory will be controlled by the .htaccess file.

    Here is an example of UNIX commands to control access by computer to all the files in a directory called "restricted". Access is restricted to the CS and EECS subnets and a single computer on the HIP subnet (136.152.91). A computer called "transcend.cs" is also excluded.

    1. Create the subdirectory:
      mkdir ~/public_html/restricted

    2. Create the .htaccess file:
      cat  > ~/public_html/restricted/.htaccess << EOF
      <Limit GET>
      order allow,deny
      allow from cs.berkeley.edu eecs.berkeley.edu 136.152.91.1
      deny from transcend.cs.berkeley.edu
      </Limit>
      EOF

    3. Set permissions and check the results:
      chmod ugo=x,u+rw ~/public_html/restricted
      chmod ugo=r,u+rw ~/public_html/restricted/.htaccess

      cd ~/public_html
      ls -lad restricted
      drwx--x--x 2 mylogin mygroup 5120 Feb 13 17:06 restricted

      cd ~/public_html/restricted
      ls -la .htaccess
      -rw-r--r-- 2 mylogin mygroup 5120 Feb 13 17:06 .htaccess

    Access to all files in the "restricted" directory will be limited to the entries in .htaccess. Files in the directory should be readable by everyone on the local computer. For example, for a file called "private.html" in the ~/public_html/restricted directory, set the permissions using:

    chmod ugo=rx,u+w ~/public_html/restricted/private.html
    
    Access will be restricted to browsers being run on the "cs.berkeley.edu" and "eecs.berkeley.edu" subnets and to the computer at address 136.152.91.1.

    Note that, to allow the Web server to read your files, the files in ~/public_html/restricted will be readable by anyone on any computer that can access your home directory. This is true for all of your WWW-accessible files.
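
    The effect of "order allow,deny" can be modelled in a few lines of Python3. This is a simplified sketch, not the server's implementation: a request is accepted only if the client matches at least one "allow" entry and no "deny" entry. The function names are hypothetical, and the matching rules (whole name components for domain names, whole octets for IP addresses) are stated as assumptions.

```python
def host_matches(spec, hostname, address):
    """True if spec (a domain name or an IP address or prefix) matches the client."""
    if spec[0].isdigit():
        # Assumption: an IP spec matches the full address or a prefix of whole octets.
        return address == spec or address.startswith(spec.rstrip(".") + ".")
    # Assumption: a domain spec matches the name itself or any host below it.
    return hostname == spec or hostname.endswith("." + spec)


def is_allowed(hostname, address, allow, deny):
    """Model 'order allow,deny': an allow entry must match and no deny entry may match."""
    allowed = any(host_matches(spec, hostname, address) for spec in allow)
    denied = any(host_matches(spec, hostname, address) for spec in deny)
    return allowed and not denied


ALLOW = ["cs.berkeley.edu", "eecs.berkeley.edu", "136.152.91.1"]
DENY = ["transcend.cs.berkeley.edu"]
```

    With these lists, a host in eecs.berkeley.edu is accepted, transcend.cs.berkeley.edu is refused even though it is in an allowed domain, and a computer outside both domains is accepted only if its address is 136.152.91.1.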

    Allow access only to certain people:

    You can add password protection to a WEB site by creating a file called .htpasswd in the subdirectory that contains the WEB page (under your ~/public_html directory).

    Here is an example of UNIX commands to set up access controlled by password to all the files in a directory called "restricted".

    1. Create the subdirectory:
      mkdir ~/public_html/restricted

    2. Create an encrypted password:
      /share/b/bin/passwd2crypt

    3. Create the .htpasswd file:
      cat  > ~/public_html/restricted/.htpasswd << EOF
      user1:{encrypted_passwd}
      user2:{encrypted_passwd}
      EOF

    4. Create the .htaccess file:
      cat  > ~/public_html/restricted/.htaccess << EOF
      AuthType Basic
      AuthName "My Restricted WEB site"
      AuthUserFile /{full path to your home dir}/public_html/restricted/.htpasswd
      Require valid-user
      EOF

    5. Set permissions and check the results:
      chmod ugo=x,u+rw ~/public_html/restricted
      chmod ugo=r,u+rw ~/public_html/restricted/.htaccess
      chmod ugo=r,u+rw ~/public_html/restricted/.htpasswd

      cd ~/public_html
      ls -lad restricted
      drwx--x--x 2 mylogin mygroup 5120 Feb 13 17:06 restricted

      cd ~/public_html/restricted
      ls -la .htaccess .htpasswd
      -rw-r--r-- 2 mylogin mygroup 5120 Feb 13 17:06 .htaccess
      -rw-r--r-- 2 mylogin mygroup 5120 Feb 13 17:06 .htpasswd

    The {encrypted_passwd} can be generated using the program /share/b/bin/passwd2crypt (on the Instructional UNIX systems) or our .htpasswd File Generator.

    WEB browser users will be prompted for a password if they access the directory, and only the users listed in .htpasswd will be able to read any of the files in the directory.

    Note that, to allow the Web server to read your files, the files in ~/public_html/restricted will be readable by anyone on any computer that can access your home directory. This is true for all of your WWW-accessible files. For a way around this, see below, Using a CGI script to restrict access by UNIX file permissions.
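
    The AuthUserFile line needs the full path to the .htpasswd file, which is easy to mistype. As a convenience, here is a Python3 sketch that prints the .htaccess text with the path filled in for the "restricted" directory under your own home directory. The function name build_htaccess is hypothetical, and the output should be reviewed before it is saved as .htaccess.

```python
#!/usr/bin/python3
"""Print .htaccess text for password protection, with the full path filled in (a sketch)."""
import os


def build_htaccess(directory, realm):
    """Return .htaccess text whose AuthUserFile is an absolute path."""
    passwd_file = os.path.join(os.path.abspath(directory), ".htpasswd")
    return "\n".join([
        "AuthType Basic",
        'AuthName "' + realm + '"',
        "AuthUserFile " + passwd_file,
        "Require valid-user",
        "",
    ])


if __name__ == "__main__":
    restricted = os.path.expanduser("~/public_html/restricted")
    print(build_htaccess(restricted, "My Restricted WEB site"), end="")
```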

    Limitations of using .htaccess and .htpasswd files:

  • .htaccess and .htpasswd must be world readable, or the WEB server will not be able to read them and will restrict the directory to everyone. The Instructional WEB server runs as an unprivileged user.

  • Your ~/public_html directory and all files under it must be world-readable for the Instructional WEB server to be able to read them. This means that files that you restrict for access via the WEB may still be readable by everyone who can login to the Instructional computers. For a way around this, see below, Using a CGI script to restrict access by UNIX file permissions.

  • If there is an error in your .htaccess or .htpasswd, then access to your webpage may be prevented until the error is corrected.

  • .htaccess and .htpasswd restrict the directory they are located in, and all subdirectories under that. This means that if you have a broken .htaccess or .htpasswd file and you move it from your public_html directory to your home directory, your webpage will still be unviewable.

    Using a CGI script to restrict access by UNIX file permissions:

  • A CGI script can be used to allow the WEB server to read files that are not world-readable. Our WEB server runs CGI programs with the permissions of the user (i.e. ~user), so the CGI program can read and write to files that are read/writable only by the user.

  • We have developed a CGI program called "restrict-dir.cgi" that allows you to display a Login button and give authorized users WEB access to files that are not world-readable, either by users on our shared UNIX login systems or by users via the WEB. Please ask inst@inst.eecs.berkeley.edu for help with it if needed.

  • You can use an HTML redirect command (in a world-readable file) to run a CGI program (read/writable only by the user) that reads and writes to other files (also read/writable only by the user). The CGI program would generate the HTML pages and forms, process any input and display any output desired. Here is an example of an index.html file that redirects to a CGI program in the same subdirectory:
        <HTML>
        <META HTTP-EQUIV="Refresh" CONTENT="0;url=maybePERL.cgi">
        <HEAD>
        <TITLE>Auto-Redirect to My CGI program (maybePERL.cgi)</TITLE>
        </HEAD>
        </HTML>
    Assuming the CGI program wants to display or write to a file called file1.txt, here are the UNIX commands to set the permissions on all these files:
    	chmod 755 index.html	(executable, so WEB server can run 'include's)
    	chmod 700 maybePERL.cgi	(only readable/executable by the owner)
    	chmod 600 file1.txt	(only readable/writeable by the owner)
    
    This will allow users to read the files through your WEB site, and you can limit them by prompting for a password from your CGI program. But users who are logged in directly onto our UNIX computers (such as cory.eecs) will not be able to read the files.
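
    As an illustration of the CGI side of this arrangement, here is a minimal sketch in Python3 of a program that displays file1.txt. It is not the restrict-dir.cgi program mentioned above, and it does not prompt for a password. It assumes that the WEB server runs the CGI program from the directory that contains it, so that file1.txt can be opened by name; the function name render_file is hypothetical.

```python
#!/usr/bin/python3
"""Display a file that only its owner can read (a sketch)."""
import html
import os
import sys


def render_file(path):
    """Return a full CGI response that shows the text of the file at path."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        text = handle.read()
    # Escape the text so that any < or & in the file is shown literally.
    body = "<HTML><PRE>" + html.escape(text) + "</PRE></HTML>"
    return "Content-type: text/html\n\n" + body + "\n"


if __name__ == "__main__":
    if os.path.exists("file1.txt"):
        sys.stdout.write(render_file("file1.txt"))
    else:
        sys.stdout.write("Content-type: text/html\n\n<HTML>file1.txt was not found.</HTML>\n")
```

    Because the CGI program runs with the permissions of its owner, it can open file1.txt even though the file has mode 600.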

    Security using SSL:

  • When using .htpasswd, passwords typed over the network are in clear-text and are vulnerable to password sniffing over the net. We have installed a WEB server that uses SSL encryption and authentication, and this protects the passwords and other data that is transferred between the server and your browser.

    The server is https://inst.eecs.berkeley.edu.

  • SSL is a method of software encryption that protects data transferred over the network between your WEB browser and a WEB server. Not all WEB browsers and servers are enabled for SSL. You will know that your browser is connected securely via SSL if the little picture of the padlock on the frame of your browser turns yellow and shows a locked padlock.

  • You can ensure that a WEB page of yours is accessed only through our SSL-enabled server by adding this on a line in .htaccess:
    	SSLRequireSSL 
    Users who try to access any files in that directory through one of our servers that is not SSL-enabled will get an "access denied" error.

  • Finally, the best security is to require access using a password that is safely encrypted through our SSL-enabled server. Here is a sample .htaccess file for that:
    	AuthType Basic
    	AuthName "access is restricted to users on my list using SSL"
    	AuthUserFile  /home/aa/staff/inst/authorized-web-users
    	Require valid-user
    	SSLRequireSSL 
    The AuthUserFile needs to be readable by the Apache WEB server. The last 2 lines make it ask for a password and require an SSL browser. Reference:
    https://httpd.apache.org/docs/2.4/howto/htaccess.html.
    Instructions for creating the .htpasswd are in the Allow access only to certain people section, above.


    Usage Policies for Information Servers

    "Informed Consent" Required for Displaying Student Identities

    Information that you display publicly via University computers may not include the name of a student without "informed consent" from the student. Restricting access to WEB pages, say to the EECS or BERKELEY.EDU domains, is not sufficient: informed consent is still required. This is a requirement of federal law. An example of "informed consent" is:
    "I, (student name), consent to have my name posted on (webpage title & url), a paper copy template of which is attached to this consent form. My name may be posted on this webpage from (date) to (date). I understand that my consent to have my name posted on this webpage is not a condition of my participation in (name of the class), nor will it be used as a basis for grading my performance therein."

    Please refer to the Policy Analysts at the Office of the Registrar, 127 Sproul Hall, for further clarification about the requirement for "informed consent".

    Other topics:

  • no obscenity or offensive remarks of a racial or sexual nature
  • no advertisements of commercial products or services


    General References about WWW utilities

    These are public documents that have more about the WWW and the HTML language used in writing home pages (these may not always be available):
  • Web Design with Style, Ease, and CSS
  • about the World Wide Web
  • UNICODE character charts
  • UNICODE character converter
  • The CGI Resource Index
  • Security Issues in Perl Scripts   by Jordan Dimov
  • Javascript Tutorials
  • CSS Tutorial (w3schools)
  • cheat sheet (Vistaprint)
  • CSS: Button Stylesheet Wizard
  • HTML Hexadecimal Color Chart
  • W3C Guide to Style sheets   [Tips & Tricks]
  • CSS Tutorials (engineeringdegree.net)  


  • Last modified:
    inst@eecs.berkeley.edu