Using Git in CS164

Git is a distributed version-control system that has become increasingly popular in the open-source community. Developers within a team (or in our case, a class) each work on separate repositories, and may from time to time synchronize all or part of the contents of their repository with one or more other repositories. In fact, there need be no central repository at all.

This document describes a minimal set of commands for using Git in this course to submit assignments and acquire skeleton files. It is not a tutorial or introduction to Git. Consult the Git documentation for an overview of Git and details of its various commands.

Preliminaries

If you are working from your own machine, be sure to install Git, if it is not already installed. There are downloads available for various systems on this site.

Next, install the appropriate ssh private key for access to our repositories, which live on the instructional machines. First, get an instructional account for CS164 if you don't already have one. There, you'll find a directory ~/.ssh, with files id_rsa and id_rsa.pub, which contain, respectively, the private and public ssh keys used by our repositories. For non-Windows systems, copy id_rsa to your home computer's .ssh directory, giving it a unique name other than id_rsa (for example, cs164_id_rsa). Otherwise, you are liable to overwrite a secret key file of that name that you might have created for your own purposes. You can get ssh to use this key when appropriate by adding a line

IdentityFile ~/.ssh/cs164_id_rsa
to the file .ssh/config (creating that file if it does not already exist).
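
A bare IdentityFile line applies to every host that ssh contacts. One possible refinement is to scope it with a Host block. The following is a minimal sketch, assuming the key was saved as ~/.ssh/cs164_id_rsa and that the course repositories are reached through torus.cs.berkeley.edu (the host used in the clone commands in this document). The scratch directory is purely illustrative, so that an existing ~/.ssh/config is not touched:

```shell
# Sketch only: build the entry in a scratch file, review it, and then
# append it to ~/.ssh/config by hand.
scratch="$(mktemp -d)"
cat > "$scratch/config" <<'EOF'
Host torus.cs.berkeley.edu
    IdentityFile ~/.ssh/cs164_id_rsa
EOF
cat "$scratch/config"

# ssh refuses to use a private key that other users can read, so the
# copied key should be readable by its owner only, for example:
#   chmod 600 ~/.ssh/cs164_id_rsa
```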

In what follows, we'll consider a student named Fred with login cs164-xx belonging to team OurTeam. Having installed Git, Fred first performs some general configuration that will apply to all repositories used from his account (for this course or elsewhere):

    $ git config --global user.name "Fred Student"
    $ git config --global user.email "fred.student@somemail.com"
    $ git config --global push.default simple
The first two lines set the name and email that Git will record in commits and logs. The last line is a safety measure that affects the git push command described later.

Setting up Repositories

Git terminology uses the term repository to mean an organized collection of versions (called commits) of a directory structure, plus a checked-out copy of one of those commits (a working directory), possibly in the process of modification, plus a staging area (called the index) used to build another commit. Usually, the set of commits and the index are stored in a directory named .git at the top level of the working directory. The term bare repository refers to a directory containing only the set of commits (what would be a .git directory in an ordinary repository, but with no index). Typically, we use bare repositories as central copies of versions that will be shared by several repositories.
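
The difference between the two kinds of repository can be seen by creating one of each in a scratch directory. This is only an illustrative sketch: the names work and central.git are made up for the example, and nothing here touches the course repositories.

```shell
set -e
demo="$(mktemp -d)"

git init -q "$demo/work"                 # ordinary repository
git init -q --bare "$demo/central.git"   # bare repository

ls -a "$demo/work"         # working directory, empty except for .git
ls "$demo/central.git"     # HEAD, config, objects, refs, ... at top level

# Ask Git itself which kind each one is.
work_is_bare="$(git -C "$demo/work" rev-parse --is-bare-repository)"
central_is_bare="$(git -C "$demo/central.git" rev-parse --is-bare-repository)"
echo "work bare? $work_is_bare; central bare? $central_is_bare"
```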

Each student and each team in this class has a bare Git repository in which to develop and submit assignments. More specifically, we provide a set of bare repositories under the cs164-ta account, which authorized students may clone, pull from, or push to as desired. You are each authorized to access your own repository and that of your team.

Fred establishes a working directory containing a local copy of his private repository in a directory (let's say ~/cs164-repo) on his home and/or instructional account with the command

    $ git clone cs164-ta@torus.cs.berkeley.edu:users/cs164-xx cs164-repo
This will create Fred's personal bare repository on cs164-ta (if necessary) and copy its contents into the new local working directory cs164-repo as cs164-repo/.git. If there is a head version in that repository (as will happen when Fred creates a second local repository after having committed a few versions), it will be checked out to form the initial contents of the working directory (which is otherwise empty). Fred can use cs164-repo for one-person assignments: generally homework.

There will be various resources that we provide, including skeleton files for projects and assignments. Fred can add a reference to these resources to his repository with the commands

    $ cd ~/cs164-repo
    $ git remote add shared cs164-ta@torus.cs.berkeley.edu:shared 
We'll see how to use this remote reference later.

Once Fred has a team, he can also access his team's repository by setting up another local copy, such as

    $ cd 
    $ git clone cs164-ta@torus.cs.berkeley.edu:teams/OurTeam team-repo
    $ cd team-repo
    $ git remote add shared cs164-ta@torus.cs.berkeley.edu:shared 
Again, Fred adds a link to shared code for later use.

Using Your Repository

Keep each assignment or project, ASSGN, in a subdirectory of that name in your working directory. Typically, we provide an initial set of files for each assignment. Fred can initialize his own assignment directory, say for hw3, like this:

    $ cd ~/cs164-repo
    $ git fetch shared
    $ git checkout -b hw3 shared/hw3
    $ git push -u origin hw3
This fetches the staff's hw3 skeleton files from cs164-ta, then checks out a copy of that hw3 into his local repository as a new Git branch named hw3. Finally, it copies that branch back to his bare repository on cs164-ta and arranges to track it locally. That means that if he works on hw3 from two different local repositories (say from home and on the instructional machines), he can bring his local copy up to date with any changes he's made from some other local repository with the command
    $ git pull --rebase
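
The two-location workflow can be rehearsed entirely on one machine. The sketch below is a simulation under stated assumptions: a local bare repository (central.git) stands in for Fred's repository on cs164-ta, the directories home and lab stand in for his two local repositories, and the staff skeleton is replaced by a single made-up file. The exported identity variables only make the script independent of any global Git configuration.

```shell
set -e
export GIT_AUTHOR_NAME="Fred Student" GIT_AUTHOR_EMAIL="fred.student@somemail.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo="$(mktemp -d)"
git init -q --bare "$demo/central.git"     # stand-in for the cs164-ta copy

# "Home" repository: create the hw3 branch and publish it with -u.
git clone -q "$demo/central.git" "$demo/home"
cd "$demo/home"
git checkout -q -b hw3
echo "draft" > notes.txt
git add notes.txt
git commit -q -m "Begin work on hw3"
git push -q -u origin hw3

# "Lab" repository: a second clone that tracks the same branch.
git clone -q -b hw3 "$demo/central.git" "$demo/lab"

# More work at home, committed and pushed.
echo "more" >> notes.txt
git commit -q -a -m "Extend notes"
git push -q

# Back in the lab: bring the local copy up to date.
cd "$demo/lab"
git pull -q --rebase
lab_head="$(git rev-parse HEAD)"
home_head="$(git -C "$demo/home" rev-parse HEAD)"
echo "lab:  $lab_head"
echo "home: $home_head"
```

After the pull, both local repositories should report the same commit id.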

Work on hw3 now proceeds as a sequence of edits and commits. After editing, adding, and deleting files, Fred first informs Git of any new files that it should start tracking. For example, if when working on hw3, Fred creates files test1.inp and test1.out, he would use the command

    $ git add test1.inp test1.out
(from inside the directory ~/cs164-repo/hw3). Or, if these files are stored in a new subdirectory called hw3/testing, he can use the command
    $ git add testing
Once he adds any new files, he can create a new commit for hw3 with
    $ git commit -a
This will prompt him to write a log entry for the new commit. Descriptive log entries are generally a good idea, especially for complex team projects where team members are trying to keep each other informed of what changes were made and why.
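
The reason git add comes first is that git commit -a picks up changes to tracked files only. A small sketch in a scratch repository shows the effect. The file names are invented for the illustration, -m is used in place of the interactive log prompt, and the exported identity variables only make the script independent of any global configuration.

```shell
set -e
export GIT_AUTHOR_NAME="Fred Student" GIT_AUTHOR_EMAIL="fred.student@somemail.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo="$(mktemp -d)"
git init -q "$demo/work"
cd "$demo/work"

mkdir hw3
echo "first draft" > hw3/solution.txt
git add hw3                               # start tracking hw3/solution.txt
git commit -q -m "Begin work on hw3"

echo "second draft" >> hw3/solution.txt   # change to a tracked file
echo "input" > hw3/test1.inp              # new file, not yet tracked
git commit -q -a -m "Extend solution"     # commits solution.txt only

leftover="$(git status --porcelain)"
echo "$leftover"                          # test1.inp is still untracked

git add hw3/test1.inp                     # now tell Git about it
git commit -q -m "Add test input"
after="$(git status --porcelain)"
```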

Periodically, Fred will want to transmit his work to the personal or team repository on cs164-ta that he cloned his local repository from. This is especially true when he intends to hand it in, share it with other team members, or make further edits from a different local repository. After the initial push for hw3 (the one that had -u in it) the command to do this (for our hw3 example) is just

    $ git push
which, since Fred has used the procedures described in this document for configuration and for creating assignments, will by default push the current branch (e.g., hw3) to the remote repository that it is tracking (his repository on cs164-ta). He can also write it out more explicitly as
    $ git push origin hw3

Don't do this, however, without first committing any outstanding changes. Git's distributed nature means that you can create an arbitrarily long sequence of commits before pushing them. It's not necessary to be connected to the cs164-ta repositories (or indeed, the Internet) to use Git's version-control features.
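
To check how many local commits have not yet been pushed, one option is to compare the branch with its remote-tracking counterpart. The sketch below assumes the hw3 branch and origin remote from the running example, and uses a local bare repository (central.git, a made-up name) in place of cs164-ta so that it can be run offline.

```shell
set -e
export GIT_AUTHOR_NAME="Fred Student" GIT_AUTHOR_EMAIL="fred.student@somemail.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo="$(mktemp -d)"
git init -q --bare "$demo/central.git"    # stand-in for the cs164-ta copy
git clone -q "$demo/central.git" "$demo/work"
cd "$demo/work"
git checkout -q -b hw3
echo "start" > notes.txt
git add notes.txt
git commit -q -m "Begin work on hw3"
git push -q -u origin hw3

# Two commits made without pushing (for instance, while offline).
echo "one" >> notes.txt
git commit -q -a -m "First offline change"
echo "two" >> notes.txt
git commit -q -a -m "Second offline change"

# Count commits on hw3 that origin/hw3 does not have yet.
unpushed="$(git rev-list --count origin/hw3..hw3)"
echo "commits not yet pushed: $unpushed"

git push -q
remaining="$(git rev-list --count origin/hw3..hw3)"
echo "commits not yet pushed after git push: $remaining"
```

git status reports the same information in words ("Your branch is ahead of ...").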

Submitting Your Work

The staff does not immediately see changes to your local repositories. That is, when you modify, add, or delete a file or when you execute git commit, we do not see these changes, since your repository under cs164-ta is not changed. To be seen by us (or your teammates), your commits must be pushed, as described in the preceding section.

Furthermore, we don't treat all your commits, even when pushed, as submissions until you mark them as such. To submit one of your committed versions, create (and subsequently push) an appropriately named tag. For example, when Fred first wants to submit hw3, then after committing any changes in his hw3 directory, he can do this:

    $ git tag hw3-1
Submission is not complete until he pushes the work to us:
    $ git push         # To push the hw3 branch (if not yet done)
    $ git push --tags  # To push hw3-1 (and any other tags)

Subsequent submissions should be named hw3-2, hw3-3, etc. We take the highest-numbered tag as Fred's final submission. He can submit at any time, even when he has many intervening commits. For example, if he has submitted hw3-1 and hw3-2 and decides that the last submission is bogus, and the first one was better, he can execute

    $ git tag hw3-3 hw3-1
which makes hw3-3, the latest submission, a synonym for hw3-1. Alternatively, if the commit he wants to submit was not previously tagged, Fred can find its unique id using git log and then tag that. For example, he might see
 
    $ git log
    commit ff39e11f5e292a0c81f3cb65c2a39c7b301a595a
    Author: Fred Student <fred.student@somemail.com>
    Date:   Tue Jan 27 16:32:17 2015 -0800

        Experimentally refactor my solution to problem 3.

    commit 4f7d9e65744c8b528289746bf911cb81ded7c5e2
    Author: Fred Student <fred.student@somemail.com>
    Date:   Mon Jan 26 15:36:28 2015 -0800

        Add tests.
        No errors detected so far.

    commit 2aea9782d7000bb07277617b9f81bea485374d27
    Author: Fred Student <fred.student@somemail.com>
    Date:   Thu Jan 22 15:34:55 2015 -0800

        Begin work on hw3.
Now to submit the second commit back (from 1/26) as his first submission, he could execute
    $ git tag hw3-1 4f7d9e
(The unique ids in Git are hexadecimal SHA-1 hashcodes of the contents of the commits. You only need to specify a sufficiently long prefix of the hashcode to uniquely identify which commit you mean.)

Again, after adding any new tags, Fred must use git push --tags to push them to the repository that the staff (and autograder) see.

Submission dates and times will be taken from the time of the commit tagged by hw3-n, and not from the time the tag was created.
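
The tagging steps above can be tried out safely in a scratch setup. In the sketch below, a local bare repository (central.git, a made-up name) stands in for Fred's repository on cs164-ta. The sort pipeline for finding the highest-numbered tag assumes the assignment name itself contains no hyphen. Since this document does not say whether the author date or the committer date of the tagged commit is used as the submission time, the last command prints both.

```shell
set -e
export GIT_AUTHOR_NAME="Fred Student" GIT_AUTHOR_EMAIL="fred.student@somemail.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo="$(mktemp -d)"
git init -q --bare "$demo/central.git"    # stand-in for the cs164-ta copy
git clone -q "$demo/central.git" "$demo/work"
cd "$demo/work"
git checkout -q -b hw3

echo "v1" > answer.txt
git add answer.txt
git commit -q -m "First attempt"
git tag hw3-1                             # first submission

echo "v2" > answer.txt
git commit -q -a -m "Second attempt"
git tag hw3-2                             # second submission

git tag hw3-3 hw3-1                       # resubmit the first attempt

git push -q -u origin hw3
git push -q --tags

# Highest-numbered hw3 tag, i.e. the one that counts as the submission.
latest="$(git tag -l 'hw3-*' | sort -t- -k2,2n | tail -n 1)"
echo "latest submission tag: $latest"

# Author date and committer date of the commit that tag refers to.
git log -1 --format='author date: %ai  committer date: %ci' "$latest"
```

Running git ls-remote --tags origin afterwards lists the tags that actually reached the remote repository.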

You can delete a tag locally, but we have set up the repository to prevent you from doing this on cs164-ta's repositories. It shouldn't be necessary in any case, since the autograder will ignore tags that don't refer to known assignments and you can always supersede a tag with a higher-numbered one.

Using a Common Branch

The instructions so far have assumed that you store each assignment in a separate branch. This has the advantage of keeping your working directory uncluttered. Should you want to look at an old assignment, you can do so by checking it out. For example:

    $ git checkout hw1
will switch you to the hw1 branch, so that your working directory will contain just the directory hw1.

However, you may not mind the clutter and might prefer to keep all your assignments checked out so that you can refer to them easily. In that case, you can simply use the standard branch master for everything. Our software doesn't mind as long as each assignment is in its own eponymous directory.

It will work best if you start off by creating the master branch from the standard empty commit supplied in the shared repository. Assuming you have executed the recommended git remote command (see Setting up Repositories) and have run git fetch shared, initialize the branch with these commands:

    $ git checkout -b master Empty
    $ git push -u origin master

To start a new homework assignment from our skeleton, change the initialization instructions given under Using Your Repository to the following (after first making sure to commit any of your current work, of course):

    $ git merge -m "Start hw3" shared/hw3
This will add the subdirectory hw3 to your working directory.
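
The common-branch arrangement can be rehearsed locally. The sketch below is a simulation built on assumptions: it guesses that the shared repository holds a tag named Empty on an empty commit and one branch per assignment descending from it, and it fabricates such a repository (shared.git, with invented README skeleton files) in a scratch directory. The real shared repository may be organized differently.

```shell
set -e
export GIT_AUTHOR_NAME="Fred Student" GIT_AUTHOR_EMAIL="fred.student@somemail.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo="$(mktemp -d)"

# Staff side (assumed layout): an empty commit tagged Empty, plus one
# branch per assignment, each adding a directory of that name.
git init -q "$demo/staff"
cd "$demo/staff"
git commit -q --allow-empty -m "Empty"
git tag Empty
for a in hw1 hw3; do
    git checkout -q -b "$a" Empty
    mkdir "$a"
    echo "skeleton for $a" > "$a/README"
    git add "$a"
    git commit -q -m "Skeleton for $a"
done
git init -q --bare "$demo/shared.git"
git push -q "$demo/shared.git" --all
git push -q "$demo/shared.git" --tags

# Student side: one master branch that accumulates every assignment.
git init -q "$demo/student"
cd "$demo/student"
git remote add shared "$demo/shared.git"
git fetch -q shared
git checkout -q -b master Empty
git merge -q -m "Start hw1" shared/hw1
git merge -q -m "Start hw3" shared/hw3
ls                                        # both hw1 and hw3 are present
```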

Quick Summary

These commands assume you have account cs164-xx and team OurTeam.
  1. To initialize Git on a particular system:
        $ git config --global user.name "Fred Student"
        $ git config --global user.email "fred.student@somemail.com"
        $ git config --global push.default simple      # Suggested
    
  2. To create a local copy of your personal repository in directory cs164-repo and connect it to our shared repository:
        $ git clone cs164-ta@torus.cs.berkeley.edu:users/cs164-xx cs164-repo
        $ cd cs164-repo
        $ git remote add shared cs164-ta@torus.cs.berkeley.edu:shared 
    
  3. To create a local copy of your team's repository in directory team-repo and connect it to our shared repository:
        $ git clone cs164-ta@torus.cs.berkeley.edu:teams/OurTeam team-repo
        $ cd team-repo
        $ git remote add shared cs164-ta@torus.cs.berkeley.edu:shared 
    
  4. To start an assignment named ASSGN (e.g., hw3), from our template and put it in its own branch with that name:
        $ cd cs164-repo        # If not already there
        $ git fetch shared     # Get the latest skeletons from the staff
        $ git checkout -b ASSGN shared/ASSGN
        $ git push -u origin ASSGN
    
  5. To see the current status of a repository, including files that have been added, removed, or modified; files that are in the working directory, but not in the current commit ("untracked"); and discrepancies between the current branch and the remote branch it is tracking (gets pushed to or pulled from):
        $ git status
    
    The message will tell you how to undo changes made since the last commit, should you want to.
  6. To start tracking a file or directory F, so that it will be added to the repository on the next commit:
        $ git add F
    
  7. To commit modifications to all tracked files in the local repository:
        $ git commit -a
    
    This does nothing with untracked files.
  8. To transmit commits on the current branch to the remote (cs164-ta) repository:
        $ git push
    
  9. To fetch new commits for the current branch in the cs164-ta repository that have been pushed from another local directory (commit current work first):
        $ git pull --rebase
    
  10. To submit assignment ASSGN (assuming it is in the current branch):
        $ git tag ASSGN-n
        $ git push
        $ git push --tags
    
    where n is a sequence number larger than those of existing tags.
  11. To see tags that you have created (not necessarily pushed):
        $ git tag