Using Git in CS61B
Author: P. N. Hilfinger

A. Introduction

Git is a distributed version-control system that has become the norm in the open-source software community. Developers within a team (or in our case, a class) each work on separate repositories, and may from time to time synchronize all or part of the contents of their repositories with one or more other repositories. There need be no central repository, in fact.

This document describes a minimal set of commands for using Git in this course to submit assignments and acquire skeleton files. It is not any kind of tutorial or introduction to Git. Consult this brief introduction and this Git documentation for an overview of Git and details of its various commands.

B. Preliminaries

If you are working from your own machine, be sure to install Git, if it is not already installed. If you followed the lab1b setup, you should have Git already installed.

Git terminology uses the term repository to mean an organized collection of versions (called commits) of a directory structure, a checked-out copy of one of those commits (a working directory) possibly in the process of modification, and a staging area (called the index) used to track what goes into the next commit. Usually, the set of commits and the index are stored in a directory named .git at the top level of the working directory. The term bare repository refers to a directory containing only the set of commits (what would be a .git directory in an ordinary repository, but with no index). Typically, we use bare repositories as central copies of versions that will be shared by several repositories.
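To make the terminology concrete, here is a minimal sketch that creates one ordinary repository and one bare repository in a scratch directory and asks Git which is which. The directory names work and central.git are made up for this illustration; they are not names used by the course.

```shell
#!/bin/sh
# Sketch: compare an ordinary repository with a bare one.
# The names "work" and "central.git" are hypothetical.
set -e
scratch=$(mktemp -d)
cd "$scratch"

git init -q work                 # ordinary: a working directory plus work/.git
git init -q --bare central.git   # bare: only the repository data, no working directory

work_is_bare=$(git -C work rev-parse --is-bare-repository)
central_is_bare=$(git -C central.git rev-parse --is-bare-repository)

echo "work:        bare=$work_is_bare"
echo "central.git: bare=$central_is_bare"
```

Listing the two directories afterwards shows that work contains a .git subdirectory, while central.git holds directly the files that would normally live inside .git.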

Each student has a bare central repository that we maintain for you on the instructional servers under the account cs61b-taa. You, the staff, and the autograding software all have access to this repository. In addition, we've set up for you one local repository under your cs61b instructional account in a directory called repo. It is a clone of your central repository, together with a checked-out working directory from one of its commits (generally your latest). You are free to set up other such local clones (say on your personal laptop or home computer). The central repository will serve to keep them all in sync with each other (at least if you follow the instructions here), so that you can work on any of several machines.

C. Setting up a Local Repository

Whether you plan to use your own computer or a lab computer, you must complete this section. This part sets up a local repository on your instructional account. Open up the terminal on the lab computer (or ssh in from your personal computer) and type:

  init-git-repo

This invokes a script that will create a clone of a central repository for your work (one that we and the autograder share).

Don't worry if you get this error; it's perfectly fine and you can continue:

  error: pathspec 'master' did not match any file(s) known to git.
  Error: Could not checkout master branch.  Trying to create it.

Non-Lab Computers

If you are only planning to use a lab computer, you may skip this part. However, if you plan to be using a non-lab computer, then you must first install the appropriate ssh private key for access to your central repository. Don't worry about what it is for now.

We highly recommend that you read section D of the instructional accounts guide if any of these commands confuse you.

First, get on your instructional account by ssh-ing into the account via your local computer. Once you are in your instructional account, type:

cd .ssh

Then, do the command:

ls

and you should see two files id_rsa and id_rsa.pub, which contain, respectively, the private and public ssh keys used by our repositories.

You can view the contents of the private key file, id_rsa, from the terminal by using the command:

cat id_rsa

Highlight the contents of id_rsa (be sure to highlight the -----BEGIN RSA PRIVATE KEY----- and -----END RSA PRIVATE KEY----- lines as well), and copy them.

Next, go to your personal computer's ~/.ssh directory (NOT the instructional account), and create a new file called cs61b_id_rsa (do not overwrite your own id_rsa file). You can do this all on the command line of your personal computer by running:

touch cs61b_id_rsa
vim cs61b_id_rsa

(Type i to insert)
(Right-click and paste)
(Press Esc)
(Type :wq and Enter to save and quit)

If you cannot see your ~/.ssh directory, your system is probably not displaying hidden files. To fix this issue, you should Google how to show hidden files (this will be an exercise on how to solve problems on your own!).

To use the key when ssh-ing, open on your local computer the file ~/.ssh/config (or create a file with that name if it does not already exist) and add the line:

IdentityFile ~/.ssh/cs61b_id_rsa

Finally, you must correctly set your cs61b_id_rsa file permissions by using the command:

chmod 400 ~/.ssh/cs61b_id_rsa

Otherwise, ssh will later refuse to use the key and print the message WARNING: UNPROTECTED PRIVATE KEY FILE!
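If you are unsure what mode 400 means, the following sketch applies it to a scratch file (a stand-in for the real key, so nothing in ~/.ssh is touched) and prints the resulting permission bits.

```shell
#!/bin/sh
# Sketch: the effect of "chmod 400", shown on a scratch file rather than a real key.
set -e
keyfile=$(mktemp)            # hypothetical stand-in for ~/.ssh/cs61b_id_rsa
chmod 400 "$keyfile"

# The first ten characters of "ls -l" give the file type and permission bits.
perms=$(ls -l "$keyfile" | cut -c1-10)
echo "$perms"                # the owner may read; group and others get nothing
rm -f "$keyfile"
```

Running the same ls -l command on ~/.ssh/cs61b_id_rsa is a quick way to check that your own key file is protected.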

Now, continue below under the sub-section that matches your operating system.

Mac or Linux

If you plan to be using a non-lab computer that is either a Mac or Linux machine, then you should set up a local repository on it according to the following section.

If you haven't already, first install the appropriate ssh private key for access to your central repository according to the "Non-Lab Computers" sub-section above.

We've packaged the rest of the setup described here in a Python3 script called remote-init-git-repo. You can access the script here. Create a new file called remote-init-git-repo on your personal computer, then copy and paste the contents of the script into your file. Then, run:

python3 DIR/remote-init-git-repo

(where DIR is where the file lives on your personal computer).

Alternatively, this script also resides on the instructional servers at ~cs61b/bin/other/remote-init-git-repo.

You should read the corresponding section of the Windows setup below to understand what this script is doing.

If you encounter this error when running the script:

Repo already exists

then type the command:

rm -rf ~/repo

Then you can rerun the script and it should work.

But be careful with this remove command in general (be sure not to misspell ~/repo). It will permanently delete your files.

Windows

If you plan to be using a non-lab computer that is a Windows machine, then you should set up a local repository on it according to the following section.

If you haven't already, first install the appropriate ssh private key for access to your central repository according to the "Non-Lab Computers" sub-section above.

Next, we will configure Git as if we are a student named Fred with login cs61b-***. Having installed Git, Fred first performs some general configuration that will apply to all repositories used from his account (for this course or elsewhere):

git config --global user.name "Fred Student"
git config --global user.email "fred.student@somemail.com"
git config --global push.default simple

The first two lines set the name and email that Git will record in commits and logs. The last line is a safety measure that affects the git push command described later.
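To see where these settings end up, here is a small sketch that sets the same three options and reads them back. As a precaution for the demonstration only, it points HOME at a scratch directory so that your real ~/.gitconfig is left alone; when configuring your own machine you would run just the three git config lines above.

```shell
#!/bin/sh
# Sketch: set the three global options and read them back.
# HOME is redirected to a scratch directory (an assumption of this demo)
# so that the real ~/.gitconfig is not modified.
set -e
HOME=$(mktemp -d)
export HOME
unset XDG_CONFIG_HOME        # make sure Git uses $HOME/.gitconfig

git config --global user.name "Fred Student"
git config --global user.email "fred.student@somemail.com"
git config --global push.default simple

name=$(git config --global --get user.name)
email=$(git config --global --get user.email)
push_default=$(git config --global --get push.default)
echo "$name <$email>, push.default=$push_default"
```

The command git config --global --list prints every global setting at once, which is handy for checking for typos.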

Fred initially establishes a working directory containing a local clone of his central repository in a directory ~/repo (actually, any name works; repo is the name we use in your instructional account):

cd
git clone cs61b-taa@derby.cs.berkeley.edu:students/cs61b-*** repo

This will copy the contents of Fred's personal bare repository on cs61b-taa into the new local working directory repo as repo/.git, and will then check out its head version into repo as well. Initially, this head version is the branch master and is empty.

There will be various resources that we provide, including skeleton (starter) files for projects and assignments. Fred can add a reference to these resources to his repository with the commands

cd repo
git remote add shared cs61b-taa@derby.cs.berkeley.edu:shared 

We'll see how to use this remote reference later.
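The clone and remote add steps can be tried without network access. In the sketch below, two local bare repositories (central.git and shared.git, names chosen for this example) stand in for the repositories on derby; everything else follows the commands above.

```shell
#!/bin/sh
# Sketch: clone a central repository and attach a second remote named "shared".
# Local bare repositories are hypothetical stand-ins for the ones on derby.
set -e
scratch=$(mktemp -d)
cd "$scratch"
git init -q --bare central.git   # stand-in for students/cs61b-***
git init -q --bare shared.git    # stand-in for the staff's shared repository

git clone -q central.git repo    # Git warns that the cloned repository is empty
cd repo
git remote add shared "$scratch/shared.git"

origin_url=$(git config --get remote.origin.url)
shared_url=$(git config --get remote.shared.url)
git remote -v                    # lists both remotes with their URLs
```

The clone command names its source origin automatically, which is why only shared has to be added by hand.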

D. Using Your Repository

Keep each assignment or project, ASSGN, in a subdirectory of that name in your working directory. Typically, we provide an initial set of files for each assignment. You can initialize an assignment directory, say for hw3, like this:

cd repo              # If not already there
git fetch shared     # Fetch current copy of skeleton files
git merge -m "Start hw3" shared/hw3
                     # Add or update your master directory from
                     # our remote version of hw3.
git push             # Save your local updated master directory 
                     # to your central repository.

Here, shared/hw3 is a remote branch containing a copy of the hw3 subdirectory from the staff repository (which we maintain).
The merge command combines the contents of this branch with the contents of your working directory, which in our case will add a directory called hw3. Later, if the staff makes changes to the skeleton after you have done this initial merge, you can use essentially the same sequence:

git fetch shared     # Fetch current copy of skeleton files
git merge -m "Get updates to hw3 skeleton" shared/hw3
git push

to add these changes to your files.
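The following sketch rehearses both merges locally. It makes several assumptions that are not part of the course setup: a local repository named staff plays the role of the shared remote, the student's repository starts from the same initial commit as the staff's, and the file names are invented. The git push step is left out because the sketch has no central repository.

```shell
#!/bin/sh
# Sketch: start hw3 from a skeleton branch, then pick up a later skeleton fix.
# "staff" is a hypothetical local stand-in for the shared remote.
set -e
scratch=$(mktemp -d)
cd "$scratch"

# Staff side: a base commit plus a branch "hw3" holding the skeleton.
git init -q staff
cd staff
git config user.name "Staff"
git config user.email "staff@example.com"
echo "CS61B" > README
git add README
git commit -q -m "Initial commit"
git checkout -q -b hw3
mkdir hw3
echo "// skeleton" > hw3/Skeleton.java
git add hw3
git commit -q -m "Add hw3 skeleton"
git checkout -q -                    # back to the branch we started on

# Student side: a clone whose remote is named "shared".
cd "$scratch"
git clone -q -o shared staff repo
cd repo
git config user.name "Fred Student"
git config user.email "fred.student@somemail.com"
echo "my notes" > notes.txt
git add notes.txt
git commit -q -m "Add notes"         # commit local work before merging

git fetch -q shared
git merge -q -m "Start hw3" shared/hw3

# Staff fixes the skeleton after the student's first merge.
cd "$scratch/staff"
git checkout -q hw3
echo "// fix from staff" >> hw3/Skeleton.java
git commit -q -a -m "Fix hw3 skeleton"

# Student repeats the same fetch and merge to receive the fix.
cd "$scratch/repo"
git fetch -q shared
git merge -q -m "Get updates to hw3 skeleton" shared/hw3
last_subject=$(git log -1 --format=%s)
echo "$last_subject"
```

After the second merge, the student's copy of hw3/Skeleton.java contains the staff's fix and notes.txt is untouched, since the two sides changed different files.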

Work on hw3 now proceeds as a sequence of edits and commits. After editing, adding, and deleting files, you first inform Git of any new files that it should start tracking. For example, if when working on hw3, you create files test1.inp and test1.out, you would use the command

git add test1.inp test1.out

(from inside the directory ~/repo/hw3). Or, if these files are stored in a new subdirectory called hw3/testing, you can use the command

git add testing

to add all the files in the testing directory. Once you add any new files, you can create a new commit (snapshot) for hw3 with

git commit -m "ADD YOUR COMMIT MESSAGE HERE"

You should replace the text inside the quotation marks with a commit message for the new commit. Descriptive commit messages are generally a good idea, since they help you identify commits when using the git log command to list the commits you have made. In later courses (and real life), they are especially useful for complex team projects where you are trying to keep other team members informed of what changes you've made and why.
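As an illustration of adding a whole directory, the sketch below builds a throwaway repository, adds a testing directory from inside hw3, and commits it. The file contents and the commit message are invented for the example.

```shell
#!/bin/sh
# Sketch: "git add" on a directory stages every file beneath it.
set -e
scratch=$(mktemp -d)
cd "$scratch"
git init -q repo
cd repo
git config user.name "Fred Student"
git config user.email "fred.student@somemail.com"

mkdir -p hw3/testing
echo "3 4" > hw3/testing/test1.inp
echo "7"   > hw3/testing/test1.out

cd hw3
git add testing                        # stages both files under hw3/testing
git commit -q -m "Add first hw3 test case"

tracked=$(git ls-files)                # tracked paths, relative to hw3
subject=$(git log -1 --format=%s)      # subject line of the newest commit
echo "$tracked"
echo "$subject"
```

The git ls-files command used here simply lists the files Git is tracking, which is another way to confirm that an add took effect.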

Before doing either of the git merge commands above (either to start or update an assignment), be sure to commit your work with git commit, since you may not otherwise be able to merge.

Before performing a git commit, it's a good idea to make sure that all your files are accounted for. The command

git status

will indicate any files that are untracked, meaning that git commit will pay no attention to them and will not save or update them. Generally, we suggest that you use git add on these files (or get rid of them entirely if they are unneeded) before committing. This way you avoid the annoying (and, alas, rather common) problem of thinking that you have submitted a file when you have not.

Files that are being tracked and have been changed must also be subjected to git add before committing; otherwise, the changes will not be committed. The git status command will tell you which files have "changes not staged for commit" and should therefore be added. Generally, however, I find it more convenient to use the command

 git commit -a -m ...

which will first add all these unstaged changes and then commit them. This does nothing with untracked files, so you will still need to check for them with git status and git add them.
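The difference between a modified tracked file and an untracked file shows up clearly in a small experiment. This sketch uses git status --porcelain, a terse form of git status meant for scripts, and invented file names; it is an illustration rather than part of the course workflow.

```shell
#!/bin/sh
# Sketch: "git commit -a" picks up modified tracked files but not untracked ones.
set -e
scratch=$(mktemp -d)
cd "$scratch"
git init -q repo
cd repo
git config user.name "Fred Student"
git config user.email "fred.student@somemail.com"

echo "first draft" > Foo.java
git add Foo.java
git commit -q -m "Add Foo.java"

echo "second draft, longer" > Foo.java   # tracked and now modified
echo "input" > test1.inp                 # new, so untracked

before=$(git status --porcelain)         # " M" marks unstaged changes, "??" untracked
git commit -q -a -m "Revise Foo.java"
after=$(git status --porcelain)

echo "before commit:"; echo "$before"
echo "after commit:";  echo "$after"
```

After the commit, only test1.inp is still listed, which is exactly the situation in which a file you meant to submit gets left behind.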

Periodically, you will want to transmit your work to your central repository on cs61b-taa (from which your local repository was cloned). This is especially true when you intend to hand it in or do further work on it from a different local repository. Also, pushing to the central repository provides you an additional backup of your work—one that you cannot accidentally erase. The command to push to your central repository is just

git push

which (assuming you've used the procedures described in this document for configuration and for creating assignments) will by default push your master branch and everything committed to it to the central repository. Don't try to push, however, without first committing any outstanding changes.

Git's distributed nature means that you can create an arbitrarily long sequence of commits before pushing them. It's not necessary to be connected to the cs61b-taa repositories (or indeed, the Internet) to use Git's version-control features. We've been suggesting that you execute

git push

after merges, but in fact you can delay this until you wish to submit your assignment or until you think you might need to transfer your work to another of your local repositories. Still, it is wise to use the push command with some regularity, since it provides an extra backup copy of your work on your central repository.

If you work on hw3 from two different local repositories (say from home and on the instructional machines), then (if you have used git push to push your work from one local repository to your central repository) you can bring the other local repository up to date with any changes you made with the command

git pull

(after first committing anything you've done to this local repository).
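Here is a sketch of that round trip between two local repositories. The directories home and lab and the bare repository central.git are hypothetical stand-ins created in a scratch directory; on real machines the two clones would live on different computers.

```shell
#!/bin/sh
# Sketch: push from one clone, pull into another, via a shared central repository.
set -e
scratch=$(mktemp -d)
cd "$scratch"
git init -q --bare central.git

# First clone ("home"): make a commit and push it.
git clone -q central.git home
cd home
git config user.name "Fred Student"
git config user.email "fred.student@somemail.com"
mkdir hw3
echo "version 1" > hw3/Foo.java
git add hw3
git commit -q -m "Start Foo.java"
git push -q

# Second clone ("lab"): starts out with the first commit.
cd "$scratch"
git clone -q central.git lab

# More work at home, pushed to the central repository.
cd "$scratch/home"
echo "version 2, revised" > hw3/Foo.java
git commit -q -a -m "Revise Foo.java"
git push -q

# Bring the lab clone up to date.
cd "$scratch/lab"
git pull -q
lab_contents=$(cat hw3/Foo.java)
echo "lab now has: $lab_contents"
```

Because the lab clone had no commits of its own, the pull simply moves it forward to the newest commit; no merge commit is needed.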

One last thing. Periodically, you will run into merge conflicts. For more information on merge conflicts themselves and how to resolve them, please read this documentation.

E. Submitting Your Work

The staff does not immediately see changes to your local repositories. That is, when you modify, add, or delete a file or when you execute git commit, we do not see these changes, since your central repository under cs61b-taa is not changed. To be seen by us (or our testing software), your commits must be pushed as described in the preceding section.

Furthermore, we don't treat all your commits, even when pushed, as submissions until you mark them as such. To submit one of your committed versions, create (and subsequently push) an appropriately named tag. For example, when you first want to submit hw3, first commit any changes in your hw3 directory, and then do this:

git tag hw3-1

A tag is a named reference to a particular commit. After using git tag on a commit, you can later check out that commit by name to check what you committed. For example,

git checkout hw3-1

(After examining the commit at this tag, do be sure to git checkout master in order to get back to your development branch. Otherwise, you will create great confusion for yourself.)
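The following sketch walks through inspecting a tagged commit and returning to the development branch. It runs in a throwaway repository with invented file contents, and it looks up the branch name instead of assuming it, since a freshly initialized repository may not call its branch master.

```shell
#!/bin/sh
# Sketch: look at a tagged commit, then return to the development branch.
set -e
scratch=$(mktemp -d)
cd "$scratch"
git init -q repo
cd repo
git config user.name "Fred Student"
git config user.email "fred.student@somemail.com"

echo "submitted version" > Foo.java
git add Foo.java
git commit -q -m "Version to submit"
git tag hw3-1
branch=$(git symbolic-ref --short HEAD)   # master in the course setup

echo "later work, not submitted" > Foo.java
git commit -q -a -m "Keep working"

git checkout -q hw3-1                     # working directory now shows the tagged commit
at_tag=$(cat Foo.java)
git checkout -q "$branch"                 # back to the development branch
at_branch=$(cat Foo.java)

echo "at hw3-1:  $at_tag"
echo "at branch: $at_branch"
```

While the tag is checked out, Git is not on any branch, which is why new commits made in that state are easy to lose track of.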

Submission is not complete until you push the work to us:

git push         # To push your hw3 commits (if not yet done)
git push --tags  # To push hw3-1 (and any other tags)

Subsequent submissions should be named hw3-2, hw3-3, etc. We take the highest-numbered tag as your final submission. You can submit at any time, even when you have many intervening commits. For example, if you have submitted hw3-1 and hw3-2 and decide that the last submission is bogus, and the first one was better, you can execute

git tag hw3-3 hw3-1

which makes hw3-3, the latest submission, a synonym for hw3-1. In fact, if the commit you want to submit was not previously tagged, you can find its unique id using git log and then tag that. For example, you might see

git log
commit ff39e11f5e292a0c81f3cb65c2a39c7b301a595a
Author: Fred Student <fred.student@somemail.com>
Date:   Tue Jan 27 16:32:17 2015 -0800

    Experimentally refactor my solution to problem 3.

commit 4f7d9e65744c8b528289746bf911cb81ded7c5e2
Author: Fred Student <fred.student@somemail.com>
Date:   Mon Jan 26 15:36:28 2015 -0800

    Add tests.
    No errors detected so far.

commit 2aea9782d7000bb07277617b9f81bea485374d27
Author: Fred Student <fred.student@somemail.com>
Date:   Thu Jan 22 15:34:55 2015 -0800

    Begin work on hw3.

Now to submit the second commit back (from 1/26) as your first submission, execute

git tag hw3-1 4f7d9e

(The unique ids in Git are hexadecimal SHA-1 hashcodes of the contents of the commits. You only need to specify a sufficiently long prefix of the hashcode to uniquely identify which commit you mean.)

Again, after adding any new tags, you must use git push --tags to push them to the repository that the staff (and autograder) see.

Submission dates and times will be taken from the time of the commit tagged by hw3-n, and not from the time you created the tag.
While it is possible to delete tags, it shouldn't be necessary, since the autograder will ignore tags that don't refer to known assignments and you can always supersede a tag with a higher-numbered one.
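The whole submission sequence can be rehearsed locally. In this sketch a bare repository named central.git (a made-up stand-in for your central repository) receives the tags, and the commits and file contents are invented. It tags an earlier commit by its abbreviated id, tags the current commit, and then creates a third tag as a synonym for the first.

```shell
#!/bin/sh
# Sketch: create submission tags and push them to a central repository.
set -e
scratch=$(mktemp -d)
cd "$scratch"
git init -q --bare central.git           # hypothetical stand-in for the central repository
git clone -q central.git repo
cd repo
git config user.name "Fred Student"
git config user.email "fred.student@somemail.com"

echo "step 1" > Foo.java
git add Foo.java
git commit -q -m "Begin work on hw3."
echo "step 2, with tests" > Foo.java
git commit -q -a -m "Add tests."
second=$(git rev-parse --short HEAD)     # abbreviated id of the second commit
echo "step 3, experimental refactoring" > Foo.java
git commit -q -a -m "Experimentally refactor my solution."

git tag hw3-1 "$second"                  # submit the earlier commit by id
git tag hw3-2                            # submit the current commit
git tag hw3-3 hw3-1                      # resubmit the commit that hw3-1 names

git push -q                              # push the commits
git push -q --tags                       # push the tags

first_id=$(git rev-parse hw3-1)
third_id=$(git rev-parse hw3-3)
central_tags=$(git -C "$scratch/central.git" tag)
echo "$central_tags"
```

Listing the tags in central.git confirms what the staff side would see; until git push --tags has run, that list stays empty no matter how many tags exist locally.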

F. Quick Summary

These commands assume you have account cs61b-***.

  1. To initialize Git on a particular system (this is already done on the instructional machines):

    git config --global user.name "Fred Student"
    git config --global user.email "fred.student@somemail.com"
    git config --global push.default simple      # Suggested
  2. To create a local copy of your personal repository in directory repo and connect it to our shared repository:

    git clone cs61b-taa@derby.cs.berkeley.edu:students/cs61b-*** repo
    cd repo
    git remote add shared cs61b-taa@derby.cs.berkeley.edu:shared 
  3. To start an assignment named ASSGN (e.g., hw3), from our skeleton, first make sure all your local work is committed, and then use

    cd repo              # If not already there
    git fetch shared     # Fetch current copy of skeleton files
    git merge -m "Start assignment ASSGN" shared/ASSGN
    git push
  4. To see the current status of a repository, including files that have been added, removed, or modified; files that are in the working directory, but not in the current commit ("untracked"); and discrepancies between the current branch and the remote branch it is tracking (gets pushed to or pulled from):

     git status

    The message will tell you how to undo changes made since the last commit, should you want to.

  5. To start tracking a file (or directory) Foo.java, so that it will be added to the repository on the next commit:

     git add Foo.java
  6. To commit modifications to all tracked files in the local repository:

    git commit -a

    This does nothing with untracked files.

  7. To transmit commits on the current branch to the remote (cs61b-taa) repository:

    git push
  8. To fetch new commits from the cs61b-taa repository that have been pushed from another local repository (commit current work first):

    git pull
  9. To submit assignment ASSGN, make sure everything you want is committed and then execute

    git tag ASSGN-n
    git push
    git push --tags

    where n is a sequence number larger than those of existing tags.

  10. To see tags that you have created (not necessarily pushed):

    git tag