Using Git in CS164

Git is a distributed version-control system that has become increasingly popular in the open-source community. Developers within a team (or in our case, a class) each work on separate repositories, and may from time to time synchronize all or part of the contents of their repository with one or more other repositories. In fact, there need be no central repository.

This document describes a minimal set of commands for using Git in this course to submit assignments and acquire skeleton files. It is not a tutorial or introduction to Git. Consult this Git documentation for an overview of Git and details of its various commands.

Preliminaries

If you are working from your own machine, be sure to install Git, if it is not already installed. There are downloads available for various systems on this site.

Next, install the appropriate ssh private key for access to our repositories, which live on the instructional machines. First, get an instructional account for CS164 if you don't already have one. There, you'll find a directory ~/.ssh, with files id_rsa and id_rsa.pub, which contain, respectively, the private and public ssh keys used by our repositories. For non-Windows systems, copy id_rsa to your home computer's .ssh directory, giving it a unique name other than id_rsa (for example, cs164_id_rsa). Otherwise, you are liable to overwrite a secret key file of that name that you might have created for your own purposes. You can get ssh to use this key when appropriate by adding a line

IdentityFile ~/.ssh/cs164_id_rsa
to the file .ssh/config (creating that file if it does not already exist.) WARNING: The instructions above do not apply to the instructional machines, where your ssh key is already set up.
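As a possible refinement, the IdentityFile line can be placed under a Host entry so that ssh offers the course key only when connecting to the machine that holds the repositories. The following is a minimal sketch, not an official course script: the function name add_cs164_key is hypothetical, the host name is the one used in the clone commands later in this document, and the key name is the example name suggested above. Like the instructions above, it is meant for your own machine, not the instructional machines.

```shell
# Append a host-specific entry for the course key, unless the key is
# already mentioned in the config file.
# The config path is a parameter (default ~/.ssh/config) so the function
# can be tried on a scratch file first. add_cs164_key is a hypothetical name.
add_cs164_key() {
    config="${1:-$HOME/.ssh/config}"
    mkdir -p "$(dirname "$config")"
    touch "$config"
    if grep -q 'cs164_id_rsa' "$config"; then
        return 0          # already configured; leave the file alone
    fi
    cat >> "$config" <<'EOF'
Host ashby.cs.berkeley.edu
    IdentityFile ~/.ssh/cs164_id_rsa
EOF
    chmod 600 "$config"   # ssh expects the config file to be private
}
```

Calling add_cs164_key with no argument edits ~/.ssh/config; calling it a second time changes nothing.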

In what follows, we'll consider a student named Fred with login cs164-xxx. Having installed Git, Fred first performs some general configuration that will apply to all repositories used from his account (for this course or elsewhere):

    $ git config --global user.name "Fred Student"
    $ git config --global user.email "fred.student@somemail.com"
    $ git config --global push.default simple
The first two lines set the name and email that Git will record in commits and logs. The last line is a safety measure that affects the git push command described later.
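To confirm that these settings took effect, Fred could read them back. The helper below is only a sketch; the name check_git_setup is hypothetical, and it relies on nothing beyond git config --global --get, which exits with a nonzero status when a key is not set.

```shell
# Report which of the three settings above are present in the global
# Git configuration. Returns 0 when all are set, 1 otherwise.
# check_git_setup is a hypothetical helper name.
check_git_setup() {
    missing=0
    for key in user.name user.email push.default; do
        if value=$(git config --global --get "$key"); then
            echo "$key = $value"
        else
            echo "$key is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_git_setup after the three git config commands should list all three values.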

Setting up Repositories

Git terminology uses the term repository to mean an organized collection of versions (called commits) of a directory structure; plus a checked-out copy of one of those commits (a working directory), possibly in the process of modification; plus a staging area (called the index) used to build another commit. Usually, the set of commits and the index are stored in a directory named .git at the top level of the working directory. The term bare repository refers to a directory containing only the set of commits (what would be a .git directory in an ordinary repository, but with no index). Typically, we use bare repositories as central copies of versions that will be shared by several repositories.
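The difference between the two kinds of repository can be seen by creating one of each in a scratch directory. This is an illustration only; the directory names ordinary and bare.git are made up for the example, and nothing outside the scratch directory is touched.

```shell
# Build one ordinary and one bare repository in a scratch directory
# and compare them.
scratch=$(mktemp -d)

git init -q "$scratch/ordinary"          # working directory plus .git
git init -q --bare "$scratch/bare.git"   # commits only, no working directory

# An ordinary repository keeps its data in a .git subdirectory ...
ls -d "$scratch/ordinary/.git"
# ... while a bare repository has HEAD, objects, refs, etc. at its top level.
ls "$scratch/bare.git"

# Git itself can report which kind of repository it is looking at.
ordinary_is_bare=$(git -C "$scratch/ordinary" rev-parse --is-bare-repository)
bare_is_bare=$(git -C "$scratch/bare.git" rev-parse --is-bare-repository)
echo "ordinary: bare=$ordinary_is_bare"
echo "bare.git: bare=$bare_is_bare"
```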

Each student and each team in this class has a bare Git repository in which to develop and submit assignments. More specifically, for homework, we provide a set of bare repositories under the cs164-taa account, which authorized students may clone, pull from, or push to as desired. Later, you will get a separate GitHub repository.

Fred establishes a working directory containing a local copy of his private repository in a directory (let's say ~/cs164-hw) on his home and/or instructional account with the command

    $ git clone cs164-taa@ashby.cs.berkeley.edu:users/cs164-xxx cs164-hw
This will create Fred's personal bare repository on cs164-taa (if necessary) and copy its contents into the new local working directory cs164-hw as cs164-hw/.git. If there is a head version in that repository (as will happen when Fred creates a second local repository after having committed a few versions), it will be checked out to form the initial contents of the working directory (which is otherwise empty). Fred can use cs164-hw for one-person assignments—generally homework.

There will be various resources that we provide, including skeleton files for projects and assignments. Fred can add a reference to these resources to his repository with the commands

    $ cd ~/cs164-hw
    $ git remote add shared cs164-taa@ashby.cs.berkeley.edu:shared 

When Fred clones his repository for the first time, he'll get a message to the effect that he has cloned an empty repository. We suggest that he start things off like this:

    $ git fetch shared
    $ git checkout -b master shared/Initial
    $ git push -u origin master
This has the effect of making sure that all branches, submissions, and repositories have the same commit as their ultimate ancestor. This command sequence should never be repeated here or in any other clone of this repository.

Using Your Repository

Keep each assignment, ASSGN, in a subdirectory of that name in your working Git directory. Typically, we provide an initial set of files for each assignment. Fred can initialize his own assignment directory, say for hw3, like this:

    $ cd ~/cs164-hw
    $ git fetch shared
    $ git merge shared/hw3 -m "Start assignment hw3."
    $ git push -u origin
This fetches the staff's hw3 skeleton files from cs164-taa, then merges a copy of that hw3 into his local repository on the master branch. Finally, it copies the updated master branch back to his bare repository on cs164-taa.

Work on hw3 now proceeds as a sequence of edits and commits. After editing, adding, and deleting files, Fred first informs Git of any new files that it should start tracking. For example, if when working on hw3, Fred creates files test1.inp and test1.out, he would use the command

    $ git add test1.inp test1.out
(from inside the directory ~/cs164-hw/hw3). Or, if these files are stored in a new subdirectory called hw3/testing, he can use the command
    $ git add testing
Once he adds any new files, he can create a new commit with
    $ git commit -a
This will prompt him to write a log entry for the new commit. Descriptive log entries are generally a good idea, especially for complex team projects where team members are trying to keep each other informed of what changes were made and why.
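The reason for the git add step is that git commit -a picks up changes to tracked files only. The following sketch demonstrates this in a scratch repository; the file solution.txt is made up for the example, and the identity is set locally so that no global configuration is needed or changed.

```shell
# Scratch repository with a locally configured identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Fred Student"
git config user.email "fred.student@somemail.com"

echo "first version" > solution.txt
git add solution.txt
git commit -q -m "Add solution."

# Edit a tracked file and create a brand-new, untracked one.
echo "second version" > solution.txt
echo "input data" > test1.inp

git status --short          # " M solution.txt" and "?? test1.inp"
git commit -q -a -m "Revise solution."

# The edit was committed, but test1.inp is still untracked.
left_over=$(git status --short)
echo "$left_over"           # "?? test1.inp"
```

Running git status --short before each commit is a cheap way to spot files that still need git add.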

Periodically, Fred will want to transmit his work to his personal repository on cs164-taa, from which he cloned his local repository. This is especially true when he intends to hand it in or make further edits from a different local repository. After the initial cloning of the repository, the command to do so is just

    $ git push
which, since Fred has used the procedures described in this document for configuration and for creating assignments, will by default push the current branch (typically, the master branch for Fred's personal repository) to the remote repository that it is tracking (his repository on cs164-taa). He can also write it out more explicitly as
    $ git push origin master

Don't push things, however, without first committing any outstanding changes. Git's distributed nature means that you can create an arbitrarily long sequence of commits before pushing them (but don't do this unless you have to: every moment you don't push is another opportunity for your dog to urinate on your laptop and cause you to lose all your work). It's not necessary to be connected to the cs164-taa repositories (or indeed, the Internet) to use Git's version-control features.
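One way to see how many commits are still waiting to be pushed is to compare the current branch with the remote branch it tracks. The sketch below uses a local bare repository in a scratch directory as a stand-in for the cs164-taa repository (an assumption made so the example runs without network access); in Fred's real working directory only the git rev-list line would be needed.

```shell
scratch=$(mktemp -d)
git init -q --bare "$scratch/remote.git"     # stand-in for the cs164-taa repository
git clone -q "$scratch/remote.git" "$scratch/work" 2>/dev/null
cd "$scratch/work"
git config user.name "Fred Student"
git config user.email "fred.student@somemail.com"

echo one > notes.txt
git add notes.txt
git commit -q -m "First commit."
git push -q -u origin HEAD                   # publish the branch and set its upstream

echo two >> notes.txt
git commit -q -a -m "Second commit (not pushed yet)."

# Count commits on the current branch that its upstream does not have yet.
unpushed=$(git rev-list --count '@{u}..HEAD')
echo "unpushed commits: $unpushed"

git push -q
unpushed_after=$(git rev-list --count '@{u}..HEAD')
echo "unpushed commits after push: $unpushed_after"
```

Here '@{u}' names the upstream of the current branch, so the count drops to zero once the push succeeds.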

Submitting Your Work

The staff does not immediately see changes to your local repositories. That is, when you modify, add, or delete a file or when you execute git commit, we do not see these changes, since your repository under cs164-taa is not changed. To be seen by us, your commits must be pushed, as described in the preceding section.

Furthermore, we don't treat all your commits, even when pushed, as submissions until you mark them as such. To submit one of your committed versions, create (and subsequently push) an appropriately named tag. For example, when Fred first wants to submit hw3, then after committing any changes in his hw3 directory, he can do this:

    $ git tag hw3-1
Submission is not complete until he pushes the work to us:
    $ git push         # To push the commits for hw3 (if not yet done)
    $ git push --tags  # To push hw3-1 (and any other tags)

Subsequent submissions should be named hw3-2, hw3-3, etc. We take the highest-numbered tag as Fred's final submission. He can submit at any time, even when he has many intervening commits. For example, if he has submitted hw3-1 and hw3-2 and decides that the last submission is bogus, and the first one was better, he can execute

    $ git tag hw3-3 hw3-1
which makes hw3-3, the latest submission, a synonym for hw3-1. Alternatively, if the commit he wants to submit was not previously tagged, Fred can find its unique id using git log and then tag that. For example, he might see
 
    $ git log
    commit ff39e11f5e292a0c81f3cb65c2a39c7b301a595a
    Author: Fred Student <fred.student@somemail.com>
    Date:   Tue Jan 27 16:32:17 2015 -0800

        Experimentally refactor my solution to problem 3.

    commit 4f7d9e65744c8b528289746bf911cb81ded7c5e2
    Author: Fred Student <fred.student@somemail.com>
    Date:   Mon Jan 26 15:36:28 2015 -0800

        Add tests.
        No errors detected so far.

    commit 2aea9782d7000bb07277617b9f81bea485374d27
    Author: Fred Student <fred.student@somemail.com>
    Date:   Thu Jan 22 15:34:55 2015 -0800

        Begin work on hw3.
Now to submit the second commit back (from 1/26) as his first submission, he could execute
    $ git tag hw3-1 4f7d9e
(The unique ids in Git are hexadecimal SHA-1 hashcodes of the contents of the commits. You only need to specify a sufficiently long prefix of the hashcode to uniquely identify which commit you mean.)

Again, after adding any new tags, Fred must use git push --tags to push them to the repository that the staff (and autograder) see.

Submission dates and times will be taken from the time of the commit tagged by hw3-n, and not from the time the tag was created.

You can delete a tag locally, but we have set up the repository to prevent you from doing this on cs164-taa's repositories. It shouldn't be necessary in any case, since the autograder will ignore tags that don't refer to known assignments and you can always supersede a tag with a higher-numbered one.
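Since each new submission needs a number higher than any existing one, it can be convenient to have the next tag name computed. The helper below is a sketch, not part of the course tooling; the name next_submission_tag is hypothetical. It looks only at tags present in the local repository, so if Fred works from more than one clone he would want to run git fetch --tags first.

```shell
# Print the next unused submission tag for an assignment, e.g. "hw3-3"
# when hw3-1 and hw3-2 already exist. Run inside the repository.
# next_submission_tag is a hypothetical helper name.
next_submission_tag() {
    assgn=$1
    # Keep only tags of the form ASSGN-<number>, then take the largest number.
    last=$(git tag -l "$assgn-*" |
           sed -n "s/^$assgn-\([0-9][0-9]*\)\$/\1/p" |
           sort -n | tail -n 1)
    echo "$assgn-$(( ${last:-0} + 1 ))"
}
```

For example, git tag "$(next_submission_tag hw3)" followed by git push and git push --tags would submit the current commit under the next free number.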

Quick Summary

These commands assume you have account cs164-xxx and team OurTeam.
  1. To initialize Git on a particular system:
        $ git config --global user.name "Fred Student"
        $ git config --global user.email "fred.student@somemail.com"
        $ git config --global push.default simple      # Suggested
    
  2. To create a local copy of your personal repository in directory cs164-hw and connect it to our shared repository:
        $ git clone cs164-taa@ashby.cs.berkeley.edu:users/cs164-xxx cs164-hw
        $ cd cs164-hw
        $ git remote add shared cs164-taa@ashby.cs.berkeley.edu:shared 
    
  3. To start an assignment named ASSGN (e.g., hw3), from our template:
        $ cd cs164-hw          # If not already there
        $ git fetch shared
        $ git merge shared/ASSGN -m "Starting assignment ASSGN."
        $ git push 
    
  4. To see the current status of a repository, including files that have been added, removed, or modified; files that are in the working directory, but not in the current commit ("untracked"); and discrepancies between the current branch and the remote branch it is tracking (gets pushed to or pulled from):
        $ git status
    
    The message will tell you how to undo changes made since the last commit, should you want to.
  5. To start tracking a file or directory F, so that it will be added to the repository on the next commit:
        $ git add F
    
  6. To commit modifications to all tracked files in the local repository:
        $ git commit -a
    
    This does nothing with untracked files.
  7. To transmit commits on the current branch to the remote (cs164-taa) repository:
        $ git push
    
  8. To fetch new commits for the current branch in the cs164-taa repository that have been pushed from another local directory (commit current work first):
        $ git pull --rebase
    
  9. To submit assignment ASSGN (assuming it is in the current branch):
        $ git tag ASSGN-n
        $ git push
        $ git push --tags
    
    where n is a sequence number larger than those of existing tags.
  10. To see tags that you have created (not necessarily pushed):
        $ git tag
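Because git tag lists local tags whether or not they have been pushed, it can help to compare them with the tags the remote repository actually has. The following is a sketch under stated assumptions: the function name unpushed_tags is hypothetical, it compares tag names only (not the commits they point to), and it needs to be able to reach the remote.

```shell
# List local tags that the remote named by $1 (default: origin) lacks.
# unpushed_tags is a hypothetical helper name.
unpushed_tags() {
    remote=${1:-origin}
    # git ls-remote prints "<hash><TAB>refs/tags/NAME"; annotated tags also
    # appear a second time as NAME^{}, which is stripped here.
    remote_tags=$(git ls-remote --tags "$remote" |
                  sed -n 's|.*refs/tags/||p' |
                  sed 's|\^{}$||' |
                  sort -u)
    for tag in $(git tag); do
        if ! printf '%s\n' "$remote_tags" | grep -qxF -- "$tag"; then
            echo "$tag"
        fi
    done
}
```

If unpushed_tags prints anything, those tags are not yet visible to the staff; git push --tags would transmit them.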