Using GIT

This page provides some GIT "best practices" based upon the author's experience. I welcome suggestions from other GIT users to help improve this page and to learn other ways to use GIT (which is a very useful, albeit complicated, tool).

Creating a Redmine remote repository

Go to a project's settings and click the "Repository" tab. Select Git as the repo format. There can only be one main repository, but there can be many secondary ones, each of which requires a unique identifier. You can give the main repository an identifier, but the field is usually left blank; when blank, the main repository's identifier is set to the project's identifier. Keep the identifier simple: it gets appended to the project's identifier, and you run into problems when the combined name is over 30 characters. For the path to the repo, use "create_me". Enable the "Report last commit" feature.

Wait half an hour or so for the system to create the empty repository. If you refresh the "Settings/Repository" page, you'll eventually see "create_me" change to the actual path of the repository.

Cloning a Repository

git is a distributed SCM system, which means that, in git's view, no repository is more important than any other. It is a project's agreed-upon convention, not git, that makes a particular repository the "official" one. Redmine projects can have a git repository associated with them. In these cases, the Redmine copy is the "official" copy and every remote copy tries to stay synchronized with it.

Project repositories in Redmine have a unique identifier. To allow all members of a project to make changes, Redmine creates a user account and adds the project's members to its .k5login file. A valid, active Kerberos ticket is required to interact with Redmine. The user account name is the identifier with "p-" prepended. If, for example, your project's identifier were my-project, you could clone its repository with this command:

$ git clone ssh://p-my-project@cdcvs.fnal.gov/cvs/projects/my-project

This command creates a subdirectory called my-project and places the contents of the repository in it. This directory is called the working directory. It contains the source code you can edit as well as a copy of the repository itself (in the .git directory; don't mess with its contents!). git remembers, in its local configuration, which repository your copy came from. By default, that repository is given the label origin (see the git remote command below).
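Here is a minimal sketch of what a clone leaves behind. Since the Redmine server isn't available everywhere, it assumes an empty bare repository in a temporary directory as a stand-in for the remote; the paths and the variable names (sandbox, origin_url) are made up for the example.

```shell
set -e
sandbox=$(mktemp -d)

# Stand-in for the Redmine-hosted repository: an empty bare repository on local disk.
git init --quiet --bare "$sandbox/my-project.git"

# Clone it. git creates the working directory "my-project" with a .git directory inside.
# (Cloning an empty repository prints a harmless warning, silenced here.)
git clone --quiet "$sandbox/my-project.git" "$sandbox/my-project" 2>/dev/null

# The source of the clone is recorded in the local configuration under the label "origin".
origin_url=$(git -C "$sandbox/my-project" config --get remote.origin.url)
echo "origin -> $origin_url"
```

With a real project, only the URL in the clone command changes; the .git directory and the origin label are created the same way.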

Saving Your Changes

After a development/debugging cycle, you'll have some changes that need to be saved. The sections below show how to commit them to your local repository and then how to forward those commits to the remote repository.

Saving to the Local Repository

Committing to the local repository takes two steps in git. The first step is to specify what needs to be committed. The second step is to actually commit it. This may seem like extra work, and it is, but it provides some powerful abilities that other SCMs don't allow.

$ git add file1 [file2 ...]

This command adds all the changes in the specified files to the commit staging area. The same command is used to add a new file to git (the changes being staged, in that case, are the entire contents of the file).

$ git commit

All the changes in the staging area will be committed. An editor will be started so the commit message can be entered.
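The two-step commit can be tried out in a throwaway repository. The sketch below is only an illustration: the file names, the temporary directory, and the identity variables are assumptions, and -m is used to supply the commit message so no editor opens.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
git init --quiet "$sandbox/work"
cd "$sandbox/work"

echo "fix for the bug" > file1
echo "unrelated notes" > file2

# Step 1: stage only file1. file2 stays out of the staging area.
git add file1
# Step 2: commit what is staged.
git commit --quiet -m "Add file1"

committed=$(git ls-tree --name-only HEAD)
untracked=$(git ls-files --others --exclude-standard)
echo "committed: $committed, still untracked: $untracked"
```

Only file1 ends up in the commit; file2 is left for a later commit of its own.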

$ git add -p

This is a powerful version of the add command. The -p option puts the add command in an interactive "patch mode": git presents each group of changes in each file and asks whether you want to add them to the commit. For instance, while fixing a bug you discover a typo in an unrelated comment. Rather than making one commit with both of your fixes, you can use "git add -p" to add the typo patch, commit it, then stage the rest of the changes and commit them. Keeping each commit as small as possible helps when you get the inevitable conflict when merging, so the -p option is a fantastic feature.

For those of you who don't subscribe to formal processes, the following command stages all changes to files git already tracks and commits them in one step (new, untracked files still need "git add"). This essentially does the same as cvs commit. I don't recommend using it, though.

$ git commit -a

Saving to the Remote Repository

After a "git commit", your changes are only in your local copy of the repository. git is a DVCS with no central, official repository so committing to the local repo is considered enough. This allows developers to work on projects and commit their work without a network even being present. It's a project's decision to specify which repository eventually needs to receive everyone's efforts.

In most cases, you "push" your changes back to the original repo with the push command.

$ git push

When a project is cloned, git automatically records from where the source came. It assumes that eventually you're going to send your changes back to it. git gives it the easier-to-remember label, "origin".

When a branch is first pushed to a remote repository, you need to specify the target repo and the branch name. In fact, if you clone an empty repository (like when you make a new project on Redmine), you have to use this extended command to push the first commit(s) of the master branch!

$ git push origin master
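As a rough sketch of that first push, the following uses an empty bare repository in a temporary directory in place of a new Redmine project. The paths, file name, and identity variables are assumptions; the symbolic-ref line is only there to make sure the branch is called master no matter how the local git is configured.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)

# An empty remote, like a freshly created project repository.
git init --quiet --bare "$sandbox/remote.git"
git clone --quiet "$sandbox/remote.git" "$sandbox/work" 2>/dev/null
cd "$sandbox/work"
# Name the not-yet-born branch "master" regardless of the local git default.
git symbolic-ref HEAD refs/heads/master

echo "hello" > README
git add README
git commit --quiet -m "First commit"

# The remote has no master branch yet, so name the target repo and the branch explicitly.
git push --quiet origin master

local_head=$(git rev-parse HEAD)
remote_head=$(git --git-dir="$sandbox/remote.git" rev-parse refs/heads/master)
echo "local:  $local_head"
echo "remote: $remote_head"
```

After the push, the remote's master refers to the same commit as the local one.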

Commands Providing Information

When in the working directory, git provides several commands that provide useful information.

$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean

This command shows the current status of the working directory. The top of the output indicates the branch that is checked out. The command also shows which files are ready to commit; which files have changes but haven't been staged for commit; and which files aren't tracked by git at all. git status doesn't use the network, so it doesn't contact the remote repository; it only uses information cached by the last command that talked to the remote (such as git pull or git push). This means that if you've committed twice since the last git pull, the output of git status will say something like "Your branch is ahead of 'origin/master' by 2 commits."
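The "ahead" count can be reproduced without a network. This sketch assumes a local bare repository as the remote and made-up file contents; the -u option on the push is my addition so that master knows which remote branch it follows, and git rev-list --count is used to get the number on its own.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)

git init --quiet --bare "$sandbox/remote.git"
git clone --quiet "$sandbox/remote.git" "$sandbox/work" 2>/dev/null
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master

echo "one" > notes.txt
git add notes.txt
git commit --quiet -m "First commit"
# -u also records origin/master as the branch that master follows.
git push --quiet -u origin master

# Two local commits that are never pushed.
for n in two three; do
    echo "$n" >> notes.txt
    git add notes.txt
    git commit --quiet -m "Add $n"
done

# Both commands work purely from cached information; the remote is not contacted.
ahead=$(git rev-list --count origin/master..HEAD)
git status
echo "commits ahead of origin/master: $ahead"
```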

$ git log

Displays the history of commit messages along with the commit hash, date of commit, and who committed it.

$ git diff

The diff command shows the differences between the files in the working directory and the staging area (not the repository!). In other words, it shows the changes that haven't been staged for commit. The diff command has many, many options. All git commands recognize a set of special symbols; HEAD, for instance, refers to the head commit of the current branch. So "git diff HEAD" shows the differences between the working directory and the last commit in your local repository, regardless of what has been staged. You can also compare against a remote repo with "git diff origin/master", which compares the working directory against the master branch of the remote repository, origin, as it was the last time you contacted it.
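To see how the comparisons differ, here is a small sketch with one staged and one unstaged change. The file name and the count_added helper are made up for the example, and "git diff --cached" (staging area against the last commit) is an option not covered above that I include for contrast.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
git init --quiet "$sandbox/work"
cd "$sandbox/work"

echo "one" > notes.txt
git add notes.txt
git commit --quiet -m "First commit"

echo "two" >> notes.txt
git add notes.txt            # "two" is now staged
echo "three" >> notes.txt    # "three" exists only in the working directory

# Hypothetical helper: count added lines in a diff (the "+++" file header is excluded).
count_added() { grep -c '^+[^+]' || true; }

unstaged=$(git diff | count_added)          # working directory vs. staging area
staged=$(git diff --cached | count_added)   # staging area vs. last commit
since_head=$(git diff HEAD | count_added)   # working directory vs. last commit

echo "unstaged: $unstaged, staged: $staged, since HEAD: $since_head"
```

Plain "git diff" sees only the unstaged line, while "git diff HEAD" sees both.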

$ git branch -v

Shows all the branches in your local repository and marks the one currently checked-out in your working directory (with an asterisk.)

$ git remote -v

Shows all remote repositories the local repository interacts with. 99% of the time, this will only display the default remote repository, origin, with its network path.

Getting Updates from Remote

The main purpose of SCMs is to keep software in sync with all its developers. This means occasionally pulling in changes made by colleagues. git's pull command is the simplest way to do this.

$ git pull

This command retrieves, from the remote repository, changes not present in the local repository and applies them. Then it attempts to merge the changes to the checked out copy. Usually the merge succeeds but, once in a while, the remote change modifies a section of code that has local changes. This is called a "conflict". git will mark the conflicting areas of the file and won't finish the merge until you resolve all conflicts. Using an editor, review the marked up areas and decide how the final version should look. When all conflicts have been resolved, use "git add" to add the resolved files and "git commit" to finish the merge.
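A conflict is easier to understand after provoking one on purpose. The sketch below assumes two clones of a local bare repository, named alice and bob purely for illustration, that change the same line. The --no-rebase option is my addition: it tells newer versions of git to merge (the behavior described above) rather than ask how to reconcile the two histories, and --no-edit accepts the default merge message.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/remote.git"

# First developer creates the file and pushes it.
git clone --quiet "$sandbox/remote.git" "$sandbox/alice" 2>/dev/null
cd "$sandbox/alice"
git symbolic-ref HEAD refs/heads/master
echo "color = red" > settings.txt
git add settings.txt
git commit --quiet -m "Add settings"
git push --quiet origin master

# Second developer clones the same state.
git clone --quiet --branch master "$sandbox/remote.git" "$sandbox/bob"

# Both now change the same line; the first developer pushes first.
cd "$sandbox/alice"
echo "color = blue" > settings.txt
git add settings.txt
git commit --quiet -m "Use blue"
git push --quiet origin master

cd "$sandbox/bob"
echo "color = green" > settings.txt
git add settings.txt
git commit --quiet -m "Use green"

# The pull stops with a conflict (non-zero exit), so keep the script going.
git pull --quiet --no-rebase || true
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted file: $conflicted"

# Resolve by deciding on the final content, then add and commit to finish the merge.
echo "color = teal" > settings.txt
git add settings.txt
git commit --quiet --no-edit
merge_parents=$(git rev-list --parents -n 1 HEAD | awk '{print NF - 1}')
echo "the merge commit has $merge_parents parents"
```

In real life you would open settings.txt in an editor and look at the conflict markers before deciding; overwriting the file is just the shortest way to show the add-and-commit step.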

"git pull" and "git push" can get most of the job done. However, when two developers have their own changes at the head of the branch, merging them creates a slight mess. Part of git's philosophy is to make sure the development history is clear and easy to understand. The next section discusses how to use branches to make development more robust and keep the project's history organized. The following is not the only way to use branches, nor is it "the" way. It is, however, a method the author uses that works well.

Branching

By default, git names the main branch of development "master". Released code (i.e. code to be used by other projects) is typically taken from the end of the master branch. Following this convention requires anything committed to master to be reasonably debugged and to compile cleanly. This may be impossible while debugging or implementing new features, so branches should be used to isolate unfinished work from the master branch.

For the following discussion, let's assume a project called "Project" exists on a remote machine and has three commits, A, B, and C, on the master branch:

Remote Repository:

    [A]-->[B]-->[C]
                 ^
                 master

Creating a Branch

When you clone a repository, unless otherwise specified, you'll be using the master branch.

$ git clone PATH_TO_PROJECT project
$ cd project

Your project's environment is now:

Remote Repository:

    [A]-->[B]-->[C]
                 ^
                 master

Local Repository:

    [A]-->[B]-->[C]
                 ^
                 master

Working Directory (on master branch):

    C

Note that the local repository matches the remote's and your working directory matches the state of the project that was last committed.

You want to do some development on this version of the code without affecting the master branch until you're through, so you create a branch. You can name it anything you wish but in this example we'll call it devel:

$ git checkout -b devel

Now your project's environment has slightly changed:

Remote Repository:

    [A]-->[B]-->[C]
                 ^
                 master

Local Repository:

    [A]-->[B]-->[C]
                 ^
                 master
                 devel

Working Directory (on devel branch):

    C

A new branch called devel is created which refers to the same changeset as master and your working directory is now tracking it. But now we can commit changes to the devel branch without affecting master. Assume we've made changes to the files in the project. The source code is in a new, unique state which we'll call "D". If these changes are committed, the new environment becomes:

Remote Repository:

    [A]-->[B]-->[C]
                 ^
                 master

Local Repository:

    [A]-->[B]-->[C]-->[D]
                 ^     ^
                 |     devel
                 master

Working Directory (on devel branch):

    D

Note that master hasn't changed. Also remember that you don't want to push your devel branch back to the remote server. devel is your local development and, once you're done with your changes (and you tested them!), you'll merge them back into master and push to the remote server.
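The diagrams above can be checked in a scratch repository. This sketch assumes a single file, state.txt, whose content stands for the state of the project, and commit messages of the form "State X"; both are inventions for the example.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
git init --quiet "$sandbox/project"
cd "$sandbox/project"
git symbolic-ref HEAD refs/heads/master

# Commits A, B, and C on master.
for s in A B C; do
    echo "$s" > state.txt
    git add state.txt
    git commit --quiet -m "State $s"
done

# Create devel at C and switch to it, then commit state D there.
git checkout --quiet -b devel
echo "D" > state.txt
git add state.txt
git commit --quiet -m "State D"

master_msg=$(git log -1 --format=%s master)
devel_msg=$(git log -1 --format=%s devel)
echo "master -> $master_msg"
echo "devel  -> $devel_msg"
```

master still refers to C while devel has moved on to D, matching the last diagram.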

Switching Branches

You can see what branches are available in your local repository:

$ git branch -v
  master 0123456 Commit message for C
* devel  6543210 Commit message for D

The output shows the names of the branches (the asterisk shows which branch is currently checked out in the working directory). The next field is the commit hash, followed by the commit message at the head of the branch.

Use the checkout command to switch between branches. Using the example environment from the previous section,

$ git checkout master

yields:

Remote Repository:

    [A]-->[B]-->[C]
                 ^
                 master

Local Repository:

    [A]-->[B]-->[C]-->[D]
                 ^     ^
                 |     devel
                 master

Working Directory (on master branch):

    C

Switching back to devel:

$ git checkout devel

yields:

Remote Repository:

    [A]-->[B]-->[C]
                 ^
                 master

Local Repository:

    [A]-->[B]-->[C]-->[D]
                 ^     ^
                 |     devel
                 master

Working Directory (on devel branch):

    D

Note that switching branches doesn't affect the repositories. It only affects the contents of the working directory and which branch the working directory is tracking.
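The same point can be sketched in a scratch repository: the file content follows the branch, while the branches themselves stay put. The state.txt file and the variable names are assumptions made for the example.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
git init --quiet "$sandbox/project"
cd "$sandbox/project"
git symbolic-ref HEAD refs/heads/master

for s in A B C; do
    echo "$s" > state.txt
    git add state.txt
    git commit --quiet -m "State $s"
done
git checkout --quiet -b devel
echo "D" > state.txt
git add state.txt
git commit --quiet -m "State D"

devel_before=$(git rev-parse devel)

# Switching branches rewrites the working directory...
git checkout --quiet master
on_master=$(cat state.txt)
git checkout --quiet devel
on_devel=$(cat state.txt)

# ...but the branch itself still refers to the same commit.
devel_after=$(git rev-parse devel)
echo "on master: $on_master, on devel: $on_devel"
```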

Merging Branches

Let's assume you've made a few more commits, states E and F, and you're ready to merge them back into the main repository. At this point, the project environment looks like this:

Remote Repository:

    [A]-->[B]-->[C]
                 ^
                 master

Local Repository:

    [A]-->[B]-->[C]-->[D]-->[E]-->[F]
                 ^                 ^
                 |                 devel
                 master

Working Directory (on devel branch):

    F

To merge it back properly, we switch to the master branch, make sure no one has added to it, merge devel, and push master back to the remote repository. Note that in the above diagram the remote repository matches the local master branch. Later, we'll cover the case where someone has pushed commits to the remote repository in the meantime.

First, switch to the master branch:

$ git checkout master

Make sure no one added to master:

$ git pull
Already up-to-date.

The "Already up-to-date." message means you're in sync! This is the best situation to have since it means everything else will go smoothly. It's safe to merge and push the changes in devel.

$ git merge devel

yields:

Remote Repository:

    [A]-->[B]-->[C]
                 ^
                 master

Local Repository:

    [A]-->[B]-->[C]-->[D]-->[E]-->[F]
                                   ^
                                   devel
                                   master

Working Directory (on master branch):

    F

By the way, in the output generated by the merge, you may see the term "Fast-forward". This means git was able to simply move the head of the branch to a new commit position without performing any diff calculations. Fast-forward merges are the cleanest merges since they don't generate any new artifacts in the project's history.
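A fast-forward can be confirmed by counting commits before trusting the diagram. In this sketch the state.txt file and the commit messages are made up, and --ff-only is my addition: it makes git refuse the merge if anything other than a fast-forward would be needed.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
git init --quiet "$sandbox/project"
cd "$sandbox/project"
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: record one state of the project as a commit.
commit_state() { echo "$1" > state.txt; git add state.txt; git commit --quiet -m "State $1"; }

for s in A B C; do commit_state "$s"; done
git checkout --quiet -b devel
for s in D E F; do commit_state "$s"; done

# Merge devel into master; master simply moves forward to F.
git checkout --quiet master
git merge --quiet --ff-only devel

total=$(git rev-list --count master)
merges=$(git rev-list --count --merges master)
echo "commits on master: $total, merge commits: $merges"
```

Six commits and no merge commit: the history is a straight line from A to F.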

Now we can push the master branch to the remote repository:

$ git push

yielding:

Remote Repository:

    [A]-->[B]-->[C]-->[D]-->[E]-->[F]
                                   ^
                                   master

Local Repository:

    [A]-->[B]-->[C]-->[D]-->[E]-->[F]
                                   ^
                                   devel
                                   master

Working Directory (on master branch):

    F

Congratulations! Your changes are incorporated in the project and are available to other developers! Also, the project's history is easy to follow [1].

Remote Changes

At Fermilab, most projects are worked on by one person at a time which means the previous section's example is what normally happens. However, if you are working on an active project that has multiple developers, you'll occasionally have to pull in their changes before pushing yours. Let's reset the environment back to the previous section's starting point, except let's assume the remote repository has been committed to while we were developing our code:

Remote Repository:

    [A]-->[B]-->[C]-->[G]-->[H]
                             ^
                             master

Local Repository:

    [A]-->[B]-->[C]-->[D]-->[E]-->[F]
                 ^                 ^
                 |                 devel
                 master

Working Directory (on devel branch):

    F

When we switch to the master branch and pull changes from the remote repository, instead of getting "Already up-to-date.", you see output showing which files have been updated (due to the remote changes.) The environment has now diverged:

Remote Repository:

    [A]-->[B]-->[C]-->[G]-->[H]
                             ^
                             master

Local Repository:

                   +->[G]-->[H]
                  /          ^
    [A]-->[B]-->[C]          master
                  \
                   +->[D]-->[E]-->[F]
                                   ^
                                   devel

Working Directory (on master branch):

    H

The master branch matches the remote's history, which is good. However, devel is based on state C and doesn't incorporate the changes in G or H. Before we can confidently merge our changes, we need to make sure G and H didn't break our code (because G and H are on master, we have to assume H is the new, stable version of our project). git has a powerful command, rebase, which helps in this situation. Go back to the devel branch and tell git to rebase our branch onto the new head of master:

$ git checkout devel
$ git rebase master

If there were no conflicts, the new environment state is:

Remote Repository:

    [A]-->[B]-->[C]-->[G]-->[H]
                             ^
                             master

Local Repository:

    [A]-->[B]-->[C]-->[G]-->[H]-->[D']-->[E']-->[F']
                             ^                   ^
                             |                   devel
                             master

Working Directory (on devel branch):

    F'

The rebase command saves all the commits on devel from where devel and master split and then applies them, one by one, to the new head of master. The diagram shows the applied commits as D', E', and F' because they may not be the exact same patches as before since they may be applied to a file changed by G or H. If a conflict occurs, git will stop to let you resolve it. Once resolved, use "git add file" to add the file and then "git rebase --continue" to apply the rest of the commits.
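The whole scenario can be replayed locally. This sketch assumes a bare repository in a temporary directory as the remote and a second clone, called colleague here, that pushes G and H. The file names and the commit_state helper are inventions, and the two sides touch different files so that no conflict occurs.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/remote.git"
git clone --quiet "$sandbox/remote.git" "$sandbox/work" 2>/dev/null
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: commit state $1 by appending it to file $2.
commit_state() { echo "$1" >> "$2"; git add "$2"; git commit --quiet -m "$1"; }

for s in A B C; do commit_state "$s" shared.txt; done
git push --quiet -u origin master

# Local development on devel: D, E, F.
git checkout --quiet -b devel
for s in D E F; do commit_state "$s" feature.txt; done

# Meanwhile, a colleague pushes G and H to the remote master.
git clone --quiet --branch master "$sandbox/remote.git" "$sandbox/colleague"
( cd "$sandbox/colleague"
  for s in G H; do commit_state "$s" other.txt; done
  git push --quiet origin master )

# Bring master up to date, then replay devel on top of it.
git checkout --quiet master
git pull --quiet
git checkout --quiet devel
git rebase --quiet master

# Commit messages from oldest to newest, joined into one string.
order=$(git log --reverse --format=%s devel | tr -d '\n')
echo "history of devel: $order"
```

The history of devel now reads A, B, C, G, H, D, E, F, with the last three being the re-applied commits D', E', and F'.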

At this point, you should re-test your changes because the other developers' commits may have affected the features you added. This may require you to commit more on the devel branch. Once you've tested the code again, start over at the top of this section: switch to master and pull from remote; if nothing new arrived, merge devel and push; if there were new changes, rebase devel again and re-test. Repeat, ad nauseam.

Tagging

git supports tagging a state of the project with a label. The command is

$ git tag TAG ITEM

where ITEM can be a commit hash, a branch name, or another tag. Using the last example's environment, if we wanted to tag state C of the project with the label "v1.0", we would find the hash of state C using "git log" and then tag it with

$ git tag v1.0 STATE_C_HASH

resulting in:

Remote Repository:

    [A]-->[B]-->[C]-->[G]-->[H]
                             ^
                             master

Local Repository:

    [A]-->[B]-->[C]-->[G]-->[H]-->[D']-->[E']-->[F']
                 ^           ^                   ^
                 v1.0        |                   devel
                             master

Working Directory (on devel branch):

    F'

To push the tag to the remote repo:

$ git push --tags

Branches and tags are very similar in git; they both mark a state in the history of a project. The big difference is that when your working directory is attached to a branch and you commit, the branch is updated to refer to the new commit, whereas a tag keeps referring to the commit it was given.
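The difference shows up as soon as another commit is made. This is a small sketch in a scratch repository; the state.txt file and the commit messages are assumptions for the example.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption for this sketch).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
git init --quiet "$sandbox/project"
cd "$sandbox/project"
git symbolic-ref HEAD refs/heads/master

for s in A B C; do
    echo "$s" > state.txt
    git add state.txt
    git commit --quiet -m "State $s"
done

# Tag state C by its commit hash.
git tag v1.0 "$(git rev-parse HEAD)"

# One more commit on master: the branch moves, the tag does not.
echo "D" > state.txt
git add state.txt
git commit --quiet -m "State D"

tag_state=$(git log -1 --format=%s v1.0)
branch_state=$(git log -1 --format=%s master)
echo "v1.0 -> $tag_state, master -> $branch_state"
```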


[1] One might argue that the master branch contains the "broken" states of D and E. It is possible, using git, to merge all the changes in devel into one commit on the master branch so no one could gain access to the D and E states. This article doesn't cover that usage.