cat /dev/brain

Git(hub) Basics

After encountering some problems working on github3.py I stopped by #github on Freenode to see if anyone else had come across the same problems. While there, I noticed a lot of people asking some basic questions about how to do things with respect to git and Github. They seem like they would be fairly common questions that could be answered by simply visiting Github's help pages but I figured I may as well touch on a few of them anyway. Sometimes having a second explanation is worthwhile.

What are the first few things I should do?

Any basic git primer will tell you that once you have files you want to start tracking, you begin with:

git init

which initializes git [1]. Essentially this sets up the basic .git/ directory and sub-directories. For purposes of this post, I set up a directory ~/sandbox/test/:

~/sandbox$ mkdir test ; cd test
~/sandbox/test$ ls -a
./  ../
~/sandbox/test$ git init
Initialized empty Git repository in /home/icordasc/sandbox/test/.git/
~/sandbox/test$ ls -a
./  ../  .git/
~/sandbox/test$ ls -a .git/
./  ../  HEAD  branches/  config  description  hooks/  info/  objects/  refs/

Next you would add the files you wish to be tracked with:

git add

Let's first get some files to work with:

~/sandbox/test$ echo 'Foo' >> foo.txt
~/sandbox/test$ echo 'Bar' >> bar.txt
~/sandbox/test$ echo 'Baz' >> baz.txt

If you want to track all three of them then you simply run:

~/sandbox/test$ git add .

This will add everything in the current directory (also known as . or ./). If you don't want to add everything, don't despair. Simply name the files you do want:

~/sandbox/test$ git add foo.txt bar.txt

Next you want to write the first commit message [2]:

~/sandbox/test$ git commit [-m 'Initial Commit. Start project test.']

If you would rather edit the commit message in an editor, you simply exclude the part in brackets. Once you've written your commit you can go to Github, and create a new repository for yourself. Name it whatever you please, give it a description, home page, and whatever else you desire. You do not want to initialize the repository with a README at this point in time. From now on, I'll use USER to denote your username and REPO to denote the name you gave to the repository upon creating it. Next you'll run:

~/sandbox/test$ git remote add origin git@github.com:USER/REPO
~/sandbox/test$ git config branch.master.remote origin
~/sandbox/test$ git config branch.master.merge refs/heads/master

The first line tells git that you're adding a remote repository (i.e. a repository that is not on your computer). origin is the name you gave to your remote. (origin is the default name when you clone a repository.) git@github.com:USER/REPO is the ssh url to your repository [3]. At this point your git repository just knows where its remote should be. It has not communicated with it yet. Before you do that, you should make sure that your master branch [4] will use origin as its remote and that when you need to perform a merge on master, you know what references from the remote you'll want to use for the merge.
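To see that these three commands only record settings and never contact Github, here is a minimal sketch you can run in a scratch directory. The temporary directory is my own assumption for the sake of the example, and USER/REPO are still placeholders, so nothing here would work over the network anyway.

```shell
#!/bin/sh
set -e

# Assumption: a throwaway directory instead of ~/sandbox/test.
work=$(mktemp -d)
git init -q "$work/test"
cd "$work/test"

# The same three commands as above. None of them talks to the remote.
git remote add origin git@github.com:USER/REPO
git config branch.master.remote origin
git config branch.master.merge refs/heads/master

# Inspect what was recorded in the repository's own config file.
git remote -v
cat .git/config
```

The output of cat should show a [remote "origin"] section holding the url and a [branch "master"] section holding the remote and merge settings.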

Finally, the last thing you'll do to get your repository onto Github is:

~/sandbox/test$ git push origin master:master

This syntax is important to note now. You're telling git to push to origin and to use the local branch master to create (or update) a branch called master on the remote. If in the future you've created some other branch, e.g., testbranch, and you have it on your remote but would like to delete it, you simply run:

~/sandbox/test$ git push origin :testbranch

The empty source before the colon tells the remote to delete that branch.
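If you would like to try the push and delete syntax without touching Github, here is a rough sketch. It assumes a local bare repository standing in for git@github.com:USER/REPO, plus a placeholder name and email for the commit. The delete uses the form with an empty source, :testbranch.

```shell
#!/bin/sh
set -e

# Assumption: a local bare repository plays the part of the Github remote.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/test"
cd "$work/test"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"       # placeholder identity for the commit
git config user.email "user@example.com"

echo 'Foo' > foo.txt
git add foo.txt
git commit -q -m 'Initial Commit. Start project test.'

git remote add origin "$work/remote.git"
git push -q origin master:master          # local master -> master on the remote
git push -q origin master:testbranch      # local master -> testbranch on the remote
git ls-remote --heads origin              # lists both branches

git push -q origin :testbranch            # empty source: delete testbranch on the remote
git ls-remote --heads origin              # lists the branches that remain
```

The second ls-remote should list only refs/heads/master.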

Forking a repository on Github

So let's say you've forked a repository on Github. If you use that fork as your main repository, you will not automatically get changes from the original repository. You have to retrieve those yourself. Do not worry, this will be a lot simpler than setting up your git repository.

Let us say that you forked my repository at sigmavirus24/github3.py and your fork is test/github3.py. I'll assume that you simply ran:

~$ git clone git@github.com:test/github3.py

on your computer, so that the default remote origin will be set to your fork. To get the changes from sigmavirus24/github3.py, you first want to add a new remote. You can have as many remotes for a repository as you desire; you just need to ensure that each one has a unique name. To follow common practice, we'll add my remote to your repository like so:

~$ git remote add upstream git://github.com/sigmavirus24/github3.py

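Why do I say "common practice"? It is common to call the "official" repository "upstream". It is also common to use a read-only link when you are not a direct contributor to a project. (Github has since turned off the git:// protocol, so today the https URL, https://github.com/sigmavirus24/github3.py, fills that role. The ssh URL git@github.com:sigmavirus24/github3.py would let you fetch if you have an ssh key on record with Github, but pushing to it would fail since you do not have write access to the repository.) Next you'll want to update your copy of upstream's refs. Simply: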

~$ git fetch upstream

And finally, assuming you want to merge branch master from upstream you can simply do:

~$ git merge upstream/master [--no-commit]

If there are no merge conflicts you should not need --no-commit. Assuming you're going to be contributing back to the project, not having merge commits in your git history will make for cleaner pull requests.
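The whole fork, fetch, and merge cycle can be rehearsed offline. The sketch below is only an approximation: it assumes two local bare repositories standing in for sigmavirus24/github3.py and test/github3.py, a hypothetical seed directory used to create upstream commits, and a placeholder identity.

```shell
#!/bin/sh
set -e

# Assumption: local bare repositories stand in for the two Github repositories.
work=$(mktemp -d)
git init -q --bare "$work/upstream.git"   # plays sigmavirus24/github3.py
git --git-dir="$work/upstream.git" symbolic-ref HEAD refs/heads/master

# Hypothetical "seed" checkout used to put commits into upstream.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo 'v1' > README
git add README
git commit -q -m 'First upstream commit'
git push -q "$work/upstream.git" master:master

# "Fork" upstream, then clone the fork as you would from Github.
git clone -q --bare "$work/upstream.git" "$work/fork.git"   # plays test/github3.py
git clone -q "$work/fork.git" "$work/clone"

# Upstream moves on after the fork was made.
echo 'v2' > README
git commit -q -a -m 'Second upstream commit'
git push -q "$work/upstream.git" master:master

# In your clone: add the upstream remote, fetch its refs, and merge.
cd "$work/clone"
git remote add upstream "$work/upstream.git"
git fetch -q upstream
git merge -q upstream/master              # fast-forwards here, so no merge commit
git log --oneline
```

Because the clone has no commits of its own, the merge is a fast-forward and the log shows just the two upstream commits.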

Pull Request by Hand

First you need to add a remote for the repository the pull request is coming from. It's the same syntax as in Forking a repository on Github. You also fetch the upstream refs but you do not merge unless you're merging everything.

If there are only a few commits from a fork of your project or in a pull request that you want, you can cherry pick them:

~$ git cherry-pick <commit>

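You can specify one commit at a time or you can use a range. A two-dot range <commit-0>..<commit-n> leaves out <commit-0> itself, so add ^ to start from its parent and include it:

~$ git cherry-pick <commit-0>^..<commit-n>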

You have to be on the branch that you want to pull those commits onto, e.g., to pull commits onto master you should be on branch master in the first place.
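Here is a small sketch of cherry-picking a range, under a few assumptions of my own: a scratch repository, a hypothetical branch named feature playing the part of the fork or pull request, and a placeholder identity. It picks the last two of three commits, using the two-dot form where the commit on the left is excluded.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo 'base' > base.txt
git add base.txt
git commit -q -m 'Base commit'

# Hypothetical branch "feature" with three commits, one file each.
git checkout -q -b feature
for name in one two three; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "Add $name"
done

# Be on the branch you want the commits to land on.
git checkout -q master

# feature~2 is "Add one"; the range excludes it and picks "Add two" and "Add three".
git cherry-pick feature~2..feature
git log --oneline
```

Afterwards master contains two.txt and three.txt but not one.txt. To take all three you could write feature~2^..feature instead.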

Config & Aliases

Config

There is a git configuration file for each repository as well as a global config file for your user. I showed you where you can find the repository config in What are the first few things I should do?, i.e., ~/sandbox/test/.git/config. The global config file, however, is usually ~/.gitconfig. Here you can define your default user name and email address as well as aliases.

Aliases

An alias lets you use a different (usually shorter) name for a git command. For example, my three go-to aliases are defined in my global config as:

[alias]
    co = checkout
    cp = cherry-pick
    stat = status

So I can now use:

~/sandbox/todo.py$ git co master

Instead of:

~/sandbox/todo.py$ git checkout master

Personally, I like typing fewer keys and the more efficient I can git, the more time I have to spend on other things, like development.
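You do not have to edit the config file by hand, since git config can write the aliases for you. The sketch below sets them in a scratch repository's own config so that it leaves your ~/.gitconfig alone; that choice is mine, purely to keep the example harmless. Adding --global to each git config call would put them in your global config instead.

```shell
#!/bin/sh
set -e

# Assumption: a throwaway repository, so your real config is untouched.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Repository-local aliases; add --global to write to ~/.gitconfig instead.
git config alias.co checkout
git config alias.cp cherry-pick
git config alias.stat status

git config --get-regexp '^alias\.'        # list every alias git can see
git stat                                  # runs "git status"
```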

[1]You also may want to set up your username/email address combination now with git config user.name "username to use" and git config user.email "email@example.com".
[2]Typically commit messages start with one line maxing out around 50 characters. After that, you skip a line and then write as much as you want with line lengths preferably < 80 characters.
[3]I may be assuming too much here. To use ssh with Github you have to have a public ssh key that you give them so they know you are who you say you are. If they don't have one on record, you cannot use the ssh url.
[4]The master branch is the default branch created by git.