/
Launch Online IDE

Discover Git

A distributed version-control system


Reading Time: 15 min

Using Git for version control is an essential element of developing on Corda. This introduction to Git includes tips and best practice that will be useful to beginners and occasional Git users alike.

In this section, you will get:

  • An overview of the concepts of Git.
  • An introduction to the vocabulary of Git.
  • A guide for using Git for the first time (and every time after that).

You may already be more than confident using Git, in which case, you may want to skip to the next section. You can always check back if you need to later.

Overview

Conventions like mass-emailing proposal-v2.1b-Final.doc are so 2001. What happens if multiple participants work on the same document at the same time? How are those changes merged into the final document? What about experimental branches that might make it to the final document, or might not? These are common problems that call for a comprehensive workflow and document versioning system. Git is such a system.

Git lets you work locally, completely disconnected from any remote server. It reconciles changes done by several people working independently, even within a single file and it includes a change-management scheme with versions and change history so previous versions of documents are always recoverable.

For these reasons, and more, it has gained considerable popularity in the software development community.

Vocabulary

A repository is a self-contained entity that contains all information, including the history. Where you work, typically with folders and files, is the working tree. All information necessary for Git to function properly is hidden away inside the .git sub-folder, which you should not touch unless you absolutely know what you are doing. Git is distributed, so one speaks of local and remote versions of a repository. Here is a remote repository of Corda. When you work on a project, you work on your local version. When data goes from local to remote, you say that you push, and from remote to local, you pull. If the data goes from remote to local, and remains hidden inside the .git folder, you call this a fetch. Sometimes, when people create a new copy of a repository, they say they forked it, which happens a lot for Corda; although forking is not a Git concept per se, rather a convention in the Git ecosystem. When you get an existing remote repository for the first time, you say you clone it.

The files that you see in your local repository are at a given version. Here is an example of a version of Corda. The version id is 452e78f531a3f16e63ed9aa62d6b165cdd4da2bf or 452e78f for short. This version id was calculated by SHA1-hashing the content of the whole working tree at this version, plus some metadata, and there is a pretty good chance that it is absolutely unique across all repositories of the world. See the reading list at the end for in-depth articles. A version has one or two parents, which is mentioned in the header 2 parents ea22a10 + 723399d. When a version has two parents, you call it a merge. In effect, the series of versions in a Git repository form a directed acyclic graph, a D.A.G.. Almost like the states and transactions in Bitcoin and Corda. You can see for instance here how all arrows are pointing to the right, with each dot symbolising a version. When you want to have a version in your working tree, you checkout (or switch to, if your Git is new) this version.

When you want to create a new version you first have to decide which of your changed files are going into this new version, you say you stage your changes, and then you commit them along with a text message. It is common to swap the word commit with version. Before staging, you can see what differences you made to your files compared with the last version. Git calls that a diff. When staging, if you use a convenient GUI tool, it can even help you select which lines to stage per file. And do not forget that all this staging and committing happens in your local repository. No remote is involved in these operations until you push. As an aside, this means that it is easy to uncommit and unstage, a.k.a. reset or restore, if you never pushed your new version.

By convention you make commits of files that make sense together, and / or commits that address a specific bug or feature. It is also a convention to describe in human language the purpose of the commit. Nothing good comes from making commits of 500 files with the message "Today's work". Another convention is to not commit files that can be fetched from elsewhere, like a library, or recreated, like a build folder. It is bad practice to commit private keys and other sensitive information. In order to achieve that with less effort, you are encouraged to populate the .gitignore file that lists the files to... ignore.

These versions / commits are all nice and dandy but their ids are not very friendly. So it is possible to tag selected ones in order to give them a human-friendly name like release-os-4.3 of Corda, which is the pretty name of id d679784.

Now, the reality of development is that it will take a number of versions with your teammates, or even yourself alone, before a bug is considered fixed, or a new feature implemented. Git provides such a logical pathway, named branch. For instance, here is the branch that hosts the evolution of Corda to version 4.3. The head commit of a branch advances commit after commit. The repository always has a default branch, most likely named master, although, for instance, at the time of writing, it is release/os/4.5 for Corda. A typical workflow is to have a bug fixed in a separate branch. Then, when it is ready to be merged into master, the developer opens a merge request, also called a pull request; here again, this is a convention in the Git ecosystem. On the other hand, if a branch cannot wait for what can be a lengthy process, and some changes are required from one branch to another, it is possible to cherry-pick them by specifying which commits are of interest; with the caveat that it replays the commit's diff on your branch, in effect creating a new commit.

Since using branches is very convenient, it may happen that you go back and forth between branches, even before you have changes that are sufficiently complete to stage and commit them. In these situations, you can stash away your unstaged changes, give these changes names, and pop them at a later date.

Just in case, this depiction of Git appears too rosy, rest assured that you will hit snags, the most typical ones being conflicts between your changes and someone else's. During these worrying times, or at any other times, you can always ask Git what is the status of your project. It happens to the best of us.

Use it

It is entirely possible that Git already comes installed with your computer. If not, head over to the installation guide. If you can, create a free account on Github and add an SSH key. It may seem daunting but it saves time later.

A bit of configuration. Let's do it now. In your command-line type:

$ git config --global user.name "John Doe"
$ git config --global user.email johndoe@example.com

Or whichever name and email you want to use. This will identify your commits, and, if your email is associated with an avatar on Gravatar, it will show on Github too.

With that work done, let's have fun with the command-line and the Corda samples repository.

  1. Go into your workspace, or whichever folder to which you want to download this repo:

    $ cd ~/Workspace
  2. Clone the repository and give it a name that won't make you scratch your head in 3 months:

    $ git clone git@github.com:corda/samples-java.git corda-samples-java

    Or, if you did not add an SSH key to your account:

    $ git clone https://github.com/corda/samples-java.git corda-samples-java
  3. Go into it:

    $ cd corda-samples-java
  4. Check the status

    $ git status
    On branch master
    Your branch is up to date with 'origin/master'.
    
    nothing to commit, working tree clean

    Right there you know that master is a branch and is currently the default one. Your working tree has nothing new. Also, origin is a remote, the one on Github. origin is the conventional name for a default remote.

  5. So what is this origin remote?

    $ git remote show origin
    * remote origin
      Fetch URL: https://github.com/corda/samples-java.git
      Push  URL: https://github.com/corda/samples-java.git
      HEAD branch: master
      Remote branches:
        devrel-1606            tracked
        master                 tracked
        oracle-primenumber_CCV tracked
      Local branch configured for 'git pull':
        master merges with remote master
      Local ref configured for 'git push':
        master pushes to master (up to date)

    All good, lots of information. Including, at the bottom that your local master branch is configured to push and pull from the remote branch of the same name. It is welcome, as it avoids confusion, but not necessary. You could well name your local branch something, and have it push to a different name on the remote.

  6. Check the version this branch is at

    $ git log | head -n 20
    commit 03fe800534ca2a684b0cb892a379a8c7eebbe259
    Merge: 4a6314f 89988e9
    Author: peterli-r3 <51169685+peterli-r3@users.noreply.github.com>
    Date:   Thu May 7 09:34:58 2020 -0400
    
        Merge pull request #2 from corda/oracle-primenumber_CCV
    
        oracle-primenumber: added checkCommandVisibility to oracle service
    
    commit 4a6314fd4e65087b556aa94d8696e9b391803a81
    Merge: a61e2cc cf21fb3
    Author: peterli-r3 <51169685+peterli-r3@users.noreply.github.com>
    Date:   Thu May 7 09:34:46 2020 -0400
    
        Merge pull request #3 from corda/training_material_readme
    
        README for training samples inserts
    
    commit cf21fb3dbf9972d61eb9b543e6fe79698f843110
    Author: Anthony Nixon <anixon604@gmail.com>

    You can see commit ids and messages, developer names and dates.

  7. If you want to checkout the previous version, so back by 1 commit, in your working tree:

    $ git checkout HEAD~1
    Note: checking out 'HEAD~1'.
    
    You are in 'detached HEAD' state. You can look around, make experimental
    changes and commit them, and you can discard any commits you make in this
    state without impacting any branches by performing another checkout.
    
    If you want to create a new branch to retain commits you create, you may
    do so (now or later) by using -b with the checkout command again. Example:
    
      git checkout -b <new-branch-name>
    
    HEAD is now at 4a6314f Merge pull request #3 from corda/training_material_readme

    Yep, 4a6314f is the second commit id you saw previously. You are detached, meaning that you are no longer tracking your master branch, and you are not going to update the branch either. That can be a problem if you start creating something.

  8. Find your place in the DAG of commits

    $ git log | head -n 20
    commit 4a6314fd4e65087b556aa94d8696e9b391803a81
    Merge: a61e2cc cf21fb3
    Author: peterli-r3 <51169685+peterli-r3@users.noreply.github.com>
    Date:   Thu May 7 09:34:46 2020 -0400
    
        Merge pull request #3 from corda/training_material_readme
    
        README for training samples inserts
    
    commit cf21fb3dbf9972d61eb9b543e6fe79698f843110
    Author: Anthony Nixon <anixon604@gmail.com>
    Date:   Thu May 7 00:50:34 2020 +0100
    
        README for training samples inserts
    
    commit a61e2cc9910d7d5de83122bf7d36fd071796a7c3
    Author: peterli-r3 <51169685+peterli-r3@users.noreply.github.com>
    Date:   Wed May 6 12:10:13 2020 -0400
    
        add cordapp-example/springserver

    You no longer know about the previous head at 03fe800. Of course, the commit was not deleted, it just is not in your working tree.

  9. Proof:

    $ git checkout 03fe800
    Previous HEAD position was 4a6314f Merge pull request #3 from corda/training_material_readme
    HEAD is now at 03fe800 Merge pull request #2 from corda/oracle-primenumber_CCV

    This commit id was there, hidden inside the .git folder, and you can call it back at any moment.

  10. Status?

    $ git status
    HEAD detached at 03fe800
    nothing to commit, working tree clean

    Admittedly, commit ids are not a friendly way to go about. And although you are at exactly the same commit as the current head of master branch, you are still detached. Do you want to "reattach"?

  11. What are your branches again?

    $ git branch --list
    * (HEAD detached at 03fe800)
      master
    (END)

    Press q to quit. Ok, this is my local branch. Thank you for letting me know about only the branches I previously worked with, but...

  12. Where did all those branches you saw at the beginning go?

    $ git branch --all
    * (HEAD detached at 03fe800)
      master
      remotes/origin/HEAD -> origin/master
      remotes/origin/devrel-1606
      remotes/origin/master
      remotes/origin/oracle-primenumber_CCV
    ...

    Phew!

  13. You can now get back on track:

    $ git checkout master
    Switched to branch 'master'
    Your branch is up to date with 'origin/master'.

    Now you can start working in your working tree.

  14. Let's pretend that appending a ; at the end of the file is a worthy contribution to an open-source project:

    $ echo ";" >> Basic/cordapp-example/contracts-java/src/main/java/com/example/state/IOUState.java
  15. And see where you are at:

    $ git status
    On branch master
    Your branch is up to date with 'origin/master'.
    
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git checkout -- <file>..." to discard changes in working directory)
    
        modified:   Basic/cordapp-example/contracts-java/src/main/java/com/example/state/IOUState.java
    
    no changes added to commit (use "git add" and/or "git commit -a")

    Some help is always visible, and the messages are often clear.

  16. What is your diff?

    $ git diff
    diff --git a/Basic/cordapp-example/contracts-java/src/main/java/com/example/state/IOUState.java b/Basic/cordapp-example/contracts-java/src/main/java/com/example/state/IOUState.java
    index 6b5af69..6acc240 100644
    --- a/Basic/cordapp-example/contracts-java/src/main/java/com/example/state/IOUState.java
    +++ b/Basic/cordapp-example/contracts-java/src/main/java/com/example/state/IOUState.java
    @@ -70,4 +70,4 @@ public class IOUState implements LinearState, QueryableState {
         public String toString() {
             return String.format("IOUState(value=%s, lender=%s, borrower=%s, linearId=%s)", value, lender, borrower, linearId);
         }
    -}
    \ No newline at end of file
    +};

    Not always very clear, but at the end the last line } is removed, that's -, and as a replacement, that's +, you have };. Exit with q.

  17. Let's stage it:

    $ git add Basic/cordapp-example/contracts-java/src/main/java/com/example/state/IOUState.java

    That's it?

  18. And your status?

    $ git status
    On branch master
    Your branch is up to date with 'origin/master'.
    
    Changes to be committed:
      (use "git reset HEAD <file>..." to unstage)
    
        modified:   Basic/cordapp-example/contracts-java/src/main/java/com/example/state/IOUState.java

    Clear enough.

  19. Time to commit:

    $ git commit -m "Added a semicolon at the end, you're welcome."
    [master bcd90e3] Added a semicolon at the end, you're welcome.
     1 file changed, 1 insertion(+), 1 deletion(-)

    Ok. An insertion is an added line, a deletion is a deleted line.

  20. Status please:

    $ git status
    On branch master
    Your branch is ahead of 'origin/master' by 1 commit.
      (use "git push" to publish your local commits)
    
    nothing to commit, working tree clean

    Do you want to push?

  21. Let's try:

    $ git push
    ERROR: Permission to corda/samples-java.git denied to YOUR_NAME.
    fatal: Could not read from remote repository.
    
    Please make sure you have the correct access rights
    and the repository exists.

    Of course, there are access rights to it. This is not your repository.

  22. But you know what? This commit was not very helpful, and since it never left your computer, let's revert it:

    $ git reset HEAD~1
    Unstaged changes after reset:
    M    Basic/cordapp-example/contracts-java/src/main/java/com/example/state/IOUState.java

    And you are back to before staging, with your extra semicolon.

  23. Let's revert this minor change.

    $ git checkout -- Basic/cordapp-example/contracts-java/src/main/java/com/example/state/IOUState.java

    That's it?

  24. And my status?

    $ git status
    On branch master
    Your branch is up to date with 'origin/master'.
    
    nothing to commit, working tree clean
  25. Do you want to start contributing to the corda/samples repository for real? Ok, let's do that. Click on the top right "Fork" button here, and pick which Github account to copy it to. It takes a second. Make a note of the repo it was copied to, when you click on the "Clone or download" button. It should be of the form git@github.com:YOUR_NAME/samples-java.git.

  26. Ok now, you have 2 remotes, the official one and yours. How do you differentiate them? The convention is to name your repo origin, and the official one upstream, so first, you add upstream:

    $ git remote add upstream git@github.com:corda/samples-java.git

    Or, if you did not add an SSH key:

    $ git remote add upstream https://github.com/corda/samples-java.git
  27. What remotes do you have?

    $ git remote show
    origin
    upstream

    Ok, no surprise there. Where do they point to?

    $ git remote show origin
    * remote origin
      Fetch URL: git@github.com:corda/samples-java.git
      Push  URL: git@github.com:corda/samples-java.git
    [...]
    $ git remote show upstream
    git remote show upstream
    * remote upstream
      Fetch URL: git@github.com:corda/samples-java.git
      Push  URL: git@github.com:corda/samples-java.git
    [...]

    Mmh, the same. Not very helpful.

  28. Let's update the URL for origin to point to your remote:

    $ git remote set-url origin git@github.com:YOUR_NAME/samples-java.git

    You can do another git remote show origin to make sure.

  29. With this set, let's make and push a change, the same silly one you did earlier:

    $ echo ";" >> Basic/cordapp-example/contracts-java/src/main/java/com/example/state/IOUState.java
    $ git add Basic/cordapp-example/contracts-java/src/main/java/com/example/state/IOUState.java
    $ git commit -m "Added a semicolon at the end, you're welcome."
    [master bcd90e3] Added a semicolon at the end, you're welcome.
     1 file changed, 2 insertions(+), 1 deletion(-)
    $ git push
    Enumerating objects: 21, done.
    Counting objects: 100% (21/21), done.
    Delta compression using up to 8 threads
    Compressing objects: 100% (6/6), done.
    Writing objects: 100% (11/11), 742 bytes | 742.00 KiB/s, done.
    Total 11 (delta 4), reused 0 (delta 0)
    remote: Resolving deltas: 100% (4/4), completed with 4 local objects.
    To github.com:YOUR_NAME/samples-java.git
       feca7a27..882d83a0  master -> master

    Ha! Done, it is in the origin remote. And if you go to it on Github, it will tell you that This branch is 1 commit ahead of corda:master. with a pull request and Compare buttons.

That will do for now. Install a Git GUI tool if you are not a command-line warrior, and remember that for all Git questions, Google is your friend.

Discuss on Slack
Rate this Page
Would you like to add a message?
Submit
Thank you for your Feedback!