If you use a computer, you’ve probably been faced with the issue of different versions of files. Whether it’s a matter of text files, data files, or program code you’ve been writing, you’ll always be faced with the decision of whether to save and replace the previous version or save in addition to the previous version.
For single developers, it’s already a challenge to keep track of different versions of a program, not to mention all the different backups you might make.
Now imagine you’re working as part of a team on a project that involves text, multimedia, and coding – in short, a wide variety of information in the works.
Imagine further: say each data or program file is worked on by several people (“concurrency”) — each person adding, saving, editing, and saving again. How on earth will you know who did what, when, and why, and be able to track all of these changes?
You need version control (VC).
In this article, we’ll look at what version control is and how it has evolved over the years leading up to the latest generation.
In particular, we’ll look at Git, the increasingly popular version control application created by Linus Torvalds, the original author of Linux, and see some examples of how you would use Git to regain control of all those different revisions of your files.
A Primer for VC Jargon
Version control has its own language. Once you have specified which directories or groups of files should have their changes tracked, that collection is known as a repository, or repo.
Changes are tracked automatically, but they are only recorded when you group them into a single collection of actions, called a commit. Each commit is stored as a changeset with a unique revision identifier, so you can call up the latest version of a file or any earlier one.
If you want to compare two revisions (for example, if a bug crept into your code at some point), the version control tool should let you diff two files, meaning see the differences between the two.
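As a rough illustration of comparing two revisions, here is a minimal sketch using Git in a throwaway repository. The temporary directory, the user details, and the contents of main.c are placeholders invented for this example.

```shell
set -e                                   # stop at the first failing command
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the commits
git config user.email "user@example.com"

# First revision of the file.
printf 'int main(void) { return 0; }\n' > main.c
git add main.c
git commit -q -m "first revision"

# Second revision: the return value changes.
printf 'int main(void) { return 1; }\n' > main.c
git commit -q -a -m "second revision"

# Compare the previous revision (HEAD~1) with the latest one (HEAD).
git diff HEAD~1 HEAD -- main.c
```

In the diff output, lines starting with a minus sign come from the older revision and lines starting with a plus sign come from the newer one.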
To experiment with a repo without risking problems or damage, you can create a branch, meaning a copy of the repo that you can then modify in parallel. If the changes in the branch are satisfactory, you can then merge the branch into the main line of the repo (master), or even into another branch.
When merging, modern version control systems are usually smart enough to figure out which changes should be included from which branch or repo, according to the change history maintained for each one. If a version control system can’t decide, then you may have to manually resolve a conflict.
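To make the idea of a conflict concrete, the following sketch provokes one on purpose in Git and then resolves it by hand. The branch name experiment, the file notes.txt, and the user details are all hypothetical; the point is only that two branches change the same line in different ways.

```shell
set -e                                    # stop at the first failing command
repo=$(mktemp -d)                         # hypothetical scratch directory
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch "master" whatever the local default is
git config user.name "Example User"       # placeholder identity for the commits
git config user.email "user@example.com"

echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "initial version"

# Change the line on a branch...
git checkout -q -b experiment
echo "version from experiment" > notes.txt
git commit -q -a -m "edit on experiment"

# ...and change the same line differently on master.
git checkout -q master
echo "version from master" > notes.txt
git commit -q -a -m "edit on master"

# Git cannot decide which edit wins, so the merge stops with a conflict.
if git merge experiment; then
    echo "merged cleanly"
else
    git status --short                    # lists notes.txt as unmerged
    # Resolve manually: write the agreed content, then stage and commit it.
    echo "version agreed by both" > notes.txt
    git add notes.txt
    git commit -q -m "merge experiment, resolving conflict in notes.txt"
fi
```

Had the two branches touched different files or different lines, the same merge command would normally have completed without asking for help.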
Version Control Systems Evolution
Version control tools have appeared in three generations so far, each generation adding flexibility and possibilities for concurrency.
With first-generation version control systems, multiple people could work on the same file, but not simultaneously. The file was locked to prevent others from accessing it at the same time.
An example of such a tool is SCCS (Source Code Control System) for software development from 1972 onward. RCS (Revision Control System) was created as the free alternative to SCCS and offered faster operation, branches, and merging (still only permitting one developer to work on a file at a given time).
Many version control systems in operation today belong to the second generation. Simultaneous modifications to files are possible, although users must merge the current revisions into their work before they can commit.
CVS (Concurrent Versions System) is one example; it allows client/server interaction through a shared repository. SVN (or Apache Subversion in full) is possibly the most popular of all version control systems in use today.
SVN can be thought of as a redesign of CVS with a modern foundation and solutions to former CVS limitations.
The third generation is also known as DVCS (Distributed Version Control Systems). These systems make it possible to separate merge and commit operations; one of the best-known examples is Git.
There is no longer a single centralized base for files; each copy of the repository holds its own history and branches, which opens the door to working on revisions offline as well.
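The sketch below gives a feel for the distributed model using Git on a single machine. A bare repository in a temporary directory stands in for a shared server, and the names alice, bob, and shared.git are made up for the example. Committing and publishing are deliberately shown as two separate steps.

```shell
set -e                                    # stop at the first failing command
work=$(mktemp -d)                         # hypothetical scratch directory
cd "$work"

# A bare repository plays the role of the shared server.
git init -q --bare shared.git

# Each developer clones it and gets a complete repository of their own.
git clone -q shared.git alice
cd alice
git config user.name "Alice Example"      # placeholder identity for the commits
git config user.email "alice@example.com"

# The commit is recorded locally; no connection to the server is needed.
echo "hello" > README
git add README
git commit -q -m "add README"

# Publishing the work is a separate, later step.
git push -q origin HEAD

# A second clone receives the full history, including Alice's commit.
cd "$work"
git clone -q shared.git bob
git -C bob log --oneline
```

Between the commit and the push, Alice could have been entirely offline; only the push and the clone operations talk to the shared repository.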
A Real Example Using Git
How do the operations described above look when using a real-life version control system?
We take Git as an example here, using the Linux command line. First, we create a Git repository for the directory we are currently in. We use the pwd command to see where we are:
$ pwd
/Users/HJ/Desktop/repos/apps
Then we use the git init command to create the repository (whose initial branch is “master” here) and get confirmation back from Git:
$ git init
Initialized empty Git repository in /Users/HJ/Desktop/repos/apps/.git
Suppose we add a new file, main.c, to our working directory. Using the git status command will give us the following information:
$ git status
# On branch master
# Initial commit
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#	main.c
We use git add to track the file main.c:
$ git add main.c
We use git commit with a message (-m option) describing what we’re doing to commit the changes to main.c:
$ git commit -m "adding main.c to repository"
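To check what a commit actually recorded, git log can be used. The sketch below repeats the steps so far in a throwaway directory (the temporary path, the empty main.c, and the user details are placeholders) and then prints the history.

```shell
set -e                                    # stop at the first failing command
repo=$(mktemp -d)                         # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity for the commit
git config user.email "user@example.com"

touch main.c                              # an empty stand-in for the real source file
git add main.c
git commit -q -m "adding main.c to repository"

# One line per commit: abbreviated identifier followed by the message.
git log --oneline

# The full identifier of the latest commit.
git rev-parse HEAD
```

Git identifies each commit by a hash rather than a sequential number, so the identifier printed on your machine will differ from anyone else's.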
Now we can create a branch (e.g., “test”) with the git branch command:
$ git branch test
Using the git branch command again on its own simply lists the branches we now have:
$ git branch
* master
  test
Finally, to start working in the “test” branch on the copy of main.c now in that branch, we use the git checkout command. Git confirms that we are now working in the “test” branch:
$ git checkout test
Switched to branch "test"
To get back to the “master” branch, simply use the git checkout command again:
$ git checkout master
Switched to branch "master"
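To round off the walk-through, here is one possible way to bring work from the “test” branch back into “master”. It is a self-contained sketch run in a throwaway directory; the contents of main.c, the commit messages, and the user details are invented for the example.

```shell
set -e                                    # stop at the first failing command
repo=$(mktemp -d)                         # hypothetical scratch directory
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch "master" whatever the local default is
git config user.name "Example User"       # placeholder identity for the commits
git config user.email "user@example.com"

echo 'int main(void) { return 0; }' > main.c
git add main.c
git commit -q -m "adding main.c to repository"
git branch test

# Make and commit a change on the test branch.
git checkout -q test
echo '/* experimental change */' >> main.c
git commit -q -a -m "experiment in test branch"

# Switch back and merge the branch into master.
git checkout -q master
git merge -q test

# master now contains the change made on test.
cat main.c
```

Because “master” did not change while we worked on “test”, Git can simply move “master” forward to the newer commit, and no conflict resolution is needed.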