Subversion Version Control System

The Cloudy project is housed in a Subversion repository. Wikipedia has a summary here. You must have Subversion installed on your computer to carry out the operations described below. The Subversion project home page, subversion.apache.org, has links to installation packages. Once installed, Subversion is accessed from the command prompt as svn.

There are add-on packages that can integrate Subversion with your development environment or file manager. TortoiseSVN is a very popular Windows Explorer add-on.

The Subversion repository - the dos and don'ts

The Cloudy repository consists of three main sections:

  • The "trunk" (also often referred to as the mainline). This is the bleeding edge version of Cloudy. We try to keep this version in a near-stable state to encourage people to actually use it and discover bugs that way. To ensure its integrity it comes with a test suite that is run every night. Test suite cases that break should be fixed quickly.
  • The "branches" directory. This contains any number of subdirectories, each representing one branch. Branches come in two flavors - release branches and development branches. These are very different things, but they have in common that development is done on both types. They will be explained in more detail below.
  • The "tags" directory. This contains snapshots of the code at any given time. It currently consists of four subdirectories, each containing a specific type of tag. The "release" subdirectory contains only tags of official releases, which are always created off a release branch. The "patch_versions" subdirectory contains patch updates to a release, which are also created off a release branch (see HotFixToDo for more details). The "stable" subdirectory contains only snapshots of stable versions of the trunk. The "develop" subdirectory contains everything else, e.g. release candidates or snapshots just before or after a major merge. They could be snapshots of the mainline or any of the branches. The fact that they are snapshots implies that no development is done on tags. Feel free to use them liberally as creating tags is "cheap" (they take up very little disk space).
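The layout described above can be sketched as a list of URLs. This is only an illustration: it assumes that the branches and tags directories sit next to the trunk under https://svn.nublado.org/cloudy, the root used in the commands further down this page.

```shell
# Assumed repository root, taken from the URLs used elsewhere on this page.
REPO=https://svn.nublado.org/cloudy

# The three sections of the repository, with the four tag subdirectories.
SECTIONS="trunk branches tags/release tags/patch_versions tags/stable tags/develop"

LAYOUT=""
for dir in $SECTIONS; do
    LAYOUT="${LAYOUT}${REPO}/${dir}
"
done
printf '%s' "$LAYOUT"

# To browse one of them (needs network access), something like:
#   svn list "$REPO/tags/release"
```

Running this only prints the six URLs; nothing is contacted until you pass one of them to svn yourself.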

Commit messages

Starting the message on commits to branches with the branch name allows them to be identified in the overall timeline.

Where a commit addresses a ticket, including the ticket number as '#123' in the commit message means that trac will automatically turn that number into a link to the ticket wherever the message is shown in the source browser. Likewise, when you close the ticket, your closing message should refer to the revision at which it was fixed, written as 'r123'.

The r123 notation also works in commit messages, of course, which is useful for tracking merges; r123:456 gives a range. See TracLinks for more details.
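As a small illustration of these conventions, the snippet below assembles a commit message that starts with the branch name and contains a ticket reference and a revision range. The branch name, ticket number and revision range are placeholders, not real Cloudy tickets or revisions.

```shell
# Hypothetical values for illustration only.
BRANCH=some-branch      # branch name goes first so the timeline shows it
TICKET=123              # trac ticket addressed by this commit
MERGED=123:456          # range of trunk revisions merged in, if any

MSG="${BRANCH}: fix for #${TICKET}; merged trunk r${MERGED}"
echo "$MSG"

# The message would then be passed to svn, e.g.:
#   svn commit -m "$MSG"
```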

Development branches

From the above it is clear that development is done either on the mainline or on one of the branches. Typically development on the mainline should be quick and non-disruptive, e.g. bug fixes or small improvements of the code. If a project requires extensive recoding which disrupts other work, it is better to create a development branch and do the work there. Once the project is finished, the work needs to be merged back onto the mainline. Past experience has shown that this can be painful if you are not properly prepared. Below you will find various guidelines that will help you keep the pain to a minimum. Exactly where the boundary lies between minor work that can be done directly on the mainline and major work that needs to be done on a development branch is open for discussion. This needs to be decided on a case-by-case basis. The same applies to the question of when and how often you merge a development branch back onto the mainline. After the project has finished, a development branch may be deleted.

Release Branches

Release branches are very different from development branches. At given intervals (typically once a year) the code will be prepared for a new release. The first step in this process is creating a release branch. It is not permitted to do any development on the release branch, i.e. no new functionality may be introduced there. Only bug fixes are permitted on a release branch (preferably after they have been tested on the mainline in case the bug exists there as well). This should ensure that the release branch remains stable and only gets better by fixing bugs. After the release branch has been created, it will be tested on various platforms using as many compilers as possible. Eventually one or more release candidates will be created that will be tested further until the code is deemed stable enough to be released. Both the release candidate(s) and the final release will be tags taken from the release branch. After the release, more bugs will usually be found, either by users or by the developers. These should be fixed on the mainline first (if the bug is present there as well) and then the bug fix should be merged to the release branch. Once sufficient bug fixes have accumulated, a new bug-fix roll-up can be released. This is another tag created from the release branch. This process can repeat itself several times until the time has come to prepare for the next major release and create another release branch. Once the next major release is out, there will usually not be any further bug fixing on previous release branches unless there are strong reasons to do so. However, a release branch should never be deleted.

Maintaining a development branch

The first step to create a development branch is to copy the Cloudy mainline to a branch:

svn copy https://svn.nublado.org/cloudy/trunk https://svn.nublado.org/cloudy/branches/some-branch

It is important to copy the entire Cloudy repository, including data files, test suites, and even documentation. That way you have a fully self-contained copy of Cloudy. Don't worry about bloating the repository that way. Copying a file doesn't take up any disk space until you actually modify the file. Now you can check out the new branch and start editing any files you deem necessary for your project. Submit changes to the branch at fairly regular intervals and give meaningful comments with the commit. This helps you (or other people) figure out what the purpose of the changes is when they later review the revisions. This may even be years later, so don't count on "remembering what you have done".
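A minimal sketch of the first steps on a new branch, assuming the branch is called some-branch as in the copy command above. The working copy name and the commit message are made up for illustration. The RUN variable defaults to echo, so the script only prints the commands unless RUN is set to an empty string.

```shell
# "echo" makes this a preview; run with RUN= (empty) to execute for real.
RUN=${RUN-echo}

REPO=https://svn.nublado.org/cloudy
BRANCH=some-branch          # placeholder branch name
WC=cloudy-$BRANCH           # hypothetical directory for the working copy

# Check out the whole branch: source, data files, test suite and docs.
$RUN svn checkout "$REPO/branches/$BRANCH" "$WC"

# ... edit files inside $WC, then commit with a meaningful message:
$RUN svn commit "$WC" -m "$BRANCH: describe what changed and why"
```

Note that svn copy between two URLs commits immediately, so it also expects a log message (for example through -m); without one, svn will open an editor to ask for it.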

One project, one branch. You should have a clear view of what should be done on the branch, and strictly limit the activity to that. Don't use the branch for a quick fix and then merge it to the mainline. If something can be fixed quickly, do it on the mainline directly and then merge the changes on the mainline into the branch. If it requires lots of work, create a separate branch. The only exception to this would be if the fix needs development done on this branch. In general - avoid two-way merging as much as possible. It will hopelessly confuse you and eventually subversion. So traffic should be only from the mainline to the branch, with one (or maybe a few) major merges from the branch to the mainline. If you merge to the mainline, merge everything, don't do partial merges. That keeps things clean and simple. Don't use the branch as a notepad, it will hate you for it...

While you are working on the branch, development on the mainline will continue and they will gradually start drifting apart. This creates problems when you are finally ready to submit your work to the mainline. If you simply copy the branch over, all the development on the mainline would be lost (including the history of that development). The recommended practice is to merge any changes on the mainline into the branch prior to merging the branch back into the mainline. It is best to do this in installments (at least if development on the branch takes more than a few weeks). Otherwise the complexity of the merge becomes overwhelming and you run an increased risk of losing work done either on the mainline or the branch.

Unfortunately there is (in general) no automatic way of doing a merge. If the work on the mainline only touches files that you are not working on and vice versa, then merging can be automated (and subversion actually does that for you). But if a file was modified on the mainline that you were working on as well, things can get complicated. If the changes were in distinct parts of the file, the merge can still be automated. But if this is not the case, subversion will declare a conflict and leave it up to you to resolve this conflict. This is not because subversion is stupid software, but because it is intrinsically impossible to come up with a reliable solution to the problem. Only the programmers can decide how incompatible changes to the code need to be resolved. See the subversion manual for a more detailed description of how to resolve conflicts. Make sure you understand this section well. Resolving conflicts incorrectly can easily introduce new bugs in the code!

These are the things to look out for when merging from the mainline to your branch. Never do partial merges that leave a gap with the previous merge. Say you are fully merged to 500, then don't do a merge of 550:600. If you later decide that you need the changes in 500:550 after all (and you will need them!), it will hopelessly confuse subversion and generate zillions of conflicts. Don't try to be too smart, just merge everything. That way you are certain that you have everything you need. There are two major exceptions to this rule. First, you want to give extra attention to merges that rename files. This is the deal: if a file you were working on gets renamed on the mainline, you will lose your changes if you do a simple merge to the branch (you will get a copy of the file on the mainline rather than a renamed version of your modified file). The second big exception is that you should skip intermediate merges from your branch to the mainline. So if you do an intermediate merge from your branch to the mainline, don't merge that back to your branch later on, for obvious reasons...
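One way to avoid leaving a gap is to write down the revision up to which the branch is merged and to check every new merge range against it. The helper below is a hypothetical sketch of that bookkeeping, using the revision numbers from the example above. It does not call svn at all, and it ignores the two exceptions just mentioned.

```shell
# check_range LAST_MERGED START
# Succeeds only if a merge starting at START continues exactly where the
# previous merge (up to LAST_MERGED) stopped, so that no revisions are skipped.
check_range() {
    last_merged=$1
    start=$2
    if [ "$start" -gt "$last_merged" ]; then
        echo "gap: r$last_merged:$start would be skipped"
        return 1
    elif [ "$start" -lt "$last_merged" ]; then
        echo "overlap: r$start:$last_merged was already merged"
        return 1
    fi
    echo "ok: merge -r$start:HEAD"
}

check_range 500 550 || true   # the bad example from the text
check_range 500 500           # continues where the last merge stopped
```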

One more piece of advice. Avoid changing things by hand. This should really only be an act of last resort. I don't think I ever needed it. The only manual thing I ever did was to delete the working copy and check out a fresh copy when the client got completely confused. You obviously have to manually edit conflicted files, but I never patched any files manually. It is good to avoid this because doing this increases the chance of overlooking certain changes. Machines are much more thorough at this than humans, they don't get tired...

Doing the actual merge

This is how I would do a merge:

1 - make sure your working copy of the branch is in sync with the repository, e.g. through an svn commit of your final changes, or an svn update or by checking out a fresh version of the branch.

2 - merge any development on the mainline into the branch (I will assume here that you are at the root of the working copy of the branch):

svn merge -rxxx:HEAD https://svn.nublado.org/cloudy/trunk .

For xxx you should fill in the revision number at which you did the last merge, or the revision from which the branch was created in case this is the first time you do this. Subversion will take the differences of the files in the mainline between revision xxx and the current head and apply those as patches to the branch (this includes file properties). If you develop a branch for a long time, you should do this at regular intervals, otherwise the complexity of the merge could overwhelm you. svn will now tell you which files are changed. Resolve any merge conflicts that may occur.

There is a --dry-run option that goes through the moves, but doesn't actually change anything.
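A sketch of how the preview and the real merge might be run back to back. The revision number 500 is made up and stands for the xxx above. As written, the script only prints the two commands, because RUN defaults to echo; set RUN to an empty string to execute them from the root of the branch working copy.

```shell
# "echo" makes this a preview; run with RUN= (empty) to execute for real.
RUN=${RUN-echo}

LAST_MERGED=500     # made-up number: the revision of your previous merge
TRUNK=https://svn.nublado.org/cloudy/trunk

# Preview first: reports what would change without touching the working copy.
$RUN svn merge --dry-run -r"$LAST_MERGED":HEAD "$TRUNK" .

# If the preview looks sane, repeat without --dry-run.
$RUN svn merge -r"$LAST_MERGED":HEAD "$TRUNK" .
```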

Your working copy is now in sync with the mainline. Test it if you feel uncertain about the changes that subversion made. Remember that up until now you have not changed anything in the repository, only in your working copy. If you are not completely happy with the result, don't submit! Revert all changes and start over again if necessary!

Caveat 1: if files were added in the merge they will be removed from svn control during the revert, but not physically deleted. This can lead to problems later on as svn refuses to overwrite existing files. You need to manually remove these files. Unfortunately there is no flag in subversion to force it to overwrite the files.
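To find the files left behind after a revert, one option is to filter the output of svn status for unversioned entries, which svn marks with a question mark. The sketch below does this on an invented sample of status output, so the file names are purely illustrative.

```shell
# unversioned_files: read `svn status` output on stdin and print the paths
# that svn does not track (status lines starting with "?").
unversioned_files() {
    sed -n 's/^?[[:space:]]*//p'
}

# Sample status output; the file names are invented.
SAMPLE_STATUS="?       source/new_module.cpp
M       source/cloudy.cpp
?       tsuite/auto/new_test.in"

printf '%s\n' "$SAMPLE_STATUS" | unversioned_files

# In a real working copy this would be roughly:
#   svn revert -R .
#   svn status | unversioned_files     # review the list, then delete by hand
```

The list can also contain files that were never under version control, such as build products, so check it before deleting anything.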

Caveat 2: if you merge a series of revisions where a certain file was modified, and then later deleted or renamed, svn will refuse to delete the file. It will detect that the file has local modifications and it will keep it because it does not want you to lose edits. It will give a rather uninformative warning "Skipped 'path/to/some/file'" instead and the file will remain under version control. If the file was renamed, the file with the new name will already be there. So all you need to do is to manually force the deletion of the unwanted file (after carefully checking that this is indeed what should happen!) with this command:

svn rm --force path/to/some/file
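Because the "Skipped" warning is easy to miss in a long merge report, it can help to save the merge output and extract those lines afterwards. The sketch below does this on an invented sample; the file names and the log file name are hypothetical.

```shell
# skipped_paths: read `svn merge` output on stdin and print the paths from
# lines of the form: Skipped 'path/to/some/file'
skipped_paths() {
    sed -n "s/^Skipped '\(.*\)'.*\$/\1/p"
}

# Invented sample of a merge report.
SAMPLE_MERGE="U    source/cloudy.cpp
Skipped 'source/old_name.cpp'
A    source/new_name.cpp"

printf '%s\n' "$SAMPLE_MERGE" | skipped_paths

# With a real merge, roughly:
#   svn merge -rxxx:HEAD https://svn.nublado.org/cloudy/trunk . | tee merge.log
#   skipped_paths < merge.log
```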

Only when you are confident that everything is OK, submit the changes to the branch:

svn commit

3 - Now it is time to merge the changes back onto the mainline. First switch your working copy to the mainline:

svn switch https://svn.nublado.org/cloudy/trunk

Then apply the changes to the mainline:

svn merge https://svn.nublado.org/cloudy/trunk https://svn.nublado.org/cloudy/branches/some-branch .

This should not create any conflicts, as they were already resolved in step 2. You have now merged all your changes from the branch onto the working copy of the mainline. The files in the working copy and the branch should now have the same content, but they have a different history, so in that sense they are different. Before you commit, carefully review the changes using

svn diff

in the root of the working copy. If you forgot to merge any of the changes from the trunk to the branch, the act of committing now would have the effect of silently reverting those changes. So use the diff output to make sure no changes on the trunk were overlooked before you commit. If you find that some changes were overlooked, revert all changes to your working copy of the trunk, then fix the problem on the branch, commit that fix to the branch and start the merge procedure again. To commit the changes to the mainline use:

svn commit

This completes the process. You can now delete the branch if you wish.

Caveat: you should warn people to refrain from committing to the trunk in between steps 2 and 3. Any commits to the trunk made after the merge in step 2 will be automatically reverted in step 3! Bug fixes can easily be lost that way, so always check for this. A neater way would be to lock the mainline down, but I have not yet found a method for locking a directory.
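One possible way to check for such commits is to compare the "Last Changed Rev" that svn info reports for the trunk with the revision you merged up to in step 2. The sketch below works on an invented sample of svn info output; all revision numbers are made up.

```shell
# last_changed_rev: read `svn info` output on stdin and print the number
# from its "Last Changed Rev:" line.
last_changed_rev() {
    sed -n 's/^Last Changed Rev: //p'
}

# trunk_moved MERGED_UP_TO CURRENT: succeed if the trunk has commits newer
# than the revision that was merged into the branch in step 2.
trunk_moved() {
    [ "$2" -gt "$1" ]
}

# Invented sample of what `svn info https://svn.nublado.org/cloudy/trunk` prints.
SAMPLE_INFO="Path: trunk
URL: https://svn.nublado.org/cloudy/trunk
Revision: 612
Last Changed Rev: 607"

MERGED_UP_TO=600    # made-up number: trunk revision merged in step 2
CURRENT=$(printf '%s\n' "$SAMPLE_INFO" | last_changed_rev)

if trunk_moved "$MERGED_UP_TO" "$CURRENT"; then
    echo "trunk changed in r$CURRENT: repeat step 2 before merging back"
else
    echo "trunk unchanged since r$MERGED_UP_TO: safe to continue with step 3"
fi
```

The "Last Changed Rev" line is used rather than "Revision" because commits to your own branch also advance the repository revision without touching the trunk.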

You can also create tags of the mainline just before and after a major merge in case something goes horribly wrong and you want a quick way to get back to the state before the merge. This is by no means necessary, it's just a convenient way to remember a particular revision... A tag really just is a tag: a mnemonic for a particular revision.
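A sketch of tagging the mainline around a merge. It assumes such tags go into the "develop" subdirectory of tags, as described at the top of this page; the tag names and log messages are invented. As written, the script only prints the commands, because RUN defaults to echo.

```shell
# "echo" makes this a preview; run with RUN= (empty) to execute for real.
RUN=${RUN-echo}

REPO=https://svn.nublado.org/cloudy
BRANCH=some-branch                  # placeholder branch name
PRE_TAG=pre-merge-$BRANCH           # hypothetical tag names
POST_TAG=post-merge-$BRANCH

# Snapshot the mainline just before the merge is committed ...
$RUN svn copy "$REPO/trunk" "$REPO/tags/develop/$PRE_TAG" -m "Tag trunk before merging $BRANCH"

# ... and again after the merge has been committed to the mainline.
$RUN svn copy "$REPO/trunk" "$REPO/tags/develop/$POST_TAG" -m "Tag trunk after merging $BRANCH"
```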

In summary: step 2 makes sure that none of the changes on the mainline are lost by merging them into the branch first. Step 3 essentially copies over the branch to the mainline, but done in such a way that the history stays intact.

File properties

The code is being developed under various operating systems, such as Linux, Windows, OS X, and maybe others. Not all of these systems use the same end-of-line character, which can lead to needless changes of the end-of-line character in the repository and other badness. Luckily, svn allows us to set a special file property called svn:eol-style to native, which tells it to adjust the end-of-line character to whatever is needed on the particular system where the working copy resides. This is useful for local tools, but svn then also ignores any changes to the end-of-line character when you commit.

This file property is not set by default, so you need to tell svn to do this. Since this is easily forgotten, the best way is to make svn set it automatically by editing the config file, which resides in ~/.subversion/config. You must make sure that enable-auto-props is set to yes, and then under [auto-props] add a number of rules like *.cpp = svn:eol-style=native, which tells svn that every newly added file with a .cpp extension should automatically get the svn:eol-style property set.

There is a lengthy list of file extensions that need to be added, so the best thing to do is to copy the config file that is attached to this page (and edit it if necessary). The config file also automatically sets the execute bit on perl scripts (i.e. files with a .pl extension) and flags .pdf, .psd and .ai files as binary (they are often wrongly detected as text because they can contain large chunks of ASCII text). Note that you need to edit the config file on every machine that you will commit from!

To edit this configuration file from within TortoiseSVN, right-click on the working copy in Windows Explorer, select TortoiseSVN / Settings / General and click the Edit button next to "Subversion configuration file".
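To give an idea of the syntax, the script below writes a sample fragment to a scratch file and prints it. This is not the config file attached to this page, only an illustration: the .h rule and the exact MIME types used to mark files as binary are assumptions, so prefer the attached file for real use.

```shell
# Write a sample fragment to a scratch file for review; merge the lines
# you need into ~/.subversion/config by hand.
FRAGMENT=$(mktemp)
cat > "$FRAGMENT" <<'EOF'
[miscellany]
enable-auto-props = yes

[auto-props]
*.cpp = svn:eol-style=native
*.h = svn:eol-style=native
*.pl = svn:eol-style=native;svn:executable
*.pdf = svn:mime-type=application/pdf
*.psd = svn:mime-type=application/octet-stream
*.ai = svn:mime-type=application/octet-stream
EOF
cat "$FRAGMENT"

# Auto-props only affect newly added files. For a file that is already
# under version control the property can be set by hand, e.g.:
#   svn propset svn:eol-style native path/to/file.cpp
```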

Epilogue

Yes, all of this is horribly confusing at first. But please be patient and reread this page once in a while until you feel confident you understand everything that is written here. If you strictly adhere to the guidelines posted here, then merging will be a manageable task. However, if you become too loose in your maintenance, things can quickly spiral out of control!

Last modified on 2010-10-24T21:27:20Z

Attachments (1)
