HgPerDeveloperBranch

Intro

The Git plugin for Hudson does something related to what we do for team repositories as per HgParallelTeamIntegration:

"Commits should now be automatically merged with the integration branch (they will fail if they do not merge cleanly), and built. If the build succeeds, the result of the merge will be pushed back to the remote git repository."

You could imagine doing something similar for Mercurial, even with one branch per developer. In that case it would probably work best to use named branches, as these are much lighter weight than clones. (bug #190537) So I would commit only on the branch named jglick, and when I was ready I would run

hg push -r jglick main

(The -r jglick can be defaulted in hgrc of course, as well as the repo, so I would really only need to type hg push.)

Hudson would notice new changesets on a few people's branches, e.g. using

hg heads --template '{branches}\n' | egrep .
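
As a rough sketch of how a job might digest that output, the Python function below (the name branches_with_heads is made up for illustration) takes the text printed by the command and returns the distinct branch names. It assumes one branch name per line, with heads on the default branch showing up as blank lines, which is what the egrep filter suggests. It only reports which named branches have heads; deciding whether a head is new since the last build would need extra bookkeeping.

```python
from typing import List


def branches_with_heads(heads_output: str) -> List[str]:
    """Return the distinct non-empty branch names found in the output, sorted."""
    names = {line.strip() for line in heads_output.splitlines()}
    # Assumption: a blank line stands for a head on the default branch.
    names.discard("")
    return sorted(names)
```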

and try to do something like the following (aborting on any error of course):

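set -e  # abort on any error outside the "if" test below
hg pull
hg up -C default
for developer in jglick jtulach ...
do
  # Try to merge this developer's branch into the default branch.
  if hg merge "$developer"
  then
    hg ci -m "accepting $developer's changes in trunk #$build_number"
  else
    # The merge failed: discard it and return to a clean default head.
    hg up -C default
  fi
done
ant
hg push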

I think this would have to be done as a Hudson plugin, at a minimum because you need to control the displayed changelog according to which branches were actually merged. (Doing the merges as part of the "build command" would not accomplish this.) Probably the plugin would supply a variant of the Hg SCM provider, and could also hook into reporting if desired.
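
To illustrate why the merge step wants to live inside the tool rather than in the build command, here is a minimal Python sketch of the same loop that also returns the list of branches actually merged, which is what a changelog display would need. A real Hudson plugin would be written against Hudson's own API; the names run_hg and merge_branches and the injectable run parameter are assumptions made for this example only.

```python
import subprocess
from typing import Callable, List, Sequence


def run_hg(args: Sequence[str]) -> bool:
    """Run an hg command in the current directory; True if it exited with 0."""
    return subprocess.call(["hg", *args]) == 0


def merge_branches(branches: Sequence[str], build_number: int,
                   run: Callable[[Sequence[str]], bool] = run_hg) -> List[str]:
    """Merge each branch into default and return the ones actually merged."""
    merged = []
    for branch in branches:
        if run(["merge", branch]):
            message = f"accepting {branch}'s changes in trunk #{build_number}"
            if not run(["ci", "-m", message]):
                raise RuntimeError(f"commit failed after merging {branch}")
            merged.append(branch)
        else:
            # Discard the failed merge and return to a clean default head.
            if not run(["up", "-C", "default"]):
                raise RuntimeError("could not reset the working copy")
    return merged
```

Passing a fake run function makes it possible to exercise the logic without a repository; the returned list is what would drive the displayed changelog.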

As a developer I would not need to merge before pushing, since I would be the only one to commit to my personal jglick branch; I could merge into the branch from main-golden at my leisure. (If I am too lazy about merging into my branch, eventually my changesets would start to get rejected as either causing merge conflicts or causing other semantic problems when combined with newer sources from other developers.) I could even merge from other developers' branches if I knew I needed to build on top of their work immediately - and I trusted their changes not to break the build.

For security, the server could enforce that you can only push to your own branch (i.e. HTTPS user ID must match branch name) and/or that commits to a named branch are marked as having been done by that user.

It would be easy to see in history which Hudson build got whose stuff, since the default branch in main would consist solely of automated merges labeled with build number (for which hg log -p would show a meaningful diff). The repository could even be configured to reject attempts to push to the default branch by anyone but the Hudson job and some very trusted people.
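
The access rules sketched in the last two paragraphs could be expressed roughly as follows. This is only the decision logic, not a working server hook: how the authenticated user and the branches of the incoming changesets are obtained is left out, and the function name push_allowed and the trusted set containing "hudson" are assumptions for illustration.

```python
from typing import FrozenSet, Iterable


def push_allowed(user: str, branches: Iterable[str],
                 trusted: FrozenSet[str] = frozenset({"hudson"})) -> bool:
    """Decide whether `user` may push changesets on the given branches."""
    for branch in branches:
        if branch == "default":
            # Only the build job and a few trusted people touch default.
            if user not in trusted:
                return False
        elif branch != user:
            # Everyone else may push only to the branch named after them.
            return False
    return True
```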

You could imagine some heuristics to maximize average throughput (= changesets pushed to default branch per successful build, or zero if build fails):

  1. Skip branches which fail to merge cleanly. (With merge conflicts, or even just supposedly nonconflicting changes to the same file.) Illustrated in the above pseudo-script.
  2. Limit the number of branches to be merged in a given build (other branches can wait for another build).
  3. Give priority to developers who have not historically broken the build often.
  4. Run branches with big changes in isolation.
  5. Divide branches between a few slave machines running in parallel (though this complicates the logic of the final push).
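
Heuristics 2 to 4 might be combined along the following lines. This is a sketch under stated assumptions: the Candidate record, the pick_batch name, and the thresholds (three branches per build, twenty changesets counting as "big") are invented for the example and would need tuning against real data.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    branch: str
    changesets: int      # number of pending changesets on the branch
    failure_rate: float  # fraction of past builds this developer broke


def pick_batch(candidates: List[Candidate], max_branches: int = 3,
               big_change: int = 20) -> List[str]:
    """Choose the branches to attempt in the next build."""
    if not candidates:
        return []
    # Heuristic 3: developers who rarely break the build go first.
    ordered = sorted(candidates, key=lambda c: c.failure_rate)
    # Heuristic 4: a big change at the head of the line is built alone.
    if ordered[0].changesets >= big_change:
        return [ordered[0].branch]
    # Heuristic 2: cap the batch; big changes wait for their own build.
    small = [c.branch for c in ordered if c.changesets < big_change]
    return small[:max_branches]
```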

One issue with such a system is that it would require a standardized test suite considered good enough for integration to main by everyone. Currently we have a dual standard: a lot of general UI tests run by trunk, and a lot of domain-specific tests (esp. unit tests) run by team builders. I'm not sure what the best compromise would be. Running all the tests in the trunk build might just make it too slow (and unreliable). Of course we already have an imperfect system: a commit can make it into main-golden which breaks a unit test only checked by a team builder. Probably we need a validation suite which should always be passing in main (modulo random failures), plus a much larger suite run by a continuous builder where failures are evaluated but do not halt build promotion.

Basic workflow

Assuming you work only in a single clone, the initial setup would be:

  1. hg clone http://hg.netbeans.org/main-golden/ myrepo
  2. cd myrepo
  3. Set up .hg/hgrc:
  [paths]
  default = http://hg.netbeans.org/main-golden/
  default-push = http://hg.netbeans.org/main/
  [defaults]
  push = -r jhacker

  4. hg branch jhacker (or hg up -C jhacker if your branch exists)

and regular workflow would be:

  1. Synchronize:
    1. hg pull
    2. hg merge default
    3. hg commit -m "synchronizing with main"
  2. Develop:
    1. edit files
    2. hg diff
    3. hg commit
    4. (repeat these Develop steps as needed)
  3. hg push
  4. (repeat #2 + #3 for usual work)
  5. (go back to #1 when desired)

If you use multiple local clones it would be a bit different, but not much. The point is that you are normally working in your personal named branch only.

One nice advantage not mentioned previously: if you pass specific file or directory names to local commands like hg diff and hg commit, and only synchronize (step #1) occasionally, you will rarely be doing any operations that require the complete working copy to be scanned for changes. This is important because scanning the whole working copy is one of the slower operations and consumes a lot of disk cache. By contrast, in the current setup with everyone using the default branch, many if not most pushes require a merge first, which forces a full scan to make sure there are no uncommitted changes.

Furthermore, in step #2 it is no problem to commit only certain files and leave other modifications outstanding; you can still push in step #3 with local modifications. (Obviously it is up to you to ensure that the unpushed modifications are not required for the build to pass! But that would be true in CVS/SVN too.) This would make it easier to push a quick, simple fix to one file while still working on unrelated changes, reducing the need to keep multiple local clones.

Diagnosing failed builds and inspecting history

The changelog for the build would list all the branch changesets which were included as well as the merge changesets created by the job, so I think it should be clear both whose changes were included and what those changes were. You can also retrieve this information pretty easily from log output.

As a bonus, the Hudson job could add the build number as a tag, marking the sources that went into the build. (If the build failed, this tag as well as the automated merge changesets would simply be discarded.)

If there had been any direct pushes to main since the last build (e.g. emergency fixes, RE build script updates, etc.) then these would also need to be included in the changelog.

Logging in a simulated repository

$ hg glog
@  changeset:   17:961c6a446f2f
|  tag:         tip
|  user:        Hudson
|  date:        Wed Aug 06 18:37:50 2008 -0400
|  summary:     Added tag trunk-3 for changeset 6faa157e7bdb
|
o    changeset:   16:6faa157e7bdb
|\   tag:         trunk-3
| |  parent:      12:a997098ce1a8
| |  parent:      15:4e25228eb2d8
| |  user:        Hudson
| |  date:        Wed Aug 06 18:37:50 2008 -0400
| |  summary:     accepting dkonecny's changes in trunk #3
| |
| o  changeset:   15:4e25228eb2d8
| |  branch:      dkonecny
| |  user:        David Konecny <dkonecny@netbeans.org>
| |  date:        Wed Aug 06 18:37:50 2008 -0400
| |  summary:     web3
| |
| o  changeset:   14:30397beb7f2f
|/|  branch:      dkonecny
| |  parent:      13:b80fa86fa309
| |  parent:      12:a997098ce1a8
| |  user:        David Konecny <dkonecny@netbeans.org>
| |  date:        Wed Aug 06 18:37:50 2008 -0400
| |  summary:     synch
| |
| o  changeset:   13:b80fa86fa309
| |  branch:      dkonecny
| |  parent:      4:13caed993cb1
| |  user:        David Konecny <dkonecny@netbeans.org>
| |  date:        Wed Aug 06 18:37:50 2008 -0400
| |  summary:     web2
| |
o |  changeset:   12:a997098ce1a8
| |  user:        Hudson
| |  date:        Wed Aug 06 18:37:49 2008 -0400
| |  summary:     Added tag trunk-2 for changeset 2bd6b7451fde
| |
o |    changeset:   11:2bd6b7451fde
|\ \   tag:         trunk-2
| | |  parent:      7:5b03f3548526
| | |  parent:      10:88f874626c37
| | |  user:        Hudson
| | |  date:        Wed Aug 06 18:37:49 2008 -0400
| | |  summary:     accepting jglick's changes in trunk #2
| | |
| o |  changeset:   10:88f874626c37
| | |  branch:      jglick
| | |  user:        Jesse Glick <jglick@netbeans.org>
| | |  date:        Wed Aug 06 18:37:49 2008 -0400
| | |  summary:     core4
| | |
| o |  changeset:   9:00ee960b6fa1
|/| |  branch:      jglick
| | |  parent:      8:4aed646024cc
| | |  parent:      7:5b03f3548526
| | |  user:        Jesse Glick <jglick@netbeans.org>
| | |  date:        Wed Aug 06 18:37:49 2008 -0400
| | |  summary:     synch
| | |
| o |  changeset:   8:4aed646024cc
| | |  branch:      jglick
| | |  parent:      3:9faecd2554c4
| | |  user:        Jesse Glick <jglick@netbeans.org>
| | |  date:        Wed Aug 06 18:37:49 2008 -0400
| | |  summary:     core3
| | |
o | |  changeset:   7:5b03f3548526
| | |  user:        Hudson
| | |  date:        Wed Aug 06 18:37:49 2008 -0400
| | |  summary:     Added tag trunk-1 for changeset 7970aba23b8c
| | |
o---+  changeset:   6:7970aba23b8c
| | |  tag:         trunk-1
| | |  parent:      5:18c3415ce442
| | |  parent:      4:13caed993cb1
| | |  user:        Hudson
| | |  date:        Wed Aug 06 18:37:49 2008 -0400
| | |  summary:     accepting dkonecny's changes in trunk #1
| | |
o | |  changeset:   5:18c3415ce442
|\| |  parent:      1:fd91040644e2
| | |  parent:      3:9faecd2554c4
| | |  user:        Hudson
| | |  date:        Wed Aug 06 18:37:48 2008 -0400
| | |  summary:     accepting jglick's changes in trunk #1
| | |
| | o  changeset:   4:13caed993cb1
| | |  branch:      dkonecny
| | |  parent:      0:0d9c32811032
| | |  user:        David Konecny <dkonecny@netbeans.org>
| | |  date:        Wed Aug 06 18:37:48 2008 -0400
| | |  summary:     web1
| | |
| o |  changeset:   3:9faecd2554c4
| | |  branch:      jglick
| | |  user:        Jesse Glick <jglick@netbeans.org>
| | |  date:        Wed Aug 06 18:37:48 2008 -0400
| | |  summary:     core2
| | |
| o |  changeset:   2:1bb6f32603b0
|/ /   branch:      jglick
| |    user:        Jesse Glick <jglick@netbeans.org>
| |    date:        Wed Aug 06 18:37:48 2008 -0400
| |    summary:     core1
| |
o |  changeset:   1:fd91040644e2
|/   user:        Hudson
|    date:        Wed Aug 06 18:37:47 2008 -0400
|    summary:     Added tag trunk-0 for changeset 0d9c32811032
|
o  changeset:   0:0d9c32811032
   tag:         trunk-0
   user:        Michal Zlamal <mzlamal@netbeans.org>
   date:        Wed Aug 06 18:37:47 2008 -0400
   summary:     start


$ hg log -r trunk-1:trunk-0 -M -X .hgtags
changeset:   4:13caed993cb1
branch:      dkonecny
parent:      0:0d9c32811032
user:        David Konecny <dkonecny@netbeans.org>
date:        Wed Aug 06 18:37:48 2008 -0400
summary:     web1

changeset:   3:9faecd2554c4
branch:      jglick
user:        Jesse Glick <jglick@netbeans.org>
date:        Wed Aug 06 18:37:48 2008 -0400
summary:     core2

changeset:   2:1bb6f32603b0
branch:      jglick
user:        Jesse Glick <jglick@netbeans.org>
date:        Wed Aug 06 18:37:48 2008 -0400
summary:     core1

changeset:   0:0d9c32811032
tag:         trunk-0
user:        Michal Zlamal <mzlamal@netbeans.org>
date:        Wed Aug 06 18:37:47 2008 -0400
summary:     start


$ hg log -r trunk-2:trunk-1 -M -X .hgtags
changeset:   10:88f874626c37
branch:      jglick
user:        Jesse Glick <jglick@netbeans.org>
date:        Wed Aug 06 18:37:49 2008 -0400
summary:     core4

changeset:   8:4aed646024cc
branch:      jglick
parent:      3:9faecd2554c4
user:        Jesse Glick <jglick@netbeans.org>
date:        Wed Aug 06 18:37:49 2008 -0400
summary:     core3


$ hg log -r trunk-3:trunk-2 -M -X .hgtags
changeset:   15:4e25228eb2d8
branch:      dkonecny
user:        David Konecny <dkonecny@netbeans.org>
date:        Wed Aug 06 18:37:50 2008 -0400
summary:     web3

changeset:   13:b80fa86fa309
branch:      dkonecny
parent:      4:13caed993cb1
user:        David Konecny <dkonecny@netbeans.org>
date:        Wed Aug 06 18:37:50 2008 -0400
summary:     web2
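
The three queries above follow one pattern: log the range between a build's tag and the previous build's tag, leaving out merges and changes to .hgtags. A small helper could produce that command line for any build number. The function name changelog_command and the tag_prefix parameter are assumptions for illustration; the trunk-N tag scheme is the one used in the simulated repository.

```python
from typing import List


def changelog_command(build_number: int, tag_prefix: str = "trunk-") -> List[str]:
    """Build the hg log invocation listing what went into one build."""
    if build_number < 1:
        raise ValueError("build 0 has no predecessor to compare against")
    newer = f"{tag_prefix}{build_number}"
    older = f"{tag_prefix}{build_number - 1}"
    # -M hides merge changesets; -X .hgtags hides the tagging commits.
    return ["hg", "log", "-r", f"{newer}:{older}", "-M", "-X", ".hgtags"]
```

As in the output above, the range includes the changeset carrying the older tag when it is not itself a merge, so a consumer may want to drop that entry.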

On named branches

We should not use named branches right now; this proposal suggests how they could be used.

There have been some problems with named branches, mainly at the UI level.

An example is running hg merge with no revision argument: this used to abort (asking for a specific revision) in case there were two heads on your current branch plus some other named branches. You would need to explicitly specify the head to merge with. This problem has been recently fixed: now only heads on the current branch are considered as a default argument for merging.

hg fetch also did not behave nicely in the presence of named branches last I checked; again this might be getting fixed now. Of course this is just a convenience command, and the proposed workflow would not involve use of this command for the most part anyway.

Some logging commands may not take advantage of the branch name information as well as you would like; this mostly falls into the category of RFE. Recent development of Hg has added some useful options, e.g. hg log -b jglick would show only changesets made on my branch.

Parallel integration algorithms

In order to maximize throughput, the builder machine can make some educated guesses as to what to try to merge in what order and using what grouping. The actual algorithm here could be complicated, but it is clearly separated from the rest of the process - at worst, throughput goes down and the algorithm needs to be tweaked. The trickiness is that it is not generally possible for a machine to determine which changesets are responsible for a failure, other than by bisecting. It might be possible to examine test failures and error messages from the last compiler run and try to correlate these with modifications to source code, but this is not trivial.
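
For what bisecting could look like at the granularity of whole branches rather than individual changesets, here is a minimal sketch. It assumes the build with no branches merged passes, and that once some prefix of the ordered branch list fails, every longer prefix fails too; neither is guaranteed in practice. The names first_bad_branch and builds_ok are hypothetical, and each call to builds_ok stands for a complete merge-and-build run, so the search is expensive.

```python
from typing import Callable, Optional, Sequence


def first_bad_branch(branches: Sequence[str],
                     builds_ok: Callable[[Sequence[str]], bool]) -> Optional[str]:
    """Return the branch whose addition first breaks the build, if any."""
    if not branches or builds_ok(branches):
        return None
    # Invariant: the prefix of length lo passes, the prefix of length hi fails.
    lo, hi = 0, len(branches)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if builds_ok(branches[:mid]):
            lo = mid
        else:
            hi = mid
    return branches[hi - 1]
```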

On a builder that can only run one build at a time (no slaves available), a simple but reasonable algorithm might be:

  1. Maintain a queue (FIFO) of unmerged branches.
  2. At the start of each build, check for branches with new changesets that are not already in the queue; add them to the end.
  3. Pick some (non-strict) subset of the start of the queue to try to merge. Limit by number of changesets and/or number of files modified (but always pick at least one branch). If an attempt to merge a branch results in merge conflicts, cancel the merge and push the branch to the end of the queue, but continue looking for candidates.
  4. Try the build. If it succeeds, push the result. If it fails, push all the merged branches back onto the end of the queue.
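
The four steps above can be sketched as one function that runs a single build cycle. Merging and building are passed in as callables so the queue logic can be tried out on its own. As a simplification, the batch is limited by number of branches rather than by changesets or files modified; all names here (run_build_cycle, merges_cleanly, build_passes, max_branches) are assumptions for the example.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List


def run_build_cycle(queue: Deque[str], new_branches: Iterable[str],
                    merges_cleanly: Callable[[str], bool],
                    build_passes: Callable[[List[str]], bool],
                    max_branches: int = 3) -> List[str]:
    """Run one build cycle; return the branches pushed (empty on failure)."""
    # Step 2: enqueue branches with new changesets that are not yet queued.
    for branch in new_branches:
        if branch not in queue:
            queue.append(branch)
    # Step 3: take candidates from the front; conflicting ones go to the back.
    merged: List[str] = []
    conflicted: List[str] = []
    while queue and len(merged) < max_branches:
        branch = queue.popleft()
        if merges_cleanly(branch):
            merged.append(branch)
        else:
            conflicted.append(branch)
    queue.extend(conflicted)
    # Step 4: build; on failure the merged branches rejoin the queue.
    if merged and build_passes(merged):
        return merged
    queue.extend(merged)
    return []
```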

Failures of course send out email, so the culprit can just fix the problem in their branch and push the fix; if all goes well, the fix will be available by the time the branch moves back to the front of the queue.

If you have slaves available then you should be able to improve throughput. One way is to simply use an algorithm like the one above except parallelized over all the slaves, sharing a common branch queue (i.e. more or less round-robin merging). The issue is that the final push in this case may need to merge with pushes from other slaves, which can on occasion introduce integration failures. (Of course if you permit direct integration to the default branch by some people then you also have this issue, though if such integrations are unusual then it would be reasonable for a single-machine builder to just abort a build in case it is unable to push - it would pick up the manual integration on the next build and try again.)

Another slave-based algorithm, based on a single master builder this time (thus protecting against integration bugs), might be:

  1. Maintain a list of validated branches, initially empty, plus a list of working branches.
  2. On each slave, when a build is ready to start,
    1. Check for branches with new changesets which are not in either list. If some are found, add to the working list.
    2. Try to merge all the branches just added to the working list. If some merges fail, skip them and immediately remove them from the working list.
    3. If there have been any branches merged, run a build. If it fails, just remove the merged branches from the working list. If it succeeds, move the branches into the validated list.
  3. On the master builder, when a build is ready to start, check the validated list.
    1. Try to merge all the branches from the validated list. If a merge fails, skip it and remove the branch from the list.
    2. If there have been any branches merged, run a build. If it fails, do nothing.
    3. If the build succeeded, push the result, and also remove all the merged branches from the validated list.
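
A single-process model of this bookkeeping might look like the following. It is a sketch only: real slaves run concurrently and would need locking around the shared lists, and the class name Integration with its merges_cleanly and build_passes callables is invented for the example.

```python
from typing import Callable, Iterable, List, Set


class Integration:
    """Shared state for the slave and master cycles (step 1)."""

    def __init__(self, merges_cleanly: Callable[[str], bool],
                 build_passes: Callable[[List[str]], bool]) -> None:
        self.working: Set[str] = set()
        self.validated: Set[str] = set()
        self.merges_cleanly = merges_cleanly
        self.build_passes = build_passes

    def slave_cycle(self, changed: Iterable[str]) -> None:
        """Step 2: validate newly changed branches on a slave."""
        fresh = [b for b in changed
                 if b not in self.working and b not in self.validated]
        self.working.update(fresh)
        merged = [b for b in fresh if self.merges_cleanly(b)]
        # Branches that do not merge leave the working list at once.
        self.working.difference_update(set(fresh) - set(merged))
        if not merged:
            return
        # Pass or fail, the merged branches leave the working list.
        self.working.difference_update(merged)
        if self.build_passes(merged):
            self.validated.update(merged)

    def master_cycle(self) -> List[str]:
        """Step 3: integrate validated branches; return those pushed."""
        candidates = sorted(self.validated)
        merged = [b for b in candidates if self.merges_cleanly(b)]
        self.validated.difference_update(set(candidates) - set(merged))
        if merged and self.build_passes(merged):
            self.validated.difference_update(merged)
            return merged
        return []  # a failed master build leaves the validated list alone
```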

These algorithms may need some further refinements to deal with updates to a branch which is currently being built, etc.

As before, such algorithms could be modified to give preference to branches with few historical failures (i.e. cautious developers), etc.
