Extracting Subversion Branch Mappings

Datetime: 2016-08-23 00:34:04          Topic: ZooKeeper

Subversion was a huge relief compared to CVS when it came out. At Codehaus (RIP), I was pleased to be able to use it long before it reached v1.0. Of course, it wasn’t just about surpassing CVS; it was chasing Perforce’s feature set too. At least, it seemed so to me, as I’d been using Perforce in the late 90s.

Google just disclosed that they’re no longer using Perforce for their mega-trunk/monorepo. One can only guess that the scale Perforce was designed to handle is enough for just about every company in the world, bar Google.

Anyway, here are things Subversion and Perforce have in common that Git doesn’t have by default, and that are quite useful from a governance point of view:

  • permissions per-branch and per-directory
  • won’t choke on large binaries
  • won’t choke on weight of history (not even for churn on large binaries)
  • the ability to subset a checkout (client-spec in P4, and sparse-checkouts in Svn)
  • arbitrary branch mappings – though Subversion’s are not as fine-grained as Perforce’s

On that last point, branch mappings in Perforce share the same configuration grammar as client-specs, making it easy for one to shadow another. Refer to “Googlers Subset their Trunk” for more information.

OK, enough of the bullety comparison. Subversion doesn’t have an examinable list of branch mappings. Yet. What I’d really want is an svn branch-mappings command to spit out what I’m looking for, or better still a generated file “.branch_mappings.txt” for me to peruse. With the help of the Subversion developer mailing list, I was able to reverse engineer the latter. Make this into a Bash script:

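#!/usr/bin/env bash
# Refresh the shallow checkout and work out the repository URL it came from
svn up -q
repoRoot=$(svn info | grep "^URL: " | sed "s/URL: //")
rm -f .branch_mappings.txt
for f in *; do
  # List the second level of the tree (e.g. each entry under branches/ or tags/)
  svn ls "$repoRoot/$f" --depth immediates > .immediates
  while IFS= read -r p; do
    # The oldest log entry, stopping at the copy, carries a line like "   A /to (from /from:REV)"; turn it into "/to::/from"
    svn -q log --verbose --stop-on-copy "$repoRoot/$f/$p" -r1:HEAD -l1 2>&1 | grep " (from " | grep "^   A " | sed "s/^   A //" | sed "s/ (from /::/" | sed "s/:[[:digit:]]*)$//" >> .branch_mappings.txt
  done <.immediates
done
# Remove the scratch file, then de-duplicate the mappings in place
rm -f .immediates
sort -u .branch_mappings.txt | sponge .branch_mappings.txt
svn add .branch_mappings.txt
svn commit -m "branch mappings changed"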

(make sure you have installed “sponge” first; it ships in the moreutils package)
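
To see what the grep/sed chain inside the loop is doing, here is a minimal sketch that runs the same filters over a canned log entry instead of a live repository. The helper names extract_mapping and sample_log are made up for illustration, as are the revision numbers, author and date in the sample; only the "   A /to (from /from:REV)" line shape is what svn log --verbose prints for a copy.

```shell
#!/usr/bin/env bash
# extract_mapping is a hypothetical helper name: it applies the same
# filters as the loop in the script to whatever arrives on stdin.
extract_mapping() {
  grep " (from " | grep "^   A " \
    | sed "s/^   A //" \
    | sed "s/ (from /::/" \
    | sed "s/:[[:digit:]]*)$//"
}

# A canned `svn -q log --verbose` entry (revision, author, date invented).
sample_log() {
  cat <<'EOF'
------------------------------------------------------------------------
r1234 | someone | 2011-11-01 12:00:00 +0000 (Tue, 01 Nov 2011)
Changed paths:
   A /zookeeper/branches/branch-3.4 (from /zookeeper/trunk:1233)
------------------------------------------------------------------------
EOF
}

# Prints: /zookeeper/branches/branch-3.4::/zookeeper/trunk
sample_log | extract_mapping
```

The first two greps keep only "added by copy" lines, and the three seds strip the "A" marker, swap " (from " for the "::" separator, and drop the trailing ":REV)".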

Run that script nightly via Jenkins, INSIDE a checkout performed with svn co REPO-ROOT subversion --depth immediates.

If you’re making Jenkins perform the checkout/update, make sure it obeys the --depth parameter, as you really don’t want to pointlessly fill a hard drive with all permutations of branch, tag, dir and file. There’s a limitation: if you made a branch (svn copy FROM_URL TO_URL) more than two directories deep from the root, the above script isn’t going to find it to list it.

Here are the branch mappings for ZooKeeper (svn co http://svn.apache.org/repos/asf/zookeeper/ --depth immediates) as made by the above script. It took 90 seconds to execute after the initial checkout.

/zookeeper/branches/branch-3.4::/zookeeper/trunk
/zookeeper/branches/branch-3.5::/zookeeper/trunk
/zookeeper/branches::/hadoop/zookeeper/branches
/zookeeper/dist::/hadoop/zookeeper/dist
/zookeeper/legacy-site::/hadoop/zookeeper/site
/zookeeper/logo::/hadoop/zookeeper/logo
/zookeeper/tags/release-3.3.3-rc0::/zookeeper/branches/branch-3.3
/zookeeper/tags/release-3.3.3::/zookeeper/tags/release-3.3.3-rc1
/zookeeper/tags/release-3.3.4::/zookeeper/tags/release-3.3.4-rc0
/zookeeper/tags/release-3.3.5-rc0::/zookeeper/branches/branch-3.3
/zookeeper/tags/release-3.3.5::/zookeeper/tags/release-3.3.5-rc1
/zookeeper/tags/release-3.3.6::/zookeeper/tags/release-3.3.6-rc0
/zookeeper/tags/release-3.4.0-rc0::/zookeeper/branches/branch-3.4
/zookeeper/tags/release-3.4.0-rc1::/zookeeper/branches/branch-3.4
/zookeeper/tags/release-3.4.0::/zookeeper/tags/release-3.4.0-rc2
/zookeeper/tags/release-3.4.1::/zookeeper/tags/release-3.4.1-rc0
/zookeeper/tags/release-3.4.2::/zookeeper/tags/release-3.4.2-rc0
/zookeeper/tags/release-3.4.3::/zookeeper/tags/release-3.4.3-rc0
/zookeeper/tags/release-3.4.4::/zookeeper/tags/release-3.4.4-rc0
/zookeeper/tags/release-3.4.5-rc0::/zookeeper/branches/branch-3.4
/zookeeper/tags/release-3.4.5::/zookeeper/tags/release-3.4.5-rc1
/zookeeper/tags/release-3.4.6::/zookeeper/tags/release-3.4.6-rc0
/zookeeper/tags/release-3.5.0::/zookeeper/tags/release-3.5.0-rc0
/zookeeper/tags/release-3.5.1-rc0::/zookeeper/branches/branch-3.5
/zookeeper/tags/release-3.5.1-rc1::/zookeeper/branches/branch-3.5
/zookeeper/tags/release-3.5.1-rc2::/zookeeper/branches/branch-3.5
/zookeeper/tags/release-3.5.1-rc3::/zookeeper/branches/branch-3.5
/zookeeper/tags/release-3.5.1::/zookeeper/tags/release-3.5.1-rc4
/zookeeper/tags::/hadoop/zookeeper/tags
/zookeeper/trunk::/hadoop/zookeeper/trunk
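
A file in this to::from shape is easy to query with standard tools. As a sketch, the hypothetical helper copies_of below lists everything that was copied directly from a given source path. It reads a handful of lines taken from the listing above (via the equally hypothetical mappings function) rather than a real .branch_mappings.txt.

```shell
#!/usr/bin/env bash
# copies_of is a hypothetical helper: print every destination whose
# recorded copy source equals $1. Mappings are read from stdin.
copies_of() {
  awk -F'::' -v src="$1" '$2 == src { print $1 }'
}

# A few lines from the ZooKeeper listing, standing in for the real file.
mappings() {
  cat <<'EOF'
/zookeeper/branches/branch-3.4::/zookeeper/trunk
/zookeeper/branches/branch-3.5::/zookeeper/trunk
/zookeeper/tags/release-3.4.0-rc0::/zookeeper/branches/branch-3.4
/zookeeper/tags/release-3.4.0::/zookeeper/tags/release-3.4.0-rc2
/zookeeper/tags/release-3.5.1-rc0::/zookeeper/branches/branch-3.5
EOF
}

# Which paths were branched straight off trunk?
mappings | copies_of /zookeeper/trunk
```

Against the real file the call would be copies_of /zookeeper/trunk < .branch_mappings.txt.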

Google and Facebook are allegedly cooperating on modifications to Mercurial to allow it to operate in the mega-trunk/mono-repo design. It is always fascinating to see competition in the SCM space :)




