Migrating SVN repositories to GitHub, maintaining history even on addition/removal of a trunk folder

Friday, 16th July 2021

Years ago I had a few projects hosted in Google Code's SVN repositories, but Google closed Code down in 2015 and though I backed the files up locally via svnsync to carry on working on the projects the code was never made publicly available again.

I decided it would be a good idea to move these repositories to GitHub, but quickly ran into the problem that during the initial migration from my local repository to Google Code's repository I had been forced to move all of the code into a trunk folder (I do not generally use SVN's branching or tagging features) and attempting to synchronise from SVN to Git lost all revision history prior to that change.

After scouring the Internet and many dead ends I came up with the below solution which has worked for me. As far as I can tell the Git tools used are designed for regular synchronisation of repositories and not a one-shot bulk migration. There are separate tools that claim to do a better job with less effort but getting them up and running on my Windows system probably takes longer than figuring out how to do it with Git's own tools...

First of all, you'll need to create a user list that maps your SVN user name(s) (left) to your name and email address (right) used on GitHub, like this:

Ben = Ben Ryves <benryves@benryves.com>
Benjamin = Ben Ryves <benryves@benryves.com>
benryves = Ben Ryves <benryves@benryves.com>
benryves@benryves.com = Ben Ryves <benryves@benryves.com>

Save this list in a text file in your current directory (e.g. users.txt). Next you'll need to need to find the revision number where the move to the trunk folder happened, e.g. by checking the SVN logs. For the sake of this example, let's pretend it was in revision 100. You can now use git svn clone to create a temp copy of the SVN repository. To continue the example, suppose the SVN repository was backed up to D:\SVN\Google\brass-assembler, then the command would look like this:
git svn clone --no-metadata --authors-file=users.txt -r 1:99 file:///D/SVN/Google/brass-assembler temp

There are some notable differences from the usual here, which I'll try to explain:
  • There is no --stdlayout parameter. This is because this makes the assumption that the repository follows the conventional trunk/tag/branch folder used in some SVN repositories, but this repository doesn't make use of such a structure (at least until revision 100!)
  • The -r 1:99 argument limits the operation to be between the revisions 1 and 99. This is when all the files were in the root of the repository, before they were moved in revision 100.
  • The SVN path looks like it's missing a colon, but this is intentional – at least on Windows the Git tools fail if the path contains a colon like it would when using the SVN tools for local paths, so remove that colon.

This may take a while to get started but eventually it should start synchronising the SVN repository to the new Git one in the temp folder up to (and including) the last revision where everything was in the root (99).

Once that has happened, you can perform the trick that makes this work. Open the file temp\.git\config in a text editor and in the [svn-remote "svn"] section you should find the line fetch = :refs/remotes/git-svn. As revisions after this point were moved into a trunk directory, we need to change this to fetch from trunk instead, so change this line to fetch = trunk:refs/remotes/git-svn:

[svn-remote "svn"]
	noMetadata = 1
	url = file:///D/SVN/Google/brass-assembler
	fetch = trunk:refs/remotes/git-svn
[svn]
	authorsfile = C:/Users/Ben/users.txt

Save the file, and then change your current working directory to your temporary repository, fetch the revisions from after the one that moved everything to trunk up to HEAD, then merge the changes (without the merge you'll only see up to the previously-cloned revision 99):
cd temp
git svn fetch -r 101:HEAD
git merge remotes/git-svn

At this point you can clone the temporary repository into a final one, so go up a level, clone the temp repository and delete it:
cd ..
git clone temp Brass3
rmdir /s /q temp

Finally, you can push this local repository to GitHub. When you add a new repository GitHub will provide a clone URL (in the form https://github.com/<username>/<reponame>.git) so change the remote origin for the local repository that was just created and push your changes:
cd Brass3
git remote rm origin
git remote add origin https://github.com/benryves/Brass3.git
git push origin master

Once that has completed you should be able to view your code on the GitHub site with its full history. The last thing I do is to delete the local Git repository and clone again from the remote one on GitHub using the GitHub Desktop program as I am more comfortable using a GUI tool to keep an eye on changes (and I find I'm less likely to mess something up by accidentally mistyping something!)

In case you couldn't tell from the above examples I've also been looking at the source code for my old Brass 3 assembler project. I've been working on a little Z80 assembly project: running BBC BASIC on the Sega Master System, which has involved a lot of my old projects – Brass 3 to assemble the code, Cogwheel to test it, Emerson to handle PS/2 keyboard input and of course my experiences with running BBC BASIC on the TI-83 Plus calculator.

One thing I realised during all this was that Brass 3 had some problems running on 64-bit versions of Windows – the help application is completely non-functional, for example, and crashes silently to desktop. I dug into the code to fix it, only to find out that I'd already done so two years ago, and even got as far as rebuilding the installer package but then just forgot to upload it to the Internet. So, I'm very sorry for the delay, but I have now uploaded "Beta 14" to the Brass 3 page.

FirstPreviousNextLast RSSSearchBrowse by dateIndexTags