Paludis has been using Subversion to manage its source, and for now it still does. We’ve been using git-svn for some months, and ciaranm recently agreed to migrate fully to Git.
To migrate the repository I used my old git-svn clone. This made some things a bit trickier, but it was both faster and nicer to pioto’s server. Things that had to be done:
- Remove ChangeLog and ChangeLog.old.bz2
- Remove metadata added by git-svn
- Rewrite authors and emails since I didn’t use an authors-file
- Remove empty commits (these are commits that only touched ChangeLog or ChangeLog.old.bz2)
This looks like a good task for git-filter-branch. I probably could have done everything in one go, but I decided to do it one step at a time.
Since filter-branch is mostly IO-bound, we’ll try to speed it up as much as possible:
$ sudo mount -t tmpfs -o size=100M none paludis/.git-rewrite
$ myrefs="0.4 0.6 0.8 0.20 0.24 0.26 ....."
The first task is easy:
$ git filter-branch -f --index-filter 'git update-index --remove ChangeLog' $myrefs
$ git filter-branch -f --index-filter 'git update-index --remove ChangeLog.old.bz2' $myrefs
To remove metadata created by git-svn I came up with:
tac | sed -n -e '1d' -e '/[^[:blank:]]/,$p' | tac
but dleverton came up with this perl one-liner, and I used it instead:
perl -ne 'print @blanks, $last and undef @blanks if defined $last; if (m/\S/) { $last = $_ } else { undef $last; push(@blanks, $_) }'
I put it in a file and ran:
$ git filter-branch -f --msg-filter ~/munge-commit-message $myrefs
Changing authors needed a script like the following (with proper mail-addresses):
case ${GIT_AUTHOR_NAME} in
ciaranm) n="Ciaran McCreesh" ; m="foo@bar.com" ;;
spb) n="Stephen P. Bennett" ; m="foo@bar.com" ;;
halcyon) n="Mark Loeser" ; m="foo@bar.com" ;;
allanonjl) n="John N. Laliberte" ; m="foo@bar.com" ;;
steev) n="Stephen Klimaszewski" ; m="foo@bar.com" ;;
kugelfang) n="Danny van Dyk" ; m="foo@bar.com" ;;
ferdy) n="Fernando J. Pereda" ; m="foo@bar.com" ;;
arachnist) n="Robert S. Gerus" ; m="foo@bar.com" ;;
drizzt) n="Timothy Redaelli" ; m="foo@bar.com" ;;
djm) n="David Morgan" ; m="foo@bar.com" ;;
pioto) n="Mike Kelly" ; m="foo@bar.com" ;;
piotr) n="Piotr Rak" ; m="foo@bar.com" ;;
rbrown) n="Richard Brown" ; m="foo@bar.com" ;;
baptux) n="Baptiste Daroussin" ; m="foo@bar.com" ;;
eroyf) n="Alexander Færøy" ; m="foo@bar.com" ;;
compnerd) n="Saleem Abdulrasool" ; m="foo@bar.com" ;;
omp) n="David Shakaryan" ; m="foo@bar.com" ;;
dleverton) n="David Leverton" ; m="foo@bar.com" ;;
peper) n="Piotr Jaroszyński" ; m="foo@bar.com" ;;
dev-zero) n="Tiziano Müller" ; m="foo@bar.com" ;;
zlin) n="Bo Ørsted Andresen" ; m="foo@bar.com" ;;
buildtest) n="Nightly Buildtest" ; m="foo@bar.com" ;;
flameeyes) n="Diego Pettenò" ; m="foo@bar.com" ;;
iluxa) n="Ilya Volynets" ; m="foo@bar.com" ;;
dercorny) n="Stefan Cornelius" ; m="foo@bar.com" ;;
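# Anyone not listed above keeps the name and email already recorded.
*) n="${GIT_AUTHOR_NAME}" ; m="${GIT_AUTHOR_EMAIL}" ;;
esac
export GIT_AUTHOR_NAME="$n"
export GIT_AUTHOR_EMAIL="$m"
export GIT_COMMITTER_NAME="$n"
export GIT_COMMITTER_EMAIL="$m"
git commit-tree "$@"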
and ran:
$ git filter-branch -f --commit-filter ~/rewrite-authors $myrefs
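Had an authors file been around (git-svn reads one via `--authors-file`, with lines of the form `username = Full Name <email>`), the long `case` statement could probably shrink to a lookup. A rough sketch of that idea follows; the helper name `lookup_author`, the sample file and the addresses in it are all made up for illustration.

```shell
# Hypothetical helper: print "Full Name|email" for svn user $2 found in the
# git-svn style authors file $1 ("username = Full Name <email>").
# Prints nothing when the user is not listed.
lookup_author() {
    awk -v user="$2" '
        {
            eq = index($0, " = ")
            if (eq == 0 || substr($0, 1, eq - 1) != user) next
            rest = substr($0, eq + 3)      # "Full Name <email>"
            lt = index(rest, " <")
            name = substr(rest, 1, lt - 1)
            mail = substr(rest, lt + 2)
            sub(/>$/, "", mail)            # drop the closing bracket
            print name "|" mail
            exit
        }' "$1"
}

# Made-up sample file; real addresses would go here.
authors=$(mktemp)
printf '%s\n' 'ciaranm = Ciaran McCreesh <ciaranm@example.org>' \
              'dev-zero = Tiziano Müller <dev-zero@example.org>' > "${authors}"

lookup_author "${authors}" dev-zero
```

Inside a commit filter the result could then be split with `IFS='|' read -r n m` before exporting the four variables as above.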
Removing empty commits requires a bit more foo:
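# Instead of creating a commit, print the rewritten ids of its parents so
# that its children get attached to them.
skip_commit()
{
    shift
    while [[ -n $1 ]] ; do
        shift
        map "$1"
        shift
    done
}
# Arguments are those of git commit-tree: <tree> -p <parent> ...
our_tree="$1"
# Despite the name, this holds the rewritten parent commit; the trailing
# colon below selects its tree.
our_parent_tree=$(map $3)
if [[ -z ${our_parent_tree} ]] || [[ -n $(git diff-tree ${our_tree} ${our_parent_tree}:) ]] ; then
    git commit-tree "$@"
else
    skip_commit "$@"
fi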
This one could have just tested whether the current tree is the same as our parent’s tree (that is, no changes were made by this commit):
[[ ${our_tree} == $(git rev-parse $(map $3):) ]]
But it wouldn’t have made a big difference, and I only noticed it while filter-branch was already running. I ran it with something like:
$ git filter-branch -f --commit-filter '. ~/empty-commits.bash' $myrefs
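The simpler tree comparison mentioned above can also be tried outside filter-branch, for example to see how many commits would be dropped before rewriting anything. This is only a sketch: the function name `is_empty_commit` is made up, it looks at the first parent only, and it treats a root commit as non-empty.

```shell
# Hypothetical helper: succeeds when commit $1 has exactly the same tree as
# its first parent, i.e. the commit changed nothing.
is_empty_commit() {
    # A root commit has no parent to compare against; treat it as non-empty.
    parent_tree=$(git rev-parse --verify --quiet "$1~1^{tree}") || return 1
    [ "$(git rev-parse "$1^{tree}")" = "${parent_tree}" ]
}
```

Something like `for c in $(git rev-list HEAD) ; do is_empty_commit "$c" && echo "$c" ; done` would then list the candidates on the current branch.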
There’s still stuff to do, like tags, adding scratch and probably converting the overlay, but the big thing is done. I think the history is already stable; that is, I won’t have to rewrite it again.
It is sitting in my home in bach and will be published soon.
Update: Re-tagging every paludis release was the last step. I thought it was going to be cumbersome and boring, but git makes this kind of thing pretty easy. Since ciaranm should sign the tags himself, I did:
$ git log --pretty=oneline origin/releases |
> sed -n -e '/^\([0-9a-f]\{40\}\) Tag\( release\)\? \(.*\)/s--\3|\1|Tag release \3-p' \
> > ~/paludis-git-tags
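After some hand editing of the file, creating the tags can be done with something like the following (assuming the file keeps the name|commit|message layout produced above):
$ while IFS='|' read -r name head msg ; do
> git tag -m "${msg}" "${name}" "${head}" ;
> done < ~/paludis-git-tags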
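The sed expression is dense enough that a dry run helps. The sketch below feeds it a few made-up log lines (the commit ids are just zero-padded numbers) and then prints the `git tag` commands instead of running them. The function names are hypothetical, and the loop assumes each line is split on `|` into name, commit and message.

```shell
# Made-up stand-in for `git log --pretty=oneline origin/releases`.
fake_release_log() {
    printf '%040d Tag release 0.4.0\n' 1
    printf '%040d Some unrelated commit\n' 2
    printf '%040d Tag 0.6.0\n' 3
}

# Same sed expression as above: emits name|commit|message for tag commits
# and drops every other line.
extract_tags() {
    sed -n -e '/^\([0-9a-f]\{40\}\) Tag\( release\)\? \(.*\)/s--\3|\1|Tag release \3-p'
}

# Dry run: print the git tag commands rather than executing them.
print_tag_commands() {
    while IFS='|' read -r name head msg ; do
        printf 'git tag -m "%s" %s %s\n' "${msg}" "${name}" "${head}"
    done
}

fake_release_log | extract_tags | print_tag_commands
```

Printing the commands first makes it easy to eyeball the list, or to hand it over to whoever is going to sign the tags.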
To check out an exact version (assuming ~/git/paludis is your repo; it doesn’t have to be a local one):
$ cd somewhere
$ git archive --format=tar --remote=~/git/paludis --prefix=paludis-0.4.0/ 0.4.0 | tar xf -
— ferdy