Tips to re-structure a project and clean up a git repository
- Create a backup of the project.
- Move binary files out of the repository (e.g. to the "Files" or "Documents" section).
- Remove binary and big files from the git history (see below).
- Split the repository into new ones (hw, gw, tst and sw).
- Set up sub-modules in the new repositories (if any).
- Re-organise the folders (if needed) and update hard-coded paths.
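The backup and split steps above could be sketched as follows. This is only one possible approach, assuming each part (hw, gw, tst, sw) lives in its own top-level folder of the old repository. `backup_repo` and `split_subdir` are made-up helper names, not git commands, and only the checked-out branch is carried over into the new repository.

```shell
# Hypothetical helper: keep a full copy (all branches, tags and other refs)
# of the repository as a bare mirror before touching anything.
backup_repo() {
    git clone --mirror "$1" "$2"
}

# Hypothetical helper: create a new repository at DEST whose root is the
# content of SUBDIR from SRC, keeping only the history that touches SUBDIR.
split_subdir() {
    src=$1 subdir=$2 dest=$3
    git clone "$src" "$dest" || return 1
    (
        cd "$dest" || exit 1
        # Drop the link to the old repository so nothing is pushed back by accident.
        git remote remove origin
        # FILTER_BRANCH_SQUELCH_WARNING skips the advisory pause of newer git versions.
        FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --prune-empty \
            --subdirectory-filter "$subdir" --tag-name-filter cat -- --all
    )
}
```

For example, `backup_repo old-project old-project-backup.git` followed by `split_subdir old-project sw sw-repo` (all three names are placeholders) would leave the old repository untouched.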
Removing big files from git history
1. List big files
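# Map every object SHA to its path
git rev-list --objects --all | sort -k 2 > allfileshas.txt
# List the blobs in the pack, largest first (columns: SHA, type, size, size in pack, offset)
git gc && git verify-pack -v .git/objects/pack/pack-*.idx | grep -E "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" | sort -k 3 -n -r > bigobjects.txt
# Join both lists into "SHA size path", starting from an empty result file
rm -f bigtosmall.txt
for SHA in $(cut -f 1 -d ' ' bigobjects.txt); do
echo $(grep "$SHA" bigobjects.txt) $(grep "$SHA" allfileshas.txt) | awk '{print $1,$3,$7}' >> bigtosmall.txt
done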
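A more compact alternative, as a sketch: `git cat-file --batch-check` can report the size of every object directly, which avoids the temporary files and keeps paths that contain spaces intact. This is a different technique from the verify-pack listing above, and `list_big_blobs` is a made-up helper name; the default of 20 entries is arbitrary.

```shell
# Hypothetical helper: print "size sha path" for the N largest blobs
# reachable from any ref (N defaults to 20, an arbitrary choice).
list_big_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        awk '$1 == "blob"' |
        cut -d ' ' -f 2- |
        sort -n -r |
        head -n "${1:-20}"
}
```

Run it from inside the repository, e.g. `list_big_blobs 10`. The size shown is the uncompressed blob size, not the size in the pack.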
2. Remove a file from git history
WARNING: this rewrites every commit that references the deleted file and every commit that descends from one, so their SHAs change!
git filter-branch -f --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch MY-BIG-DIRECTORY-OR-FILE' --tag-name-filter cat -- --all
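Since the rewrite is destructive, it may be worth rehearsing it on a throwaway clone first and comparing the branch tip before and after. The sketch below wraps the command above in a hypothetical helper, `purge_path`, which is not part of git. It passes the path through an environment variable (`PURGE_PATH`, also an invented name) so that paths containing spaces survive the quoting.

```shell
# Hypothetical helper (not part of git): remove one path from the history of
# all branches and tags, then print the branch tip before and after the rewrite.
purge_path() {
    before=$(git rev-parse HEAD) || return 1
    # FILTER_BRANCH_SQUELCH_WARNING skips the advisory pause of newer git versions.
    FILTER_BRANCH_SQUELCH_WARNING=1 PURGE_PATH="$1" git filter-branch -f --prune-empty \
        --index-filter 'git rm -rf --cached --ignore-unmatch -- "$PURGE_PATH"' \
        --tag-name-filter cat -- --all || return 1
    echo "HEAD before: $before"
    echo "HEAD after:  $(git rev-parse HEAD)"
}
```

If the two SHAs differ, the path was part of the history of the current branch and every clone of the old history is now out of date.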
3. Final clean-up
rm -rf .git/refs/original/ && git reflog expire --all --expire-unreachable=now && git gc --aggressive --prune=now && git repack -A -d && git prune
4. Check the result
du -sh
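Besides the size on disk, it can be useful to confirm that the path is really gone from every ref. A minimal sketch, assuming it is run from inside the repository; `path_in_history` is a made-up helper name. Note that `--all` also covers the refs/original/ backup refs left by filter-branch, so the check only passes once the final clean-up has been done.

```shell
# Hypothetical helper: succeeds if any commit reachable from any ref
# still touches the given path, fails otherwise.
path_in_history() {
    [ -n "$(git log --all --format=%H -- "$1" | head -n 1)" ]
}

# Example usage (MY-BIG-DIRECTORY-OR-FILE is the placeholder used in step 2):
#   path_in_history MY-BIG-DIRECTORY-OR-FILE && echo "still in history"
#   git count-objects -vH   # "size-pack" is the on-disk size of the packed objects
```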