#!/bin/sh
# Suppose you want to do blind reviewing of code (eg for job interview
# purposes). Unfortunately, the candidates' names and email addresses are
# stored on every commit! You probably want to assess each candidate's version
# control practices, so just `rm -rf .git` throws away too much information.
# Here's what you can do instead.

# Rewrite all commits to hide the author's name and email
for branch in `ls .git/refs/heads`; do
    # We may be doing multiple rewrites, so we must force subsequent ones.
    # We're throwing away the backups anyway.
    git filter-branch -f --env-filter '
        export GIT_AUTHOR_NAME="Anonymous Candidate"
        export GIT_AUTHOR_EMAIL="[email protected]"' $branch
done

# Delete the old commits
rm -rf .git/refs/original/

# Delete remotes, which might point to the old commits
for r in `git remote`; do git remote rm $r; done

# Your old commits will now no longer show up in GitK, `git log` or `git
# reflog`, but can still be found using `git show $commit-id`.
In some cases the result is not completely anonymous. For example:
If I run git pull after committing, the merge commit title will be "Merge branch 'master' of https://github.com/username/repository".
If the repository was created with git clone, the repository name still appears in the output of 'git reflog'.
Expiring the reflog takes care of the reflog entries:
git reflog expire --expire=90.days.ago --expire-unreachable=now --all
Publishing a summarized version:
#!/bin/sh
# Suppose you want to do blind reviewing of code (eg for job interview
# purposes). Unfortunately, the candidates' names and email addresses are
# stored on every commit! You probably want to assess each candidate's version
# control practices, so just `rm -rf .git` throws away too much information.
# Here's what you can do instead.
# Rewrite all commits to hide the author's name and email
for branch in `ls .git/refs/heads`; do
# We may be doing multiple rewrites, so we must force subsequent ones.
# We're throwing away the backups anyway.
git filter-branch -f --env-filter '
export GIT_AUTHOR_NAME="Anonymous Candidate"
export GIT_AUTHOR_EMAIL="[email protected]"
export GIT_COMMITTER_NAME="Anonymous Candidate"
export GIT_COMMITTER_EMAIL="[email protected]"
' $branch
done
# Delete the old commits
rm -rf .git/refs/original/
# Delete remotes, which might point to the old commits
for r in `git remote`; do git remote rm $r; done
# Expire reflog entries that point at commits no longer reachable from a branch
git reflog expire --expire=90.days.ago --expire-unreachable=now --all
# Your old commits will now no longer show up in GitK, `git log` or `git
# reflog`, but can still be found using `git show $commit-id`.
# Be aware that merge commit messages often include URLs hinting at the original author
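The closing comments note that the old commits can still be reached with `git show`. If they should not be recoverable at all, one option is to expire every reflog entry and then prune unreachable objects. This is only a sketch: it assumes the backups under refs/original and the remotes are already gone, and the function name purge_old_objects is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: drop every reflog entry, then delete objects that
# are no longer reachable from any ref. This is destructive, so run it
# only on a copy of the repository you are happy to lose history from.
purge_old_objects() {
    git reflog expire --expire=now --all
    git gc --prune=now --quiet
}
```

Run it from inside the repository after the rewrite. Afterwards `git show` on one of the old commit ids should fail.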
Almost perfect! But this script doesn't anonymize commits on tags. For that you have to replace the whole git filter-branch for loop with something like https://github.com/adamdehaven/change-git-author/blob/master/changeauthor.sh#L536, that is:
git filter-branch -f --env-filter '... exports ...' --tag-name-filter cat -- --branches --tags
Now yes!
(correction script taken from https://www.adamdehaven.com/blog/update-commit-history-author-information-for-git-repository/#instructions)
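Spelled out, the single-pass version might look roughly like the sketch below. The function name anonymise_identities and the address anonymous@example.com are placeholders of mine, not taken from the linked script. FILTER_BRANCH_SQUELCH_WARNING is set only to silence the deprecation notice that newer Git versions print before running filter-branch.

```shell
#!/bin/sh
# Hypothetical helper: rewrite author and committer on every branch and
# tag in one pass. --tag-name-filter cat keeps the tag names unchanged
# while moving the tags onto the rewritten commits.
anonymise_identities() {
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --env-filter '
        export GIT_AUTHOR_NAME="Anonymous Candidate"
        export GIT_AUTHOR_EMAIL="anonymous@example.com"
        export GIT_COMMITTER_NAME="Anonymous Candidate"
        export GIT_COMMITTER_EMAIL="anonymous@example.com"
    ' --tag-name-filter cat -- --branches --tags
}
```

As far as I know, annotated tags carry their own tagger line, which this filter does not touch, so it is worth checking those separately.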
To clean up the merge commit messages, I've used:
for branch in `ls .git/refs/heads`; do
git filter-branch -f --msg-filter 'sed "s/Merge pull request.*$/Merge pull request #xxx from anonymous_repo/g"' $branch
done
Also be aware of merge-branch commits; branch names sometimes include people's usernames:
for branch in `ls .git/refs/heads`; do
git filter-branch -f --msg-filter 'sed "s/Merge branch.*$/Merge branch anon into anon/g"' $branch
done
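Both message rewrites can probably be folded into one filter-branch pass, which avoids walking the history twice per branch. A rough sketch follows; the function name scrub_merge_messages is hypothetical, and the `^` anchors are my addition so that only lines starting with "Merge" are replaced.

```shell
#!/bin/sh
# Hypothetical helper: replace both kinds of merge message in one pass
# over all branches. The replacement texts are the ones used above.
scrub_merge_messages() {
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --msg-filter '
        sed -e "s/^Merge pull request.*$/Merge pull request #xxx from anonymous_repo/" -e "s/^Merge branch.*$/Merge branch anon into anon/"
    ' -- --branches
}
```

Other auto-generated messages (for example "Merge remote-tracking branch ...") are not covered and would need their own pattern.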
Thank you for this command. GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL are two more variables you need to clear.
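A quick way to check that no author or committer identity slipped through is to list every distinct name and email still present on branches and tags. The function name list_identities is made up for this sketch. It deliberately does not use --all, since that would also walk the refs/original backups if they have not been deleted yet.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct author and committer identity
# found on any branch or tag, one per line.
list_identities() {
    git log --branches --tags --format='%an <%ae>%n%cn <%ce>' | sort -u
}
```

After a successful rewrite this should print a single anonymous identity and nothing else.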