rebase - rewriting history
Rebase is one of the most powerful features of git, and probably the most dangerous one too. Because it rewrites history, you have to make sure the history you write makes sense and doesn't lose anything. And when you rewrite history, you risk breaking the repository for other users. So first, a warning, and then a discussion of why you should rebase whenever you pull.
CAUTION: Once you push to a remote repository, other users may be using the results. If you then rebase, you are rewriting history and you will break the commit pointers for the remote users. It's best to never rebase once you've pushed to a remote repository. If you do have to do this, coordinate with other users so that they can deal with the resulting issues.
rebase - the dangerous version:
If you happen to actually lose some things,
git reflog
might help you recover them.
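As a minimal sketch of that recovery, the following script loses a commit on purpose in a throwaway repository and then gets it back through the reflog. The repository, file name, and commit messages are made up for the illustration.

```shell
#!/bin/sh
# Sketch only: a temporary repository with two hypothetical commits.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > notes.txt
git add notes.txt
git commit -q -m "first"
echo two >> notes.txt
git commit -q -a -m "second"
lost=$(git rev-parse HEAD)

# "Lose" the second commit: the branch no longer points at it.
git reset -q --hard HEAD~1

# The reflog still remembers every position HEAD has been at.
git reflog

# HEAD@{1} is where HEAD pointed before the reset; move the branch back there.
git reset -q --hard "HEAD@{1}"
git log --format=%s
```

In real use you would read the reflog output, pick the entry that still contains your work, and reset or cherry-pick from it.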
Anyway, back to rebase. Imagine you have a bunch of commits you want to publish, but the first one on your pile needs a tweak in the comment.
So you have:
master -> A -> B -> C
master is what's in the github repository, and A, B and C are your local commits. A needs some tweaking. Easy:
- Tweak what you want, and write it how you'd like it
- Commit it; this will give you D
- Rewrite history up to A using interactive rebase:
git rebase -i HEAD~4
(or you could use git rebase -i A^, where A is the SHA1 of commit A; the ^ is needed because rebase lists only the commits after the one you name)
This should open your favorite editor with this:
pick A Item...: the comment for A
pick B Item...: the comment for B
pick C Item...: the comment for C
pick D blah
Edit this list and reorder it like this:
pick A
squash D
pick B
pick C
This will squash (merge) commits A and D into one commit, and leave the rest as is, which is exactly what we want!
Because you are squashing commits, git will ask you for the commit message of the newly combined commit. Usually you'll want to keep only the first message, but the choice is yours.
Now you just have to push, and nobody will ever know about the original version of A.
Again: don't do this if you've already pushed A to a remote repository.
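The same reordering can be done without editing the list by hand. This is a sketch of a variant, not the exact steps above: it records the tweak with git commit --fixup (so D's message is discarded instead of being offered for editing) and lets git rebase --autosquash build the todo list. The repository and file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: a throwaway repository with hypothetical commits A, B and C.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "base"
for name in A B C; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "$name"
done

# The tweak to A, recorded as a fixup commit aimed at A (this plays the role of D).
echo "A, tweaked" > A.txt
git add A.txt
git commit -q --fixup=HEAD~2

# --autosquash arranges the todo list as: pick A, fixup D, pick B, pick C.
# GIT_SEQUENCE_EDITOR=true accepts that list without opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~4

git log --format=%s
```

Use squash in the interactive list, as described above, when you also want the chance to edit the combined commit message.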
rebase - why it should be the default
Let's consider another common situation. You've done a "git checkout master" and have been developing and committing away. When you first started, origin/master (out on github) was at commit "c". You've added d, e and f.
master* a - b - c - d - e - f ..
origin/master a - b - c
However, while you were working on d, e and f, someone else has been working and pushed o, p and q out to github.
master* a - b - c - d - e - f ..
origin/master a - b - c (your local copy)
origin/master a - b - c - o - p - q (out on github - due to another developer "push")
When you then issue "git pull", git has to figure out how to combine your commits d, e and f with someone else's o, p and q. The default behavior is to merge origin/master into your master.
master* a - b - c - d - e - f - opq (merge commit)
/ | \
origin/master a - b - c - o - p - q (out on github - due to another developer "push")
Then you push, and end up with:
master* a - b - c - d - e - f - opq
origin/master a - b - c - d - e - f - opq
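The end result is that the remote history is no longer a simple continuation of a - b - c - o - p - q: it now carries your commits plus an extra merge commit.
The other way to do this is to tell git to use rebase as the strategy: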
git pull --rebase origin master
That tells git to rebase your local master branch onto origin/master during the pull. Your commits are replayed on top of q as new commits d', e' and f':
master* a - b - c - o - p - q - d' - e' - f'
origin/master a - b - c - o - p - q
Now when you push, you simply append your work to the remote repository's history, with no merge commit. This is what we really want in most cases, because it keeps the remote history as stable as possible. Rather than having to remember to type --rebase every time you pull, the thing to do is to make it the default behavior.
- Set it globally, so any new branch will automatically be configured to rebase
- And set it for each branch you currently have:
git config --global branch.autosetuprebase always
git config branch.master.rebase true
git config branch.Release01x01.rebase true
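To see the effect, here is a small simulation of the scenario above, assuming two hypothetical working copies ("mine" and "theirs") and a temporary bare repository standing in for github. It is only a sketch; the commit labels match the diagrams, but each commit simply adds one small file.

```shell
#!/bin/sh
# Sketch only: everything lives in a temporary directory.
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# Helper: create one commit that adds a file named after its label.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

# The shared repository (stands in for github).
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

# My working copy: a - b - c, pushed to the shared repository.
git init -q mine
cd mine
git symbolic-ref HEAD refs/heads/master
commit a; commit b; commit c
git remote add origin "$work/remote.git"
git push -q -u origin master

# Another developer pushes o, p and q.
cd "$work"
git clone -q remote.git theirs
cd theirs
commit o; commit p; commit q
git push -q origin master

# Meanwhile I commit d, e and f locally, then pull with --rebase.
cd "$work/mine"
commit d; commit e; commit f
git pull -q --rebase origin master

# History is a straight line: my commits sit on top of q, with no merge commit.
git log --format=%s
git rev-list --merges --count HEAD
```

Running the same script with a plain git pull instead would leave one merge commit in the log.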
Bisect - finding a commit that broke something
Git offers a really powerful tool: bisection!
When you have something that has changed, and you want to know which commit did it, then you'll need bisect.
Here is an example I used today, but I've used it many times to find the root cause of a failing unit test.
Foswiki:Main.MichaelDaum found out that his NatSkinPlugin was missing the dependencies field in the topic. Normally, this is just a matter of adding a %$DEPENDENCIES% inside the plugin's topic, thus the idea was to find out when this was lost.
So bisect!
- First, tell git you want to bisect:
git bisect start
- Then, tell it your current code is bad:
git bisect bad
- Pick a checkout where the code was known to be good. Here it was tricky, so we simply picked the first checkin of this plugin:
PAGER=tail git log NatSkinPlugin
A cleverer way to do this is:
git rev-list --topo-order HEAD NatSkinPlugin|tail -1
- Check out this code (here both commands returned b971f71abc07e398ff6e0449991da21a2aacc263 as the revision on my git):
git checkout b971f71abc07e398ff6e0449991da21a2aacc263
grep -r %\$DEPENDENCIES% NatSkinPlugin/
git bisect good
This will output some text: the number of revisions left to test and the one it is currently testing, and it will check out the revision that needs testing. Rather than testing each revision by hand, let git run the check for you:
git bisect run grep -r %\$DEPENDENCIES% NatSkinPlugin/
This should output the information about the first bad commit:
a81d4f3479b6e3fd2dc2934a90c1a0f1985e9220 is first bad commit
commit a81d4f3479b6e3fd2dc2934a90c1a0f1985e9220
Author: MichaelDaum
Date: Thu Jan 8 10:52:53 2009 +0000
Item697: reverted SYSTEMWEB to TWIKIWEB for foswiki/compat plugins
git-svn-id: http://svn.foswiki.org/trunk@1871 0b4bb1d4-4e5a-0410-9cc4-b2b747904278
:040000 040000 c862145efc23fce6dc7f88ca2ef03389b0482ea5 b719f8955685d5e9837af30c31696ae567493e63 M NatSkinPlugin
You may then review that NatSkinPlugin:a5792327056a was indeed the bad commit.
A couple of additional hints:
- You can run a unit test to determine whether the code fails during the bisect process. However, to issue the bisect good or bad command you must be at the top of the checkout. Use a local script that pops to the top of the checkout, issues bad or good, and returns for the next test run.
- bisect won't deal well with submodule-type projects or separate pseudo-installed extensions. You would need to issue the bisect commands within each repository.
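The whole bisect cycle can be tried safely on a throwaway repository. The sketch below builds eight hypothetical commits to a made-up topic.txt, drops the %$DEPENDENCIES% marker in the fifth, and lets git bisect run find it. It relies on grep's exit status: 0 (found) marks a revision good, 1 (not found) marks it bad.

```shell
#!/bin/sh
# Sketch only: a temporary repository where commit 5 loses the marker.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo 'deps: %$DEPENDENCIES%' > topic.txt
git add topic.txt
git commit -q -m "commit 1"

i=2
while [ "$i" -le 8 ]; do
    # The regression: commit 5 overwrites the line holding the marker.
    if [ "$i" -eq 5 ]; then echo "deps: (lost)" > topic.txt; fi
    echo "change $i" >> topic.txt
    git commit -q -a -m "commit $i"
    i=$((i + 1))
done

# Start with HEAD as bad and the very first commit as good.
first=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$first" > /dev/null

# Let git test each candidate revision automatically.
git bisect run grep -q -F '%$DEPENDENCIES%' topic.txt > /dev/null

# refs/bisect/bad now points at the first bad commit.
culprit=$(git rev-parse refs/bisect/bad)
git bisect reset > /dev/null
git log -1 --format=%s "$culprit"
```

For a unit test, replace the grep command with a script that exits 0 on success and non-zero on failure; git bisect run treats exit status 125 as "skip this revision".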
Managing local content in a git based test system
One issue when running from a git clone is that changes to the wiki (registered users, test topics, etc.) are all mixed in with git-managed content. It's rather easy to make a couple of mistakes:
- accidentally committing local content to the foswiki repository
- accidentally cleaning too aggressively and losing all your local content.
Here is a way to save all local content in an "extension" so that it survives git clean and is less likely to be committed upstream.
Creating the initial test system.
Assume that your running web checkout is named foswiki and located under /var/www. These steps install the foswiki distribution and create a new starter extension from the template contrib.
cd /var/www
git clone git@github.com:foswiki/distro foswiki
cd foswiki/core
./pseudo-install developer
./pseudo-install EmptyContrib
cd ..
core/create_new_extension.pl LocalContentContrib
<.. answer the questions to create your new contrib ..>
Populate the extension with your local content
I had already organized my local content to separate it from foswiki topics:
- Copied the "Main" web to a local "Usersweb" to use for registering users
- Copied the "Sandbox" web to "Litterbox" for test topics
- Created other webs as desired.
To populate the LocalContentContrib:
- Move your test webs from core/data to the LocalContentContrib/data directory.
- Move the test pub directories as well.
- For empty or new pub directories, create the directory and add a .keep file.
- Move your data/.htpasswd file to the LocalContentContrib/data directory.
- Move your lib/LocalSite.cfg file to the LocalContentContrib/lib directory.
- Other files you might include: bin/LocalLib.cfg ...
- Create a MANIFEST file.
- For each new web, the MANIFEST only needs the "WebPreferences.txt" file; pseudo-install will link at the directory level.
- Pub directories need special treatment, as git doesn't track empty directories, so add pub/<directory>/.keep to the MANIFEST.
- Add in .htpasswd, LocalSite.cfg, and any other individual files you moved.
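The .keep step is easy to forget for a directory or two, so it can help to script it. The following is a rough sketch that works on a temporary copy; the web and topic directory names are hypothetical, and it only lists the placeholder paths rather than writing MANIFEST lines for you.

```shell
#!/bin/sh
# Sketch only: the directory names below are stand-ins for your own webs.
set -e
contrib=$(mktemp -d)/LocalContentContrib
mkdir -p "$contrib/pub/Usersweb" "$contrib/pub/Litterbox/TestTopic"
echo "attachment" > "$contrib/pub/Litterbox/TestTopic/file.txt"
cd "$contrib"

# Drop a .keep file into every empty directory under pub/ so git will track it.
# (-empty is supported by GNU and BSD find, but is not part of POSIX.)
find pub -type d -empty | while IFS= read -r dir; do
    touch "$dir/.keep"
done

# List the placeholder files; these are the paths to add to the MANIFEST.
find pub -name .keep | sort
```

Directories that already hold attachments are left alone, since git tracks them through their files.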
Configure Foswiki to use your content.
Some of the useful settings include:
$Foswiki::cfg{UsersWebName} = 'Usersweb';
$Foswiki::cfg{SandboxWebName} = 'Litterbox';
Create an empty repository for your content
This assumes you will create a local git repository in the /home/git directory. Create a completely empty repository. Content will come later from a push.
cd /home/git
mkdir LocalContentContrib
cd LocalContentContrib
git init --bare
Initialize the repository in your www directory and make it ready to push.
cd /var/www/foswiki/LocalContentContrib
git init .
git add .
git commit -a -m "Initial commit for local content"
Link the repositories together and push your content (still in /var/www/foswiki/LocalContentContrib):
git remote add origin file:///home/git/LocalContentContrib
git push -u origin master
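To rehearse the whole sequence before touching /home/git or /var/www, here is a sketch that runs it in temporary directories. The paths and the sample file are placeholders. It uses git push -u, which creates the branch in the empty repository and records it as the upstream in one step; origin/master cannot be named as an upstream before the first push has created it.

```shell
#!/bin/sh
# Sketch only: temporary directories stand in for /home/git and /var/www/foswiki.
set -e
base=$(mktemp -d)
mkdir -p "$base/git/LocalContentContrib"
mkdir -p "$base/www/LocalContentContrib/data/Litterbox"

# The completely empty bare repository that will receive the content.
git init -q --bare "$base/git/LocalContentContrib"

# The working repository holding the local content.
cd "$base/www/LocalContentContrib"
echo "placeholder" > data/Litterbox/WebPreferences.txt
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git add .
git commit -q -m "Initial commit for local content"

# Link the two and push; -u sets origin/master as the upstream of master.
git remote add origin "file://$base/git/LocalContentContrib"
git push -q -u origin master

# The bare repository now holds the commit, and master tracks origin/master.
git --git-dir="$base/git/LocalContentContrib" log --format=%s master
git rev-parse --abbrev-ref "master@{upstream}"
```

After this, a plain git push or git pull in the working repository uses the local bare repository by default.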