At the end, you use `git bisect`.
Sometimes, the difficulty begins only when you've identified the bad commit.
You may have to further bisect the space to find the culprit.
The following situation can occur in a compiler: something is changed (e.g. in the optimizer), and the commit breaks something, but you have no idea where. Obviously something is wrong with the optimization change, but it's not obvious from looking at the code. The compiler is self-hosting: it compiles itself as well as the library of the language, and then tests are run against everything. The optimization change miscompiled something in the compiler. That something then broke the compiler in a way that miscompiled something in a run-time function, which in turn broke something else.
To find the first miscompile, you can do the following binary search.
Have the compiler hash the names of all top-level definitions that it compiles across the system, say to a 64-bit value.
Then, in the code where the change was introduced, put an if/else switch: if the low N bits of the hash code equal a certain value, optimize with the new logic; otherwise use the old logic.
We start with N=1 (1 bit) and the value 0. All definitions whose hash code ends in a 0 bit are subject to the broken new logic; those ending in 1 are subject to the old logic.
Say the bug reproduces. Then we keep the 0 and raise N to 2 for a two-bit mask. We try 00: all hashes ending in 00 use the new optimization; those ending in 10 use the old one. This time the problem doesn't reproduce, so we switch to 10 (and can validate that the problem now reproduces). We raise N to 3 and follow the same procedure.
By doing this we reveal enough bits of the hash code to narrow it down to one definition (e.g. one function): when that specific function is subject to the compiler change, all hell breaks loose; otherwise not.
From there we can analyze it: see how that definition is compiled before and after the change, and what makes it break. It is still hard work from there, but much easier to deduce why the optimization change is wrong and possibly how to fix it.
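A minimal sketch of that selection switch, in Python for illustration (the hash function and names here are assumptions, not the actual TXR code):

```python
import hashlib

def def_hash(name: str) -> int:
    """Stable 64-bit hash of a top-level definition's name."""
    digest = hashlib.blake2b(name.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def use_new_logic(name: str, n_bits: int, value: int) -> bool:
    """Apply the suspect optimization only when the low n_bits of the
    name's hash equal value; everything else gets the old logic."""
    mask = (1 << n_bits) - 1
    return (def_hash(name) & mask) == value
```

Each build/test cycle reveals one more bit: if the bug reproduces with the current value, keep it and widen the mask; otherwise flip the newly tested bit. After roughly log2(number of definitions) rounds, only one definition still selects the new logic.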
I've successfully employed this technique several times on the TXR Lisp project to debug compiler issues.
git bisect is great when it works; but you will come across things that cannot be found with git bisect.
I've debugged things like this: a bug was introduced, but with no manifestation. Eventually, many commits later, some unrelated change triggers it. But this comes and goes. Some changes make the manifestation go away, and some changes make it reappear.
Git bisect is predicated on the bug not existing at the "good" end point and making a single appearance between that and the "bad" end of the range. It has an allowance for commits that aren't testable; you can skip those. If the bad commit is one of the skipped ones, I think it tells you so.
Try not to have any other kind of bug. :)
I used git bisect in anger for the first time recently and it felt like magic.
Background: We had two functions in the codebase with identical names and nearly identical implementations, the latter having a subtle bug. Somehow both were imported into a particular Python script, but the correct one had always overshadowed the incorrect one - that is, until an unrelated effort to apply code formatting standards to the codebase "fixed" the shadowing problem by removing the import of the correct function. Not exactly mind-bending - but we had looked at the change a few times over in GitHub while debugging and couldn't find a problem with it. Not until we knew for sure that was the commit causing the problem did we find the bug.
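A toy Python sketch of that failure mode (the function and the bug are made up, not from the actual codebase):

```python
def total(xs):
    # stand-in for the buggy import: subtly wrong implementation
    return sum(xs) - 1

def total(xs):  # noqa: F811 -- deliberately redefines (shadows) the buggy one
    # stand-in for the correct import: the later binding wins
    return sum(xs)

# While both bindings exist, the correct one shadows the buggy one and
# everything works; an automated cleanup that removes the "redundant"
# later binding silently reactivates the bug.
```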
I've used bisect a few times in my life. Most of the time, I already know which files or functions might have introduced a bug.
Looking at the history of specific files or functions usually gives a quick idea. In modern Git, you can search the history of a specific function.
    git log -L :func_name:path/to/a/file.c
You need to have a proper .gitattributes file, though.

Make sure you know about exit code 125 for your test script. You can use it in those terrible cases where the test can't tell, one way or another, whether the failure you seek happened, for example when there is an unrelated build problem.
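A minimal sketch of such a script for `git bisect run`, assuming a hypothetical make/run-tests setup (the command names are placeholders):

```python
SKIP = 125  # special exit status: tell `git bisect run` to skip this commit

def classify(build_ok: bool, tests_pass: bool) -> int:
    """Map a build/test outcome to a `git bisect run` exit status:
    0 = good commit, 1 = bad commit, 125 = can't be judged here."""
    if not build_ok:
        return SKIP  # unrelated build breakage: neither good nor bad
    return 0 if tests_pass else 1

# A wrapper invoked via `git bisect run ./check.py` would do roughly:
#   build_ok   = subprocess.run(["make"]).returncode == 0
#   tests_pass = build_ok and subprocess.run(["./run-tests"]).returncode == 0
#   sys.exit(classify(build_ok, tests_pass))
```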
I wrote a short post on this:
One place bisect shines is when a flaky test snuck in due to some race condition but you can’t figure out what. If you have to run a test 100000 times to be convinced the bug isn’t present, this can be pretty slow. Bisecting makes it practical to narrow in on the faulty commit, and with the right script you can just leave it running in the background for an hour.
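A sketch of such a bisect-run script for a flaky test (the run count and test command are assumptions):

```python
# Treat a commit as bad if the test fails even once in N runs. If the
# race fires with probability p per run, pick N so (1 - p)**N is tiny.
N = 200

def flaky_bug_present(run_once) -> bool:
    """run_once() returns the test's exit code; any nonzero run
    means the race condition is present at this commit."""
    return any(run_once() != 0 for _ in range(N))

# Under `git bisect run`, the script would then
#   sys.exit(1) if flaky_bug_present(lambda: subprocess.run(["./flaky-test"]).returncode)
#   else sys.exit(0)
```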
I recently used git bisect to help find the root cause of a bug in a fun little jam of mine (a music player/recorder written in Svelte - https://lets-make-sweet-music.com).
My scenario with the project was:
- no unit/E2E tests
- no error occurring, either from Sentry tracking or in the developer tools console
- many git commits to check through, as GitHub's dependabot alerts had been busy in the meantime
I would say git bisect was a lifesaver - I managed to trace the error to my attempt to replace a file I had with the library I had extracted from it (http://github.com/anephenix/event-emitter).
It turns out that the file had implemented a feature that I hadn't ported to the library (the ability to attach multiple event names that call the same function).
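A rough sketch of that feature, in Python for illustration (the actual library is JavaScript; the names here are illustrative only):

```python
class EventEmitter:
    """Tiny emitter where .on() accepts one event name or a list of them."""

    def __init__(self):
        self._handlers = {}

    def on(self, events, handler):
        # Accept a single name or a list/tuple of names for one handler.
        names = events if isinstance(events, (list, tuple)) else [events]
        for name in names:
            self._handlers.setdefault(name, []).append(handler)

    def emit(self, event, *args):
        for handler in self._handlers.get(event, []):
            handler(*args)
```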
I think the other thing that helps is to keep git commits small, so that when you do discover the commit that breaks the app, you can easily find the root cause among the small number of files/code that changed.
Where it becomes more complex is when the root cause of the error requires evaluating not just one component that can change (in my case a frontend SPA), but also other components like the backend API, as well as the data in the database.
> People rant about having to learn algorithmic questions for interviews. I get it — interview system is broken, but you ought to learn binary search at least.
Well, the example of git bisect tells you that you should know of the concept of binary search, but it's not a good argument for having to learn how to implement binary search.
Also just about any language worth using has binary search in the standard library (or as a third party library) these days. That's saner than writing your own, because getting all the corner cases right is tricky (and writing tests so they stay right, even when people make small changes to the code over time).
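Python is one example: binary search ships in the standard library as the `bisect` module, so a correct sorted-list lookup is a few lines:

```python
from bisect import bisect_left

def index_of(sorted_xs, x):
    """Index of x in an already-sorted list, or -1 if absent."""
    i = bisect_left(sorted_xs, x)  # leftmost insertion point for x
    return i if i < len(sorted_xs) and sorted_xs[i] == x else -1
```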
Git has some really good tools for searching code and debugging. A few years ago I wrote a blog post about them, including bisect, log -L, log -S and blame. You can see it and the discussion here: https://news.ycombinator.com/item?id=39877637
`git-bisect` is legit if you have to do some archaeological digging through the history. Though there is the open question of how the git commit history is maintained: squash-and-merge versus retaining all history. With squash-and-merge you're looking at the whole merged pull request, whereas with full history you can find the true code-level inflection point.
Honestly, after 20 years in the field: optimising the workflow for when you can already reliably reproduce the bug seems misapplied because that's the part that already takes the least amount of time and effort for most projects.
When I learned about git bisect I thought it was a little uppity - something I would never use in a practical scenario, working on large code bases. However, sometimes a bug pops up and we don't know when it started. We use git bisect not to place blame on a person, but to figure out the last point at which the bug wasn't there, so we know what code introduced it. Yes, clean code helps. Sometimes git bisect is really nice to have.
I agree with the post.
I also think that, typically, if you have to resort to bisect you are probably in the wrong place. You should have found the bug earlier; if you don't even know where the bug came from, then:
- your test coverage isn't sufficient
- your tests are probably not actually testing what you believe they do
- your architecture is too complex for you
To be clear though I do include myself in this abstract "you".
Wow and here I was doing this manually all these years.
git bisect gets interesting when API signatures change over a history - when this happens, I find myself writing version-checking facades to invoke the "same" code in whatever way is legal at each commit.
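One way to sketch such a facade in Python, assuming the signature change added a keyword parameter (the function names are hypothetical):

```python
import inspect

def call_compat(load_config, path):
    """Invoke load_config with whichever signature this checkout has:
    newer revisions added a `strict` keyword, older ones did not."""
    if "strict" in inspect.signature(load_config).parameters:
        return load_config(path, strict=False)  # newer API
    return load_config(path)                    # older API
```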
'git bisect run' is probably one of the most important software tools ever.
Binary searching your commit history and using version control software to automate the process just seems so...obvious?
I get that author learned a new-to-him technique and is excited to share with the world. But to this dev, with a rapidly greying beard, the article has the vibe of "Hey bro! You're not gonna believe this. But I just learned the Pope is catholic."
I’ve used bisect a couple of times, but really it’s a workaround for having a poor process: automated unit tests and CI/CD should have caught it first.
It’s still very satisfying to watch run though, especially if you write a script that it can run automatically (based on the existing code) to determine if it’s a good or bad commit.
> the OG tool `git`
This phrase immediately turned the rest of my hair gray. I'm old enough to still think of Git as the "new" version control system, having survived CVS and Subversion before it.
I hardly think binary search is an unknown algorithm even by beginner standards for someone from a completely different field
Surely everyone has a CI pipeline that won't allow merges with failing tests?
Git bisect was an extremely powerful tool when I worked in a big-ball-of-mud codebase that had no test coverage and terrible abstractions which made it impossible to write meaningful tests in the first place. In that codebase it was far easier to find a bug by finding the commit it was introduced in - simply because it was impossible to reason through the codebase otherwise.
In any high quality codebase I’ve worked in, git bisect has been totally unnecessary. It doesn’t matter which commit the bug was introduced in when it’s simple to test the components of your code in isolation and you have useful observability to instruct you on where to look and what use inputs to test with.
This has been my experience working on backend web services - YMMV wildly in different domains.