Bringing revsets to Git
Revsets are a declarative language from the Mercurial version control system. Most commands in Mercurial that accept a commit can instead accept a revset expression to specify one or more commits meeting certain criteria. The git-branchless suite of tools introduces its own revset language which can be used with Git.
- Try it out
- Existing Git syntax
- Better scripting
- Better graph view
- Better rebasing
- Better testing
- Prior work
- Related posts
Try it out
To try out revsets, install git-branchless, or see Prior work for alternatives.
Sapling SCM
While this post was still in draft, Sapling SCM was announced, which git-branchless is descended from in spirit. It’s discussed in Prior work as well.
Existing Git syntax
Git already supports its own revision specification language (see gitrevisions(7)). You may have already written e.g. HEAD~ to mean the immediate parent of HEAD.
However, Git’s revision specification language doesn’t integrate well with the rest of Git. You can write git log foo..bar to list the commits between foo and bar, but you can’t write git rebase foo..bar to rebase that same range of commits.
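For comparison, stock Git spells a range-limited rebase as two separate arguments to git rebase --onto rather than as a foo..bar expression. The following is a minimal sketch in a throwaway repository; the branch names dest, foo, and bar and the commit_file helper are made up for illustration.

```shell
#!/bin/sh
# Sketch: moving the commits in foo..bar with stock Git (throwaway repository).
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
# Identity for the throwaway commits (values are arbitrary).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# Hypothetical helper: create a file named $1 and commit it.
commit_file() {
  echo "$1" > "$1"
  git add "$1"
  git commit -q -m "add $1"
}

commit_file base
git branch dest          # where the range should end up
commit_file below-range
git branch foo           # exclusive lower bound of the range
commit_file in-range-1
commit_file in-range-2
git branch bar           # inclusive upper bound of the range

# The range syntax works for listing commits...
git log --format=%s foo..bar

# ...but the rebase takes the two endpoints as separate arguments.
git rebase -q --onto dest foo bar

# After the rebase, bar consists of dest plus the two in-range commits.
moved=$(git log --format=%s dest..bar)
printf '%s\n' "$moved"
```

Note that this moves the bar branch itself, which is one reason the two spellings aren't interchangeable.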
It can also be difficult to express certain sets of commits:
- You can only express contiguous ranges of commits, not arbitrary sets.
- You can’t directly query for the children of a given commit.
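As an illustration of the second point, one possible workaround for finding the children of a commit with stock Git is to scan the output of git rev-list --children. This is only a sketch in a throwaway repository; the commit messages and the side branch name are made up.

```shell
#!/bin/sh
# Sketch: list the direct children of a commit using only stock Git.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
# Identity for the throwaway commits (values are arbitrary).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git commit -q --allow-empty -m root
root=$(git rev-parse HEAD)
git commit -q --allow-empty -m child-1
child1=$(git rev-parse HEAD)
# A second child of the root commit, on its own branch.
git checkout -q -b side "$root"
git commit -q --allow-empty -m child-2
child2=$(git rev-parse HEAD)

# `git rev-list --children` prints each commit followed by its children,
# so the children of $root are the remaining fields on its line.
children=$(git rev-list --all --children |
  awk -v c="$root" '$1 == c { for (i = 2; i <= NF; i++) print $i }' |
  sort)
printf '%s\n' "$children"
```

The whole graph has to be walked just to answer a question about one commit, which is the kind of indirection the list above is getting at.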
git-branchless introduces a revset language which can be used directly via its git query command, or with its other commands, such as git smartlog, git move, and git test.
The rest of this article shows a few things you can do with revsets. You can also read the Revset recipes thread on the git-branchless discussion board.
Better scripting
Revsets can compose to form complex queries in ways that Git can’t express natively.
In git log, you could write this to filter commits by a certain author:
$ git log --author="Foo"
But negating this pattern is quite difficult; see the Stack Overflow question “equivalence of: git log --exclude-author?”.
With revsets, the same search can be straightforwardly negated with not:
$ git query 'not(author.name(Foo))'
It’s easy to add more filters to refine your query. To additionally limit the results to commits which touch a certain path and whose messages contain a certain string, you could write this:
$ git query 'not(author.name(Foo)) & paths.changed(path/to/file) & message(Ticket-123)'
You can express complicated ad-hoc queries in this way without having to write a custom script.
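For a sense of what such a custom script might look like, here is a rough stock-Git approximation of the query above. It's a sketch under assumptions: it treats author.name(Foo) as a substring match on the author name, it only walks history reachable from HEAD (which may not be the same set of commits that git query considers), and the sample commits and the commit_as helper are hypothetical.

```shell
#!/bin/sh
# Sketch: "not by Foo, touches path/to/file, message mentions Ticket-123"
# using stock Git plus awk (throwaway repository).
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# Hypothetical helper: commit a change to path/to/file as author $1 with message $2.
commit_as() {
  echo "$2" >> path/to/file
  git add path/to/file
  GIT_AUTHOR_NAME=$1 GIT_AUTHOR_EMAIL=author@example.com git commit -q -m "$2"
}

mkdir -p path/to
commit_as Foo 'Ticket-123: by Foo'
commit_as Bar 'Ticket-123: by Bar'
commit_as Bar 'Unrelated change'

# Git handles the path and message filters; awk handles the negated author,
# keeping only commits whose author name does not contain "Foo".
matches=$(git log --format='%H%x09%an' --grep='Ticket-123' -- path/to/file |
  awk -F '\t' 'index($2, "Foo") == 0 { print $1 }')
printf '%s\n' "$matches"
```

Each additional condition means deciding whether Git or the post-processing step should handle it, whereas the revset just gains another & clause.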
Better graph view
Git has a graph view available with git log --graph, which is a useful way to orient yourself in the commit graph. However, it’s somewhat limited in what it can render. There’s no way to filter commits to only those matching a certain condition.
git-branchless offers a “smartlog” command which attempts to show you only relevant commits. By default, it includes all of your local work up until the main branch, but not other people’s commits. Mine looks like this right now:
But you can also filter commits using revsets. To show only my draft work which touches the git-branchless-lib/src/git directory, I can issue this command:
Another common use-case might be to render the relative topology of branches in just this stack:
You can also render commits which have already been checked into the main branch, if so desired.
Better rebasing
Not only can you render the commit graph with revsets, but you can also modify it. Revsets are quite useful when used with “patch-stack” workflows, such as those used for the Git and Linux projects, or at certain tech companies practicing trunk-based development.
For example, suppose you have some refactoring changes to the file foo on your current branch, and you want to separate them into a new branch for review:
You can use revsets to select just the commits touching foo in the current branch:
Then use git move to pull them out:
$ git move --exact 'stack() & paths.changed(foo)' --dest 'main'
If you want to reorder the commits so that they’re at the base of the current branch, you can just add --insert:
$ git move --exact 'stack() & paths.changed(foo)' --dest 'main' --insert
Of course, you can use a number of different predicates to specify the commits to move. See the full revset reference.
Better testing
You can use revsets with git-branchless’s git test command to help you run (or re-run) tests on various commits. For example, to run pytest on all of your branches in parallel and cache the results, you can run:
$ git test run --exec 'pytest' --jobs 4 'branches()'
You can also use revsets to aid the investigation of a bug with git test. If you know that a bug was introduced between commits A and B, and has to be in a commit touching file foo, then you can use git test like this to find the first commit which introduced the bug:
$ git test run --exec 'cargo test' 'A:B & paths.changed(foo)'
This can be an easy way to skip commits which you know aren’t relevant to the change.
Versus git bisect
You can use git bisect and filter by paths, of course, but it may be more tedious. Note that, unlike git bisect, git test currently conducts a linear search, so it’s not the best choice for all cases. This will hopefully change in the future.
git test has several features which git bisect doesn’t offer, such as parallel testing, out-of-tree testing, and caching of test results.
Caching test results
git test will cache the results of the test command, so if you decide to expand the search set later, you don’t have to re-run the test command on commits you’ve already tested.
Prior work
This isn’t the first introduction of revsets to version control. Prior work:
- Of course, Mercurial itself introduced revsets. See the documentation here: https://www.mercurial-scm.org/repo/hg/help/revsets
- https://github.com/quark-zju/gitrevset: the immediate predecessor of this work. git-branchless uses the same back-end “segmented changelog” library (from Sapling SCM, then called Eden SCM) to manage the commit graph. The advantage of using revsets with git-branchless is that it integrates with several other commands in the git-branchless suite of tools.
- https://sapling-scm.com/: also an immediate predecessor of this work, as it originally published the segmented changelog library which gitrevset and git-branchless use. git-branchless was inspired by Sapling’s design, and has similar but non-overlapping functionality. See https://github.com/arxanas/git-branchless/discussions/654 for more details.
- https://github.com/martinvonz/jj: Jujutsu is a Git-compatible VCS which also offers revsets. git-branchless and jj have similar but non-overlapping functionality. It’s worth checking out if you want to use a more principled version control system but still seamlessly interoperate with Git repositories. I expect git-branchless’s unique features to make their way into Jujutsu over time.
Related posts
The following are hand-curated posts which you might find interesting.
Date | Title
---|---
19 Jun 2021 | git undo: We can do better
12 Oct 2021 | Lightning-fast rebases with git-move
19 Oct 2022 | Build-aware sparse checkouts
16 Nov 2022 | Bringing revsets to Git (this post)
05 Jan 2023 | Where are my Git UI features from the future?
11 Jan 2024 | Patch terminology
Want to see more of my posts? Follow me on Twitter or subscribe via RSS.