Patch terminology
Intended audience |
|
Origin |
|
Mood | Investigative. |
Methodology
Q: I am doing research for a source control project, can you answer what the following nouns mean to you in the context of source control, if anything? (Ordered alphabetically)
- a “change”
- a “commit”
- a “patch”
- a “revision”
Results
P1 (Google, uses Piper + CLs):
- Change: a difference in code. What isn’t committed yet.
- Commit: a cl. Code that is ready to push and has a description along with it
- Patch: a commit number that that someone made that may or may not be pushed yet […] A change that’s not yours
- Revision: a change to a commit?
P2 (big tech, uses GitLab + MRs):
- Change: added/removed/updated files
- Commit: a group of related changes with a description
- Patch: a textual representation of changes between two versions
- Revision: a version of the repository, like the state of all files after a set of commits
P3 (Google, uses Fig + CLs):
- Change: A change to me is any difference in code. Uncommitted to pushed. I’ve heard people say I’ve pushed the change.
- Commit: A commit is a saved code diff with a description.
- Patch: A patch is a diff between any two commits how to turn commit a into into b.
- Revision: Revisions idk. I think at work they are snapshots of a code base so all changes at a point in time.
P4 (Microsoft, uses GitHub + PRs):
- Change: the entire change I want to check into the codebase this can be multiple commits but it’s what I’m putting up for review
- Commit: a portion of my change
- Patch: a group of commits or a change I want to use on another repo/branch
- Revision: An id for a group of commits or a single commit
P5 (big tech, uses GitHub + PRs):
- Change: your update to source files in a repository
- Commit: description of change
- Patch: I don’t really use this but I would think a quick fix (image, imports, other small changes etc)
- Revision: some number or set of numbers corresponding to change
Remarks
Take-aways:
- Change: People largely don’t think of a “change” as an physical object, rather just a diff or abstract object.
- It can potentially range from uncommitted to committed to pushed (P1–P5).
- Unlike others, P4 thinks of it as a larger unit than a commit (more like a “review”), probably due to the GitHub PR workflow.
- Commit: Universally, commits are considered to have messages. However, the interpretation of a commit as a snapshot vs diff appears to be implicit (compare P2’s “commit” vs “revision”).
- Patch: Split between interpretations:
- Either it represents a diff between two versions of the code (P2, P3).
- Or it’s a higher-level interpretation of a patch as a transmissible change. Particularly for getting a change from someone else (P1), but can also refer to a change that you want to use on a different branch (P4).
- P5 merely considers a “patch” to be a “small fix”, which is also a generally accepted meaning, although a little imprecise in terms of source control (refers to the intention of the patch, rather than the mechanics of the patch itself).
- Revision: This is really interesting. The underlying mental models are very different, but the semantic implications end up aligning, more so than for the term “commit”!
- P1: Not a specific source control term, just “the effect of revising”.
- P2, P3: Effect of “applying all commits”. This implies that they consider “commits” as diffs and “revisions” as snapshots.
- P4, P5, Some notions that it’s specifically the identifier of a change/commit. It’s something that you can reference or send to others.
- Surprisingly to me, P2–P5 actually all essentially agree that “revision” means a snapshot of the codebase. The mental models are quite different (“accumulation of diffs” vs “stable identifier”) but they refer to the same ultimate result: a specific state of the codebase (…or a way to refer to it — what’s in a name?). This is essentially the opposite of “commit”, where everyone thinks that they agree on what they are, but they’re actually split — roughly evenly? — into snapshot vs diff mental models.
Conclusions
Conclusions for jj:
- We already knew that “change” is a difficult term, syntactically speaking. It’s also now apparent that it’s semantically unclear. Only P4 thought of it as a “reviewable unit”, which would probably most closely match the jj interpretation. We should switch away from this term.
- People are largely settled on what “commits” are in the ways that we thought.
- There are two main mental models, where participants appear to implicitly consider them to be either snapshots or diffs, as we know.
- They have to have messages according to participants (unlike in jj, where a commit/change may not yet have a message).
- It’s possible this is an artifact of the Git mental model, rather than fundamental. We don’t see a lot of confusion when we tell people “your commits can have empty messages”.
- I think the real implication is that the set of changes is packaged/finalized into one unit, as opposed to “changes”, which might be in flux or not properly packaged into one unit for publishing/sharing.
- Half of respondents think that “patch” primarily refers to a diff, while half think that it refers to a transmissible change.
- In my opinion, the “transmissible change” interpretation aligns most closely with jj changes at present. In particular, you put those up for review and people can download them.
- I also think the “diff” interpretation aligns with jj interpretation (as you can rebase patches around, and the semantic content of the patch doesn’t change); however, there is a great deal of discussion on Discord suggesting that people think of “patches” as immutable, and this doesn’t match the jj semantics where you can rebase them around (IIUC).
- Overall, I think “patch” is still the best term we have as a replacement for jj “changes” (unless somebody can propose a better one), and it’s clear that we should move away from “change” as a term.
- “Revision” is much more semantically clear than I thought it was. This means that we can adopt/coopt the existing term and ascribe the specific “snapshot” meaning that we do today.
- We already do use “revision” in many places, most notably “revsets”. For consistency, we likely want to standardize “revision” instead of “commit” as a term.
Related posts
The following are hand-curated posts which you might find interesting.
Date | Title | |
---|---|---|
19 Jun 2021 | git undo: We can do better | |
12 Oct 2021 | Lightning-fast rebases with git-move | |
19 Oct 2022 | Build-aware sparse checkouts | |
16 Nov 2022 | Bringing revsets to Git | |
05 Jan 2023 | Where are my Git UI features from the future? | |
11 Jan 2024 | (this post) | Patch terminology |
Want to see more of my posts? Follow me on Twitter or subscribe via RSS.