Lithe, less analysis with Datalog
Intended audience |
|
Origin |
|
Mood |
|
Table of contents:
Soufflé
I recently experimented with Soufflé Datalog for a certain work problem: comparing CI jobs’ declared dependencies with dependencies inferred via abstract interpretation (which is worth a separate post).
Recently, I saw the blog post Linear scan register allocation on SSA and noted that there was quite a lot of imperative implementation code, to the point where it might obstruct experimentating with the underlying ideas.
I suggested that the author could use Datalog instead to simplify the analyses, and we worked through the example of rewriting the liveness analysis in Soufflé, resulting in the blog post Liveness analysis with Datalog.
Rapid prototyping
As per its homepage, Soufflé promises to enable “rapid prototyping”:
Rapid-prototyping for your analysis problems with logic; enabling deep design-space explorations; designed for large-scale static analysis; e.g., points-to analysis for Java, taint-analysis, security checks.
And from my recent experience, I found that it delivered!
For example: When I wanted to implement the rule that, in Bazel, the target
foo/...
expands to all targets under foo
, I just had to add one rule:
BazelLDep(repo, package, "...", $BazelTarget(repo, package, target)) :-
BazelLDep(repo, package, target, _),
target != "...".
It can be read as:
A target of the form
<package>/...
has a dependency on every target of the form<package>/<target>
, as long as that target is not...
.
And then this rule automatically combines with all the other rules about dependencies that I’d written in the analysis.
Who originally created the nodes of the form foo/...
? Where are all the places that might consume them? I don’t know! I don’t care! I just let the Datalog engine figure it out.
The curse of brevity
In that blog post, the demonstration seems more blithe than lithe, because we end up showcasing basically two lines of Datalog:
live_out(b, v) :- block_succ(s, b), live_in(s, v).
live_in(b, v) :- (live_out(b, v) ; block_use(b, v)), !block_def(b, v).
Not even that: Our original formulation used only a single rule! We split it into two rules for clarity.
So what’s the big deal? Well, it’s an order-of-magnitude reduction compared to the original code!
def analyze_liveness
order = post_order
gen, kill = compute_initial_liveness_sets(order)
live_in = Hash.new 0
changed = true
while changed
changed = false
for block in order
block_live = block.successors.map { |succ| live_in[succ] }.reduce(0, :|)
block_live |= gen[block]
block_live &= ~kill[block]
if live_in[block] != block_live
changed = true
live_in[block] = block_live
end
end
end
live_in
end
The Soufflé examples page has the same problem. They look trivial. Does this look like a substantial program to you?
.decl alias( a:var, b:var )
.output alias
alias(X,X) :- assign(X,_).
alias(X,X) :- assign(_,X).
alias(X,Y) :- assign(X,Y).
alias(X,Y) :- ld(X,A,F), alias(A,B), st(B,F,Y).
.decl pointsTo( a:var, o:obj )
.output pointsTo
pointsTo(X,Y) :- new(X,Y).
pointsTo(X,Y) :- alias(X,Z), pointsTo(Z,Y).
But when I first read that page, I was blown away — only because I already knew how much code it would take in a traditional programming language.
Related posts
The following are hand-curated posts which you might find interesting.
Date | Title | |
---|---|---|
24 Jul 2017 | Null-tracking, or the difference between union and sum types | |
23 Dec 2017 | Why LINQ syntax differs from SQL, list comprehensions, etc. | |
20 Apr 2020 | Monotonicity is a halfway point between mutability and immutability | |
14 Jun 2022 | Functions are constants too | |
17 Jun 2023 | Encoding ML-style modules in Rust | |
14 Aug 2025 | (this post) | Lithe, less analysis with Datalog |
Want to see more of my posts? Follow me on Twitter or subscribe via RSS.