Intended audience
  • Static analysis enthusiasts
  • Logic programming enthusiasts
Origin
Mood
  • Liberated

Soufflé

I recently experimented with Soufflé Datalog for a certain work problem: comparing CI jobs’ declared dependencies with dependencies inferred via abstract interpretation (which is worth a separate post).

Recently, I saw the blog post Linear scan register allocation on SSA and noted that there was quite a lot of imperative implementation code, to the point where it might obstruct experimentating with the underlying ideas.

I suggested that the author could use Datalog instead to simplify the analyses, and we worked through the example of rewriting the liveness analysis in Soufflé, resulting in the blog post Liveness analysis with Datalog.

Rapid prototyping

As per its homepage, Soufflé promises to enable “rapid prototyping”:

Rapid-prototyping for your analysis problems with logic; enabling deep design-space explorations; designed for large-scale static analysis; e.g., points-to analysis for Java, taint-analysis, security checks.

And from my recent experience, I found that it delivered!

For example: When I wanted to implement the rule that, in Bazel, the target foo/... expands to all targets under foo, I just had to add one rule:

BazelLDep(repo, package, "...", $BazelTarget(repo, package, target)) :-
  BazelLDep(repo, package, target, _),
  target != "...".

It can be read as:

A target of the form <package>/... has a dependency on every target of the form <package>/<target>, as long as that target is not ....

And then this rule automatically combines with all the other rules about dependencies that I’d written in the analysis.

Who originally created the nodes of the form foo/...? Where are all the places that might consume them? I don’t know! I don’t care! I just let the Datalog engine figure it out.

The curse of brevity

In that blog post, the demonstration seems more blithe than lithe, because we end up showcasing basically two lines of Datalog:

live_out(b, v) :- block_succ(s, b), live_in(s, v).
live_in(b, v) :- (live_out(b, v) ; block_use(b, v)), !block_def(b, v).

Not even that: Our original formulation used only a single rule! We split it into two rules for clarity.

So what’s the big deal? Well, it’s an order-of-magnitude reduction compared to the original code!

def analyze_liveness
  order = post_order
  gen, kill = compute_initial_liveness_sets(order)
  live_in = Hash.new 0
  changed = true
  while changed
    changed = false
    for block in order
      block_live = block.successors.map { |succ| live_in[succ] }.reduce(0, :|)
      block_live |= gen[block]
      block_live &= ~kill[block]
      if live_in[block] != block_live
        changed = true
        live_in[block] = block_live
      end
    end
  end
  live_in
end

The Soufflé examples page has the same problem. They look trivial. Does this look like a substantial program to you?

.decl alias( a:var, b:var )
.output alias
alias(X,X) :- assign(X,_).
alias(X,X) :- assign(_,X).
alias(X,Y) :- assign(X,Y).
alias(X,Y) :- ld(X,A,F), alias(A,B), st(B,F,Y).

.decl pointsTo( a:var, o:obj )
.output pointsTo
pointsTo(X,Y) :- new(X,Y).
pointsTo(X,Y) :- alias(X,Z), pointsTo(Z,Y).

But when I first read that page, I was blown away — only because I already knew how much code it would take in a traditional programming language.

The following are hand-curated posts which you might find interesting.

Want to see more of my posts? Follow me on Twitter or subscribe via RSS.

Comments