Declarative Program Analysis and Optimization

CS294-260, Spring 2024

babble: Learning Better Abstractions with E-Graphs and Anti-Unification

Overview

This paper tackles Library Learning (finding common abstractions within a corpus of programs) with e-graphs. Because extracting the smallest equivalent program is something e-graphs do well (while evaluating the quality of abstractions by other measures is genuinely difficult), using e-graphs for library learning reframes the problem as one of compression: finding the abstractions that best reduce the total size of the refactored corpus plus the new library.
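As a rough worked example (our numbers, not the paper’s): if a subterm of size 8 appears ten times in the corpus, pulling it into a library function of size 9 (body plus binder) and replacing each occurrence with a call of size 2 shrinks the total from $10 \times 8 = 80$ to $10 \times 2 + 9 = 29$. An abstraction only pays off when its per-use savings, multiplied by the number of uses, outweighs the cost of defining it once.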

We go about this by:

  1. First, we build an e-graph from the input corpus of programs and run equality saturation (as usual), using a set of user-provided equational rewrite rules.
  2. Next (the first big idea of the paper), we generate candidate abstractions by running an anti-unification algorithm to find “the most concrete pattern that matches two given terms”. The paper’s insight is how to apply this algorithm to e-classes instead of individual AST nodes.
  3. Now that we’ve generated candidate abstractions, we need to find the subset of them that best reduces the size of the entire corpus. We do this by reframing selection as an e-graph problem: we feed the candidate abstractions back into the e-graph as rewrite rules, and then extract the smallest term.

LLMT

Unlike traditional (purely syntactic) library learning, Babble explores semantically equivalent programs by taking a domain-specific equational theory as input; that is, the user specifies a set of rewrite rules for the library learning algorithm to explore. This allows Babble to search over a larger space of equivalent programs than traditional library learning tools, and consequently to discover previously-unreachable library functions. The authors call this approach library learning modulo theories (LLMT). Equality saturation is performed using the provided equational theory, producing an e-graph that compactly represents the large space of semantically equivalent programs.
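To make “equational theory as input” a bit more concrete, here is a minimal Python sketch (our own toy term and pattern representation, not babble’s) of a rewrite rule as a pair of patterns and one application of it; in real equality saturation the rewritten term is added to the e-graph alongside the original rather than replacing it.

```python
# Toy representation (ours, not babble's): terms are nested tuples like
# ("*", ("y",), ("2",)); pattern variables are strings starting with "?".

def match(pattern, term, subst=None):
    """Match `pattern` against `term`, returning a substitution dict or None."""
    subst = dict(subst) if subst else {}
    if isinstance(pattern, str):  # a pattern variable
        if pattern in subst and subst[pattern] != term:
            return None
        subst[pattern] = term
        return subst
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p_child, t_child in zip(pattern[1:], term[1:]):
        subst = match(p_child, t_child, subst)
        if subst is None:
            return None
    return subst

def instantiate(pattern, subst):
    """Build a term from `pattern` by filling in its variables."""
    if isinstance(pattern, str):
        return subst[pattern]
    return (pattern[0],) + tuple(instantiate(c, subst) for c in pattern[1:])

# A hypothetical user-provided equation for an arithmetic DSL: x * 2 == x + x.
lhs, rhs = ("*", "?x", ("2",)), ("+", "?x", "?x")

subst = match(lhs, ("*", ("y",), ("2",)))
print(instantiate(rhs, subst))  # ('+', ('y',), ('y',))
```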

Anti-Unification

The basic idea behind anti-unification is finding the “most concrete pattern that matches two given terms”. To do this, we walk the two terms’ ASTs in lockstep until we find a mismatch, at which point we replace the mismatched subterms with a variable (page 7 of the paper has a nice example). We record each mismatched pair of subterms together with its variable, so that if we encounter the same pair later in the traversal, we replace it with the same variable.

The intuition here is that anti-unification directly produces the kinds of patterns we’re interested in for library learning: (1) patterns that are used at least twice in our corpus (otherwise an abstraction only adds to the size of our library), and (2) patterns that are as concrete as possible, meaning they contain the fewest variables. For example, if our corpus were $\{f(1,2), f(4,2)\}$ and we had two possible abstractions $f(X,2)$ and $f(X,Y)$, the first would be more concrete because it requires one variable instead of two.
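A small Python sketch of this syntactic anti-unification (again using a toy tuple representation of terms) could look like the following; note how the pair map ensures the same mismatch always gets the same variable.

```python
# Terms as nested tuples: f(1, 2) is ("f", ("1",), ("2",)).

def anti_unify(t1, t2, pairs=None):
    """Return the most concrete pattern matching both t1 and t2 (toy sketch)."""
    if pairs is None:
        pairs = {}
    if t1[0] == t2[0] and len(t1) == len(t2):
        # Heads agree: keep the operator and anti-unify the children pointwise.
        return (t1[0],) + tuple(anti_unify(c1, c2, pairs)
                                for c1, c2 in zip(t1[1:], t2[1:]))
    # Mismatch: replace with a variable, reusing it if we saw this pair before.
    if (t1, t2) not in pairs:
        pairs[(t1, t2)] = "?x{}".format(len(pairs))
    return pairs[(t1, t2)]

print(anti_unify(("f", ("1",), ("2",)), ("f", ("4",), ("2",))))
# ('f', '?x0', ('2',))  -- i.e. f(X, 2), the more concrete candidate
print(anti_unify(("g", ("1",), ("1",)), ("g", ("4",), ("4",))))
# ('g', '?x0', '?x0')   -- the repeated mismatch (1, 4) reuses the same variable
```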

When we map this to e-classes, we anti-unify two e-classes by following a similar top-down traversal, but this time mismatches are replaced with variables that stand in for pairs of child e-classes, and because each e-class can contain many e-nodes, the result is a set of anti-unification patterns rather than a single one. Running this algorithm naively blows up the set of anti-unifiers, both because e-classes can get very large and because e-graphs can contain cycles.
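A deliberately simplified sketch of the e-class version: e-classes map to lists of e-nodes, a mismatch becomes a variable named by the pair of child e-classes, and a depth cutoff stands in for babble’s real answer to cycles and blow-up (the dominance pruning described next), which this toy omits.

```python
from itertools import product

# Toy e-graph: e-class id -> list of e-nodes, each e-node = (op, (child e-class ids)).
# This one holds f(1, 2) and f(4, 2), sharing the e-class for 2.
egraph = {
    "c1": [("1", ())],
    "c2": [("2", ())],
    "c4": [("4", ())],
    "a":  [("f", ("c1", "c2"))],
    "b":  [("f", ("c4", "c2"))],
}

def anti_unify(a, b, depth=0, max_depth=4):
    """All anti-unification patterns for the e-class pair (a, b).

    Patterns are nested tuples; ("?", a, b) is the variable standing for this
    pair of e-classes. The depth cutoff is our crude substitute for the
    dominance-based pruning babble uses to stay terminating on cyclic e-graphs."""
    patterns = {("?", a, b)}
    if depth >= max_depth:
        return patterns
    for (op1, kids1), (op2, kids2) in product(egraph[a], egraph[b]):
        if op1 != op2 or len(kids1) != len(kids2):
            continue
        child_patterns = [anti_unify(c1, c2, depth + 1, max_depth)
                          for c1, c2 in zip(kids1, kids2)]
        for combo in product(*child_patterns):
            patterns.add((op1,) + combo)
    return patterns

for p in sorted(anti_unify("a", "b"), key=repr):
    print(p)
# Among others: ('f', ('?', 'c1', 'c4'), ('2',))            -- f(X, 2)
#               ('f', ('?', 'c1', 'c4'), ('?', 'c2', 'c2')) -- f(X, Y)
```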

To account for this, we keep only dominating anti-unification patterns, which lets us rule out much of the potential anti-unification set. A pattern p1 dominates another pattern p2 if it matches the same e-classes and (1) p1’s variables are a subset of p2’s (p1 is at least as concrete as p2) and (2) p1 is no larger than p2. If both of these hold then, given our goal of reducing size, we know we will never need p2 as a candidate pattern, so we can discard it. (In particular, this lets us discard patterns built by following cycles in the e-graph.)
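The dominance check itself is easy to state; here is a sketch under a toy pattern representation (our own: variables are strings starting with “?”, everything else is an operator applied to child patterns), assuming the two patterns being compared came from anti-unifying the same e-class pair.

```python
# Patterns: "?x" is a variable; ("f", "?x", ("2",)) is f(X, 2).

def size(p):
    return 1 if isinstance(p, str) else 1 + sum(size(c) for c in p[1:])

def variables(p):
    if isinstance(p, str):
        return {p}
    out = set()
    for c in p[1:]:
        out |= variables(c)
    return out

def dominates(p1, p2):
    """p1 dominates p2: it uses a subset of p2's variables (so it is at least
    as concrete) and is no larger, so p2 can never beat it for compression."""
    return variables(p1) <= variables(p2) and size(p1) <= size(p2)

f_x_2 = ("f", "?x", ("2",))
f_x_y = ("f", "?x", "?y")
print(dominates(f_x_2, f_x_y))  # True: f(X, 2) makes f(X, Y) redundant
print(dominates(f_x_y, f_x_2))  # False
```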

To cover the e-graph as a whole, we run this e-class anti-unification pairwise over the e-classes in the graph, ignoring pairs of e-classes which we know can never co-occur in the same extracted program (this is detailed on page 18).

Candidate Extraction

Now that we’ve generated some candidate abstractions, we need to pick the optimal subset which best reduces the total size of the corpus (factoring in the size of the library). The problem with a naive approach is that, just as when applying rewrite rules one at a time, it is difficult to reason about the effect a single substitution will have on the rest of the corpus and on the other substitutions. The paper reframes this selection problem as an e-graph extraction problem: we introduce our candidate abstractions into the e-graph as rewrite rules and then extract the smallest program; the optimal subset of abstractions is just the set of abstractions used in that extraction.
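As a flattened illustration (babble actually introduces abstractions inside the e-graph as rewrite rules targeting a small lambda calculus, and lets extraction decide where they pay off; here we just rewrite concrete terms directly with one hardcoded pattern), introducing the candidate f(X, 2) as a hypothetical library function might look like:

```python
# Toy corpus: f(1, 2) and f(4, 2), written as nested tuples.
corpus = [("f", ("1",), ("2",)), ("f", ("4",), ("2",))]

# Hypothetical library function for the candidate abstraction f(X, 2):
# lib0 = \x. f(x, 2), with the binder and the bound variable as plain leaves.
LIB0_DEF = ("lam", ("x",), ("f", ("x",), ("2",)))

def refactor(term):
    """Rewrite every occurrence of f(_, 2) into a call to lib0 (hardcoded)."""
    if term[0] == "f" and len(term) == 3 and term[2] == ("2",):
        return ("call", ("lib0",), refactor(term[1]))
    return (term[0],) + tuple(refactor(c) for c in term[1:])

print([refactor(t) for t in corpus])
# [('call', ('lib0',), ('1',)), ('call', ('lib0',), ('4',))]
```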

The problem we face is that when we calculate the smallest program, we don’t want to double count the size of our abstractions, since the whole point is that abstractions are reused across the corpus. To account for this, we track (1) the smallest size of the library and (2) the smallest size of the program refactored by that library, excluding the size of the library definitions themselves. Computing these smallest sizes bottom-up gives us the size of the corpus under each candidate library. To avoid enumerating all possible subsets of candidate abstractions, the paper uses a beam search (a bounded best-first search), limiting both the size of a candidate library and the number of candidate libraries kept at each step, and propagating only the top-performing libraries upward. We also discard any candidate library that has a strict subset among the candidates with lower cost.
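Continuing the toy example above (repeated here so the snippet stands alone), the accounting counts each library definition once no matter how many call sites use it. With this tiny corpus the abstraction does not actually pay for itself, which is exactly the trade-off the extraction has to weigh; with larger or more frequently repeated patterns the refactored version wins.

```python
def size(term):
    """AST node count, a stand-in for babble's cost model."""
    return 1 + sum(size(c) for c in term[1:])

lib0_def   = ("lam", ("x",), ("f", ("x",), ("2",)))            # lib0 = \x. f(x, 2)
original   = [("f", ("1",), ("2",)), ("f", ("4",), ("2",))]
refactored = [("call", ("lib0",), ("1",)), ("call", ("lib0",), ("4",))]

# The library body is counted once, not once per call site.
cost_without = sum(size(t) for t in original)                     # 6
cost_with    = sum(size(t) for t in refactored) + size(lib0_def)  # 6 + 5 = 11
print(cost_without, cost_with)  # here the abstraction loses; extraction keeps the original
```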

Reading Order

We’d recommend starting with Section 1 to understand the problem and the paper’s framing, then reading Section 2 (Overview) in detail, which does a great job of explaining the intuition behind the main contributions. Sections 3, 4, and 5 are close walkthroughs of the details. If you’re very comfortable reading this kind of literature, you can briefly skim Section 3, which motivates the two contribution sections: 4 (anti-unification on e-graphs) and 5 (candidate selection). The evaluation (Section 6) is well written but felt fairly standard and straightforward, and therefore less relevant to our class discussion.

Fig. 9 on pg. 16 was particularly helpful for understanding the concrete steps of the algorithm.

Discussion Questions

Helpful Materials

David’s POPL Slides are helpful for a high-level overview with some worked examples!

Notes

Of the related work, the DreamCoder paper was the most direct comparison - we’ve linked it here, though the related-work discussion in the paper itself is also well done.