$ cd /engineering && cat ./proving-equivalence-with-egraphs.md
Formal MethodsEngineering
> Proving Equivalence with E-Graphs
Robert Chiniquy
||5 min read
share:
width:
Two permission derivations can look completely different and compute the same access. An RBAC system that grants “admin” and a policy system that allows all actions on all resources might produce identical effective permissions. The e-graph discovers these equivalences automatically.
This is part of a series on formally verifying identity connectors. The verification framework checks each connector’s output against a Datalog specification. But when a connector produces a fact that’s structurally different from what the axioms predict, is it wrong – or is it an equivalent derivation? The e-graph answers that question.
The Bridge
The implementation connects the term store to an e-graph:
type EGraphBridge struct {
eg *egraph.EGraph
termToID map[TermID]egraph.EClassID
idToTerm map[egraph.EClassID]TermID
store *TermStore
rules []*egraph.RewriteRule
}
Terms from the predicate system get imported into the e-graph, where equality saturation discovers which ones are equivalent. Three operations matter:
Saturate runs rewrite rules until no new equivalences are found – or 30 iterations, whichever comes first. Each iteration merges equivalent nodes and rebuilds the graph.
Simplify imports a term, saturates, and extracts the simplest equivalent form. “Simplest” is defined by a cost heuristic.
Equivalent checks whether two terms ended up in the same equivalence class after saturation.
Cost
The cost heuristic biases extraction toward simpler representations:
func nodeCost(n egraph.ENode) int {
switch n.Op.Name() {
case "True", "False":
return 0
}
if len(n.Children) == 0 {
return 1
}
switch n.Op.Name() {
case "And", "Or":
return 10 + len(n.Children)*5
case "Not":
return 5
default:
return 3 + len(n.Children)
}
}
Constants are free. Leaf predicates cost 1. Boolean operators are expensive – And and Or cost 10 plus 5 per child. The e-graph always prefers a single predicate over a conjunction, and prefers True or False over any expression that reduces to them. After saturation, extracting the minimum-cost representative from each equivalence class produces the simplest form of every permission derivation.
Boolean Algebra Rules
The first layer of rewrite rules covers standard Boolean algebra:
Rule
Effect
And(True, x) → x
Identity
And(False, x) → False
Annihilator
Not(Not(x)) → x
Double negation
And(x, Not(x)) → False
Contradiction
Or(x, Not(x)) → True
Tautology
Not(And(x,y)) → Or(Not(x), Not(y))
De Morgan
Not(Or(x,y)) → And(Not(x), Not(y))
De Morgan
And(x, Or(x,y)) → x
Absorption
And(x, x) → x
Idempotence
These are textbook. But the e-graph applies them exhaustively. A complex nested expression like And(Or(x, And(x, y)), Not(Not(x))) saturates to just x through absorption, double negation, and idempotence. The rules don’t need to be applied in the right order – equality saturation explores all orderings simultaneously.
That’s the point of e-graphs. Traditional rewrite systems are sensitive to rule ordering: applying rule A before rule B might miss a simplification that B-then-A would find. E-graphs add both results to the graph and let saturation discover the optimal path. Philip Zucker’s posts on e-graphs as ground completion and his simplified implementation are the clearest bridge between the theory and what this code actually does. I read them and thought: I can do this. That’s the feeling all writing about engineering should provide.
Domain-Specific Rules
The second layer is where authorization semantics enter the picture.
String equality conflict:
StringEquals(k, v1) AND StringEquals(k, v2) → False when v1 ≠ v2
A policy requiring the principal to be both “alice” and “bob” simultaneously is unsatisfiable.
CIDR intersection:
IpAddress(k, cidr1) AND IpAddress(k, cidr2) → IpAddress(k, intersect(cidr1, cidr2))
The intersection computation handles four relationships:
type CIDRRelation int
const (
CIDRDisjoint // 10.0.0.0/8 vs 172.16.0.0/12 → False
CIDREqual // Same range → either one
CIDRSubset // 10.0.1.0/24 ⊂ 10.0.0.0/16 → the /24
CIDRSuperset // Inverse → the more specific range
CIDROverlap // Partial → computed intersection
)
Two conditions requiring requests from disjoint CIDRs produce False. A broad range intersected with a narrow range produces the narrow range. S3 prefix subsumption works the same way – s3:prefix/a/* intersected with s3:prefix/a/b/* is the more specific prefix.
Numeric range conflict:
NumericLessThan(k, v1) AND NumericGreaterThan(k, v2) → False when v1 ≤ v2
These domain-specific rules compose with the Boolean algebra rules through saturation. A complex policy condition might first simplify through De Morgan’s laws, then discover a CIDR conflict in the result, which collapses a sub-expression to False, which propagates through the annihilator rule to simplify the parent. The e-graph handles all of this without explicit orchestration.
What This Catches
The e-graph is how the verification framework handles Class C (Policy Evaluation) systems – AWS IAM, GCP IAM, Cloudflare. These systems compute permissions through allow/deny policy evaluation with conditions, wildcards, and precedence rules. Two policies that look completely different can grant identical effective access.
When the framework finds a discrepancy between a connector’s output and the Datalog specification, it passes both derivations through the e-graph before reporting a failure. If they end up in the same equivalence class, the discrepancy is a difference in representation, not a difference in meaning. The connector is correct; it just expressed the same permission differently.
Nelson and Oppen proved congruence closure works in 1980. Max Willsey’s PhD thesis made equality saturation practical by deferring invariant restoration – instead of rebuilding the e-graph after every merge, batch the repairs and do them lazily. Asymptotically better, and concretely the difference between “too slow for real workloads” and “fine.” Kroening and Strichman’s Decision Procedures has the clearest modern treatment of why it all works.
Series
This is part of a series on formally verifying identity connectors: