The following is an edited (but not too edited) version of my end of
an email correspondence about IC3 [1] with PhD student Heidy
Khlaaf. Finding it illuminating, she recommended that I post it for
others to read. I hope you find it equally valuable. More formal
presentations of IC3 can be found in the original paper [1] and in
subsequent tutorials [2, 3], available on my website.
Heidy rightly observed that the essential thing to understand about
IC3 is the interplay of forward and backward reasoning, which gets to
the heart of what motivated IC3 in the first place.
Let me first recommend Zohar Manna and Amir Pnueli's excellent book,
"Temporal Verification of Reactive Systems: Safety." I've read it many
times yet still learn something new each time I reread it.
"Inductive generalization" is something that came out of their
emphasis on invariant generation. The dominant approach to invariant
generation is classic abstract interpretation: propagate in an
abstract domain (an over-approximation of) the system semantics to
convergence; see what you get. But the book describes a different
approach, a constraint-based method of "solving" for simple (but often
crucial) linear equation invariants. Michael Colon, with then-new PhD
student Sriram Sankaranarayanan, proposed a method (CAV 2003) of
generating inductive linear inequalities via Farkas's Lemma, which
sparked quite a bit of subsequent research. In contrast to
polyhedral-based abstract interpretation, the method generates one
inductive inequality at a time. Sriram continued this line of
research, implementing a tool (StInG) that blasted out these
inequalities.
I started my PhD in 2003 and soon wondered why we care about
generating a whole bunch of these invariants. Why not just produce
relevant ones? Blind invariant generation is "forward" reasoning
("bottom-up" in the Manna-Pnueli book), whether implemented as an
explicit forward abstract interpretation or in a constraint-based
way. Backward reasoning ("top-down" in the book) attempts inductive
strengthening through examining (transitive) preimages of the
property. So my idea was to look at (transitive) predecessors (severe
under-approximations of preimages) of errors and augment the
constraint system for generating an inductive assertion with the
requirement that the generated assertion should rule out at least one
of the observed backward-reachable states. The second idea (again,
described in the book) was to generate invariants iteratively and use
previously-derived assertions to strengthen the current system (the
"incremental" method in the book, or what I called "relative
induction" by the time IC3 rolled around because the background
information was no longer itself necessarily inductive). It worked
pretty well for linear inequalities (which I eventually published on,
[4]) as well as in other contexts (which I didn't).
Then around 2005 I started looking at hardware. Would it be possible
to do the same thing there, i.e., to generate simple inductive
assertions in response to (and to eliminate) backward reachable
states, and to fashion from that simple mechanism a complete model
checker? Assertions of the form "a single clause over state
variables" seemed like a good starting point, as they are analogous to
inequalities and expressively complete when conjoined. I started with
the constraint-based perspective in mind: given backward reachable
state s, find inductive clause c such that
I => c and c \wedge T => c' and c => \neg s ,
i.e., c should be inductive and eliminate s. More: use previously
discovered information F,
F \wedge c \wedge T => c' ,
to increase the space of potential assertions c, as they only need to
be inductive relative to F.
Believe it or not, my first implementation used QBF, following the
spirit of "constraint-based" invariant generation with its
templates. The template C took the form of a template clause:
a_0 x_0 + a_1 (\neg x_0) + a_2 x_1 + a_3 (\neg x_1) + ...
where a_i are the template variables, to be replaced by constants 0 or
1, and x_i are the (Boolean) system variables. A solution to the
constraint problem (encoding the three conditions above) is an
inductive clause eliminating the state s. As you might expect, it
didn't work: QBF is just too slow (and still is).
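For intuition, the effect of the template search can be simulated by
brute force on a toy system (my own invented example, not the actual
QBF encoding): each 0/1 assignment to the a_i selects a subset of the
literals, so enumerating subsets and checking the three conditions
directly finds every inductive clause that eliminates a given state.

```python
from itertools import combinations, product

# A toy 2-variable toggle system, invented for illustration:
#   x0' = not x0,  x1' = x0;  initially both false.
STATES = list(product([False, True], repeat=2))
INIT = [(False, False)]
TRANS = [(s, (not s[0], s[0])) for s in STATES]
BAD = (True, True)                  # the backward-reachable state s
LITERALS = [(i, pol) for i in range(2) for pol in (False, True)]

def satisfies(state, clause):
    return any(state[i] == pol for (i, pol) in clause)

def is_solution(clause):
    """The three conditions: I => c, c /\ T => c', c => ~s."""
    return (all(satisfies(s, clause) for s in INIT)
            and not any(satisfies(s, clause) and not satisfies(t, clause)
                        for (s, t) in TRANS)
            and not satisfies(BAD, clause))

# Each 0/1 template assignment a_0..a_3 corresponds to a subset of LITERALS.
solutions = [frozenset(c)
             for r in range(len(LITERALS) + 1)
             for c in combinations(LITERALS, r)
             if is_solution(frozenset(c))]
```

On this system the only solution is \neg x_0 \vee \neg x_1: the a_i
end up selecting exactly the negations of s's literals.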
More thought led to what I describe in [5]: an effective algorithm for
finding c. There are two insights: (1) the literals of c are a subset
of the negations of the literals of s, and (2) finding an inductive
subclause of \neg s is then just an abstract interpretation (although
disjunctive and therefore with many choices to make). That's mainly
what that paper is: a set of tricks for finding, in practice, an
inductive assertion (forward reasoning) in response to a seed state s
(backward reasoning) in the context of previously discovered
information F (relative induction).
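A crude sketch of those two insights, again on an invented toy system
(the actual algorithm in [5] is considerably more refined): start from
the full clause \neg s and greedily drop literals as long as the
remainder stays inductive relative to the frame.

```python
from itertools import product

# A toy 3-variable system, invented for illustration:
#   x0' = not x0,  x1' = x0,  x2' = x2;  initially all false.
STATES = list(product([False, True], repeat=3))
INIT = [(False, False, False)]
TRANS = [(s, (not s[0], s[0], s[2])) for s in STATES]
BAD = (True, True, True)            # the seed state s

def satisfies(state, clause):
    return any(state[i] == pol for (i, pol) in clause)

def inductive(clause, frame_states):
    """I => c and F /\ c /\ T => c', checked by enumeration."""
    if not clause or not all(satisfies(s, clause) for s in INIT):
        return False
    return not any(s in frame_states and satisfies(s, clause)
                   and not satisfies(t, clause) for (s, t) in TRANS)

def generalize(state, frame_states):
    """Insight (1): candidate literals are the negations of s's
    literals, so search only subclauses of ~s, greedily dropping
    literals while the clause stays inductive relative to the frame."""
    clause = {(i, not v) for i, v in enumerate(state)}   # the clause ~s
    for lit in sorted(clause):          # fixed snapshot of the literals
        if inductive(clause - {lit}, frame_states):
            clause.discard(lit)
    return clause
```

Here generalize(BAD, set(STATES)) returns the subclause
\neg x_0 \vee \neg x_1: the \neg x_2 literal is dropped, and the
result is still an inductive clause eliminating the seed state. Which
literals to try dropping, and in what order, are among the "many
choices" of the disjunctive abstract interpretation.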
It worked really well --- sometimes. The problem with the approach is
that if a simple inductive assertion cannot be found in response to a
state s, then s is disjoined to the target (target enlargement) and a
predecessor of s examined, which is just plain old explicit-state
enumeration. On some systems, explicit-state enumeration dominates and
kills performance (a bit of an understatement).
So I kept working, discarding many ideas along the way. Eventually I
realized that I shouldn't be so hard-headed about F and the assertions
that are generated. Instead of forcing them to be truly inductive (not
always possible), I could require only that they be inductive relative
to over-approximating "onion rings" (always possible), i.e., sets of
states over-approximating those reachable in one step, two steps,
etc. More work and time resulted in IC3, which can be viewed as having
three main mechanisms:
1. Over-approximating forward propagation: Given a set of clauses,
propagate them forward according to the semantics of the system as
far as they will go, forming the over-approximating i-step
reachability "onion rings."
2. But where do the clauses come from? CDCL ("conflict-driven clause
learning") if you will, but in the sequential sense. Look at the
outer onion ring (F_k). Can you in one step move from a state in
that ring to an error? If not, you have a good clause set, so
invoke mechanism (1). If so, you get a state (a CTI, or
"counterexample to induction"). Is that state reachable from the
penultimate onion ring? If yes, you have another state to
pursue. (The CTI "priority queue" of IC3 orchestrates the handling
of these states.) If not, then it's truly now a matter of CDCL: the
(unsat) SAT query provides an unsat core for that state, a
(SAT-level) "generalization" to add to the clause set.
3. But SAT-based generalization is not powerful enough (though pretty
good sometimes). The final mechanism is a form of generalization
that takes advantage of the sequential nature of the problem (MC vs
SAT). Rather than merely extracting a subcube of state s by
examining the unsat core of the unsat query
F_i \wedge T \wedge s' ,
look harder [5] for a subcube c of s such that
F_i \wedge T \wedge \neg c \wedge c'
is unsat. Induction provides a stronger generalization mechanism
than SAT alone.
The first mechanism uses forward reasoning in propagating a given set
of clauses; but backward reasoning shows up in that the clauses were
derived by examining backward reachable states. The second mechanism
starts from backward reasoning, but the query brings in forward
information F_i (an "onion ring," or over-approximating i-step
reachability set). The third mechanism, in its details, performs an
abstract interpretation in a domain established by a backward
reachable state and so also incorporates both directions of reasoning.
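Mechanism (1) can be sketched over an explicit-state toy system (my
own invented example; real implementations check the propagation
condition with a SAT query rather than by enumeration): a clause c of
frame F_i moves into F_{i+1} when F_i \wedge T => c' holds.

```python
from itertools import product

# A toy 3-variable system, invented for illustration:
#   x0' = not x0,  x1' = x0,  x2' = x2.
STATES = list(product([False, True], repeat=3))
TRANS = [(s, (not s[0], s[0], s[2])) for s in STATES]

def satisfies(state, clause):
    return any(state[i] == pol for (i, pol) in clause)

def propagate(frames):
    """Push each clause of F_i into F_{i+1} when F_i /\ T => c' holds
    (checked here by enumeration; IC3 uses a SAT query)."""
    for i in range(len(frames) - 1):
        Fi = {s for s in STATES
              if all(satisfies(s, c) for c in frames[i])}
        for c in frames[i]:
            if all(satisfies(t, c) for (s, t) in TRANS if s in Fi):
                frames[i + 1].add(c)

not_x0 = frozenset({(0, False)})
not_x2 = frozenset({(2, False)})
frames = [{not_x0, not_x2}, set()]   # F_1 and an initially empty F_2
propagate(frames)
# ~x2 propagates (x2 never changes); ~x0 does not (x0 toggles).
```

When, as here, a clause fails to propagate, the gap between F_i and
F_{i+1} points to exactly the states that mechanism (2) must examine.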
So, yes, a key aspect of IC3 is the dance between forward and backward
reasoning.
But notice that BMC, k-induction, and interpolation-based MC also
balance forward and backward reasoning. BMC and k-induction do it in
the SAT solver. The interpolation approach is forward for a given
abstract operator, but the abstract operator (interpolation relative
to the backward unrolling from the target) clearly takes backward
information into account. Moreover, the only refinement available is
backward: (backward) unroll more to create a stronger abstract
operator.
IC3 distinguishes itself in how it balances forward and backward
reasoning:
- it generalizes "conflict clauses" inductively, rather than only in
a CDCL manner as in the others, and it holds onto and reasons with
those clauses explicitly;
- it generates small bits of information (individual clauses) rather
than large interpolants in response to small bits of data
(individual backward-reachable states) rather than huge unsat
queries;
- its work is incremental (a gradual and steady accumulation of
information) rather than monolithic (big interpolants from big
queries; major refinements (unroll more) upon failure).
In short, it seems to balance forward and backward reasoning well.
More, it plays well with modern SAT solvers. (For example, it requires
only that the SAT solver provide "unsat assumptions," not full "unsat
cores," and it benefits from low-overhead incremental calls.)
Seen now only as IC3, the ideas can be technical and confusing.
(Complicating the matter, in the original paper [1], I allude to the
ideas of "deductive verification" (see, again, the Manna-Pnueli
perspective embodied in their book), whereas many readers developed
their intuition of formal methods from a more pure hardware model
checking background.)
Overall, then, IC3 is my synthesis of
- incremental invariant generation, both bottom-up and top-down (to
generate clauses);
- abstract interpretation (looking at the clause set as an abstract
domain);
- and classic MC ideas: forward and backward propagation
(historically exact with BDDs), "onion rings,"
which are all fundamental ideas in formal methods. I'd say that the
first idea (incremental invariant generation) is what others were
unaware of or at least did not consider pursuing as a core part of a
model checker.
[1] SAT-Based Model Checking without Unrolling, VMCAI 2011.
[2] With Fabio Somenzi, IC3: Where Monolithic and Incremental Meet,
FMCAD 2011.
[3] Understanding IC3, SAT 2012.
[4] With Zohar Manna, Property-Directed Incremental Invariant
Generation, Formal Aspects of Computing 2008.
[5] With Zohar Manna, Checking Safety by Inductive Generalization of
Counterexamples to Induction, FMCAD 2007.
Copyright 2014, Aaron R. Bradley