Trusted Binary Decision Diagrams (Proof Generation) #435

SSoelvsten opened this issue Dec 19, 2022 · 2 comments
Labels: ✨ feature (New operation or other feature) · 🎓 student project (Work, work... but academic!)

SSoelvsten commented Dec 19, 2022

A problem in SAT solving is trusting the solver when it claims a formula to be unsatisfiable. To this end, the solver can create an LRAT (or just a DRAT) proof. But doing so is not a trivial task. Yet, BDDs naturally give rise to these very proofs, as shown in [TACAS 21, arXiv 21]. The underlying idea is to store within each node the associated extension variable u and the four proof steps that encode the formula 'u <-> ITE(x,u1,u0)'.

This was added to a fork of the BuDDy BDD library, called TBuDDy. We want to add the same features to Adiar. The files tbdd.h and tbdd.cpp of TBuDDy might be useful reference material.

API

Proof Output

In TBuDDy, the user specifies the desired proof format and output file as part of a tbdd_init function (which also initialises BuDDy itself). We can do the same. But, I would prefer if we somehow could support multiple output files at once (thinking of later concurrent/thread-safe usage of Adiar). For thread-safety we might want to take a note or two from the exec_policy design (#548).

  • Add a tbdd_ostream class to handle outputting the proof steps (LRAT or DRAT), creating fresh variables, and creating the clause identifiers (a hypothetical sketch follows this list).
    • Each basic TBDD creation (tbdd_clause and tbdd_validate below) has the output stream as its last argument (default: the global proof_writer).
    • For thread-safety, the tbdd_ostream in question needs to hold a write-lock (using RAII).
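
As a concrete starting point, here is a minimal sketch of what such a class could look like. All names (tbdd_ostream, proof_format, fresh_var, write_clause) and the clause/hint representation are assumptions for illustration, not Adiar's actual API; the RAII write-lock is a plain std::lock_guard over an internal mutex.

```cpp
#include <cstdint>
#include <mutex>
#include <ostream>
#include <vector>

// Hypothetical sketch -- not Adiar's actual API.
enum class proof_format { lrat, drat };

class tbdd_ostream
{
  std::ostream &_os;        // where proof lines are written
  proof_format _format;     // LRAT or DRAT
  std::mutex _mutex;        // guards all mutable state below
  uint64_t _next_var;       // next fresh extension variable
  uint64_t _next_clause_id; // next clause identifier (LRAT only)

public:
  tbdd_ostream(std::ostream &os, proof_format f, uint64_t first_var)
    : _os(os), _format(f), _next_var(first_var), _next_clause_id(1)
  {}

  // Create a fresh extension variable (e.g. for a new BDD node).
  uint64_t fresh_var()
  {
    const std::lock_guard<std::mutex> lock(_mutex);
    return _next_var++;
  }

  // Write a single clause (with its LRAT hints) and return its id.
  uint64_t write_clause(const std::vector<int64_t> &lits,
                        const std::vector<uint64_t> &hints)
  {
    const std::lock_guard<std::mutex> lock(_mutex); // RAII write-lock
    const uint64_t id = _next_clause_id++;
    if (_format == proof_format::lrat) { _os << id << ' '; }
    for (const int64_t l : lits) { _os << l << ' '; }
    _os << '0';
    if (_format == proof_format::lrat) {
      for (const uint64_t h : hints) { _os << ' ' << h; }
      _os << " 0";
    }
    _os << '\n';
    return id;
  }
};
```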

TBDD Operations

  • The TBDD u for the input CNF clause with the given id.

    tbdd tbdd_clause(const generator<int> &xs, const size_t clause_id);

    This is similar to bdd_or (with Extend bdd_and and bdd_or with negation flags #556) and adds the associated proof of Cid |= u. In TBuDDy, this is the tbdd_create(bdd f, clause_id i) function. That is, we may consider just using bdd_or(xs) and then converting it into a tbdd afterwards.

    This may need an (optional?) last argument for a shared proof_writer.

  • The TBDD w for u∧v with the associated proof that u∧v → w.

    tbdd tbdd_and(tbdd u, tbdd v);

    Also add & and &= overloads

  • Upgrades f to a TBDD u' by proving that u → u'.

    tbdd tbdd_validate(tbdd u, bdd f);

    This may need an (optional?) last argument for a shared proof_writer.
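
To illustrate how the operations above could fit together, here is a hedged usage sketch. The adiar namespace, the umbrella header, the exact bdd_exists signature, and the implicit tbdd-to-bdd conversion are assumptions based on the architecture notes further below.

```cpp
#include <adiar/adiar.h>

// Hedged sketch: leave the trusted world for one untrusted BDD
// operation and return to it via tbdd_validate.
adiar::tbdd quantify_trusted(const adiar::tbdd &u)
{
  // Untrusted: quantify variable 0 on the underlying BDD (the
  // tbdd-to-bdd conversion merely forgets the proof bookkeeping).
  const adiar::bdd f = adiar::bdd_exists(u, 0);

  // Trusted again: upgrade f to a TBDD by proving u -> u'.
  return adiar::tbdd_validate(u, f);
}
```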

TBDD Information

Next to tbdd_clause(xs, id) to create a clause, we may also want the following basic building blocks:

  • The tautological TBDD (reuse bdd_top)

    tbdd tbdd_top();

    This may need an (optional?) last argument for a shared proof_writer.

We should also bring over the following predicates and accessors from bdd (together with #536).

  • Whether a TBDD is a tautology (reuse bdd_istrue)

    bool tbdd_istop(tbdd u);
  • Whether a TBDD is a contradiction (reuse bdd_isfalse)

    bool tbdd_isbot(tbdd u);
  • Whether a TBDD is a single literal (reuse bdd_isvar)

    bool tbdd_isliteral(tbdd u);
  • Whether a TBDD is a positive literal (reuse bdd_isithvar)

    bool tbdd_ispositive(tbdd u);
  • Whether a TBDD is a negative literal (reuse bdd_isnithvar)

    bool tbdd_isnegative(tbdd u);

Finally, for debugging and statistics we also need

  • The number of BDD nodes in the TBDD (reuse bdd_nodecount)

    size_t tbdd_nodecount(tbdd u);
  • The number of Literals (i.e. the number of levels) present in the TBDD (reuse bdd_varcount)

    size_t tbdd_varcount(tbdd u);
  • Print the underlying BDD to a file (reuse bdd_printdot but maybe the output should also include the proof variables? If so, then this may not be trivial)

    void tbdd_printdot(tbdd u, std::ostream &out);

Other TBDD Operations (not mandatory)

In [CADE 21], more operations are described. There, the proof system is used both to prove a formula true and to prove it false (since that is necessary for QBF problems). What we have noted above for tbdd_and(u, v) is only the truth-preserving version. See [CADE 21] for more details.

  • The TBDD w for u∨v with the associated (truth-preserving) proof that u → w and v → w.

    tbdd tbdd_or(tbdd u, tbdd v);

    Also add | and |= overloads

  • The TBDD w for u∧l, where l is a single literal. This creates the associated (truth-preserving) proof of w∧l → u.

    tbdd tbdd_restrict(tbdd u, int l);
  • The TBDD w for ∃x : u.

     tbdd tbdd_exists(tbdd u, int x);

    In [CADE 21] this is just the combination of tbdd_restrict and tbdd_or (a sketch of this combination follows this list). Yet, one should be able to (1) combine it into a single quantification sweep for each literal and then maybe (2) incorporate multiple literals at once.

  • The TBDD w for ∀x : u.

    tbdd tbdd_forall(tbdd u, int x);

    Use tbdd_restrict and tbdd_and together. Similar to tbdd_exists, future work would be to improve performance with a dedicated quantification algorithm.
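
Following the note above that these are just combinations of the earlier primitives [CADE 21], here is a minimal sketch, assuming the signatures above (with a negative integer -x denoting the negative literal of variable x):

```cpp
// Hedged sketch: quantification via restrict + or/and (Shannon
// expansion). Dedicated quantification sweeps would be the later
// performance improvement.
tbdd tbdd_exists(const tbdd &u, int x)
{
  return tbdd_or(tbdd_restrict(u, x), tbdd_restrict(u, -x));
}

tbdd tbdd_forall(const tbdd &u, int x)
{
  return tbdd_and(tbdd_restrict(u, x), tbdd_restrict(u, -x));
}
```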

tbdd_and (Manipulation Algorithms)

I'll focus on tbdd_and for now...

Since the proof steps are built bottom-up, most of the work happens in the Reduce sweep; the preceding top-down sweep only needs to be extended such that it also stores the relevant association between nodes and proof variables.

Top-Down Sweeps

During the top-down sweep, forward a mapping between the nodes of the unreduced intermediate output and the original nodes (and/or their variables).

Algorithmic Extension

  1. Make the prod2_policy a stateful policy (similar to nested_sweep).
  2. Add hooks to prod2 (see the sketch after this list):
    • Start processing of pair (u,v). Here, the policy knows how far to go forward in its list of proof clauses.
    • Forwarding (u,v) across a level. Here, the policy knows to also forward proof clauses in its own priority queue. Yet, unlike the product construction, it only needs to forward information for the node and not for all of its incoming arcs. Hence, the maximum size of this priority queue is the product of the inputs' widths.
    • Resolved (u,v) to w. Here, the policy can output all of the relevant proof clauses for later.
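
The hook interface could look roughly as follows. Everything here (names, argument types) is a hypothetical illustration of where prod2 would call into the policy; node and tbdd_ostream are stand-ins for the real internal types.

```cpp
// Hypothetical sketch of a stateful prod2 policy with proof hooks.
class and_proof_policy
{
  tbdd_ostream &_proof; // shared proof output (see 'Proof Output')

public:
  and_proof_policy(tbdd_ostream &proof) : _proof(proof) {}

  // Hook 1: prod2 starts resolving the pair (u,v). Advance to the
  // defining clauses of u and v in the inputs' clause files.
  void begin_pair(const node &u, const node &v);

  // Hook 2: (u,v) is forwarded across a level. Forward its proof
  // clauses in the policy's own priority queue as well; only per-node
  // (not per-arc) information is needed, so this queue is bounded by
  // the product of the inputs' widths.
  void forward_pair(const node &u, const node &v);

  // Hook 3: (u,v) has been resolved into the (unreduced) node w.
  // Record the relevant proof clauses for the Reduce sweep.
  void resolve_pair(const node &u, const node &v, const node &w);
};
```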

Reduce Sweep

Preliminaries

The end-goal is to describe each BDD node u = (x, u[0], u[1]) as the following four clauses that collectively encode u↔(x ? u[1] : u[0]).

| Notation | Clause     |
|----------|------------|
| HD(u)    | -x -u u[1] |
| LD(u)    | x -u u[0]  |
| HU(u)    | -x -u[1] u |
| LU(u)    | x -u[0] u  |

Here, the L and H are abbreviations for the Low and High edges, respectively; the U is Up and the D is Down (with respect to the direction of the implication). Notice that this perfectly fits the information flow of the Reduce algorithm.

After having defined the BDD node in the proof, we also want to prove that it is the correct result. To this end, we need to combine the defining clauses of original BDD nodes u, v, and the resulting BDD node w.

| BDD Node | Clauses      |
|----------|--------------|
| u        | HD(u), LD(u) |
| v        | HD(v), LD(v) |
| w        | HU(w), LU(w) |

Next to these six clauses, we also need the following two clauses that claim the children of w correspond to the conjunction of the children of u and v.

| Notation | Clause           |
|----------|------------------|
| ANDH     | -u[1] -v[1] w[1] |
| ANDL     | -u[0] -v[0] w[0] |

Knowing the clause ids of all eight of these, we can combine them into two RUP resolution chains, which in turn combine to prove -u -v w, i.e. u∧v → w.
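
For concreteness, here is a sketch of that argument using the clause tables above. First derive the low-branch intermediate clause x -u -v w by RUP: assuming -x, u, v, and -w, the clauses LD(u) and LD(v) yield the units u[0] and v[0], then ANDL yields w[0], and finally LU(w) yields w, a contradiction. The high-branch clause -x -u -v w follows symmetrically from HD(u), HD(v), ANDH, and HU(w). Resolving these two intermediate clauses on x then gives the goal clause -u -v w.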

Algorithmic Extension

Again, we should be able to implement the above by adding hooks to a stateful reduce_policy. Alternatively, if this becomes too messy or expensive, we may want to just implement a separate tbdd_reduce. The four places the algorithm needs to be extended are:

  • Node w is suppressed:
    Store this mapping, similar to the red1_mapping.

  • Outputting new node w as w':
    Push LD(w'), LU(w'), HD(w'), and HU(w') to the proof and push their ids to the clause id file. Furthermore, push these clause ids into a red2_mapping sorter that puts them back in order with w.

  • Outputting duplicate node w as w':
    Also place this mapping in the red2_mapping clause sorter.

With all of the defining clauses of w' available, we now only need to combine them with the input's clauses. Yet, to do so, these clauses need to be back in the order of the unreduced product graph. Hence, we need the clause variants of red1_mapping and red2_mapping. Finally, knowing all of the relevant clauses and having identified the proof case in question, we can output the proofs during the last hook:

  • Forwarding w' to parents of w
    Here, we need to provide the identifiers for the defining and correctness proving clauses of w'. These then need to be forwarded to their parents.

For all of this, the priority queue needs to be extended with w[_] and the clauses proving their correctness (this is akin to #412).
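
As a data-structure sketch, the extended priority-queue element could carry the resolved child together with its clause ids; all names below are hypothetical:

```cpp
#include <cstdint>

// Hypothetical sketch: what the Reduce priority queue would forward
// per arc when proof generation is enabled (akin to #412).
struct reduce_proof_request
{
  // As in the plain Reduce sweep: the arc's source and the resolved
  // child w[_].
  uint64_t source;
  uint64_t child;

  // Clause ids of the child's four defining clauses ...
  uint64_t hd, ld, hu, lu;

  // ... and of the clause proving its correctness (e.g. -u -v w).
  uint64_t correctness;
};
```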

tbdd_validate (Validation Algorithm)

This follows the idea described above for tbdd_and:

  1. In a top-down sweep, create the clause mapping of the product graph between u and f (based on the implication operator).
  2. In the bottom-up sweep, instead of ANDL and ANDH create the relevant correctness proving clauses.

To make this faster, we would want the proof to relate to the BDD nodes of f rather than to new TBDD nodes of u' (since we can then reuse the node file for f). To do so, we should create a custom tbdd_reduce which does not output any new nodes and skips all of the computations to apply reduction rules. Instead, it traverses the transposed version of f, declares new variables for u', and proves it correct with the clauses of u.

  • Sort the product graph clauses of step 1 to match the transposition of f.
  • Forward relevant clauses bottom-up with a (levelized) priority queue.
  • For each node of f, combine the clauses from the priority queue to create new extension clauses for the nodes of f and use the product graph clauses to prove the implication.

Architecture

  • tbdd class

    • Conversion from tbdd to bdd should be trivial (since we may want to do some BDD operations, e.g. bdd_exists, and then get back to TBDDs with tbdd_validate). That is, it should merely forget the associated variables. Hence, extension variables should not be included in the node or levelized_file<node> class but be placed in a separate file with the associated proof steps. In fact, we would want tbdd to inherit from the bdd class (see the sketch after this list).
    • To keep the number of active clauses low, all freed TBDD nodes need to be deleted in the proof. Since Adiar does not share subtrees, this merely amounts to making the tbdd destructor output a deletion of all its extension variables' clauses. Hence, each tbdd needs to carry around a reference to its proof writer (assuming there is not only one).
      • Since deletions should not happen before the final result is computed, we do not need a __tbdd class (at least in the public API). To save a bit of disk space, we can also move the clause file(s) from the input into __tbdd.
      • The proof consists of (a) four RAT clauses that define each BDD node and (b) three RUP steps to prove correctness. We need the defining clauses of (a) later to prove the correctness of the next algorithm. The clauses of (b) need to be stored so they can be deleted later.
  • tbdd_reduce

    Add the hooks and logic described above.

  • tbdd_and

    Add the hooks and logic described above.

  • tbdd_validate

    Implement tbdd_validate__down and tbdd_validate__up as described above.
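
A minimal sketch of this class layout under the assumptions above; clause_file and the tbdd_ostream reference are placeholders, not existing Adiar types:

```cpp
#include <memory>

// Hypothetical sketch: a tbdd is a bdd plus its proof bookkeeping.
class tbdd : public bdd
{
  // Extension variables and clause ids live in their own file(s),
  // separate from the levelized_file<node> of the underlying bdd.
  std::shared_ptr<clause_file> _clauses;

  // The proof writer to emit deletions to (assuming more than one).
  std::shared_ptr<tbdd_ostream> _proof;

public:
  // Conversion to bdd is trivial: slicing to the base class simply
  // forgets the extension variables and clause ids.

  ~tbdd()
  {
    // Keep the set of active clauses small: emit a deletion of the
    // correctness (RUP) clauses tied to this TBDD. The defining (RAT)
    // clauses may still be needed by a later algorithm.
  }
};
```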

References

  • [TACAS 21] Randal E. Bryant and Marijn J. H. Heule. “Generating extended resolution proofs with a BDD-based SAT solver”. In: Tools and Algorithms for the Construction and Analysis of Systems (2021)

  • [arXiv 21] Randal E. Bryant and Marijn J. H. Heule. “Generating extended resolution proofs with a BDD-based SAT solver (extended version)”. In: arXiv Preprint (2021)

  • [CADE 21] Randal E. Bryant and Marijn J. H. Heule. “Dual Proof Generation for Quantified Boolean Formulas with a BDD-based Solver”. In: International Conference on Automated Deduction (2021)

SSoelvsten commented Nov 12, 2023

Variable Sharing → Proof Compression

The implementation described above makes no use of node sharing. In part, the proof has to retraverse the entire output anyway, since it has to create the correctness clauses (all the way from the bottom). Yet, if we can share proof variables between TBDDs, then we can mitigate some of the heavy blowup in the proof size.

Yet, solving this seems quite difficult: since we cannot know anything about the lifetime of a TBDD, we cannot merely transfer ownership from one TBDD to another, as the latter may be destructed before the former.

SSoelvsten commented Nov 30, 2023

While reading [FMCAD 22] on the TBuDDy implementation, it does not seem like the memory demands of the tbdds are the hurdle to overcome:

  1. The experiments do not show TBuDDy running out of memory; rather, it would need to partially switch from 32-bit to 64-bit integers:

    • In 900 seconds, it was able to create almost 1 billion clauses. This is half of what fits into an int, which is what TBuDDy currently uses to store the indices. Hence, the BDD package runs out of LRAT clause ids before it runs out of BDD nodes.
    • Some of the benchmarks had the BDD package run out of BDD variables (BuDDy only supports 2^21 variables) rather than actual BDD nodes. This is related to Change nodetable and cache indexing from int to ssize_t buddy#3.
  2. The LRAT-CHECK tool has been modified to decrease its memory usage when the set of active clauses is sparse. It does so with a two-level indexation scheme: an array holds pointers to blocks of consecutive clauses; when all clauses in a block are deleted, the block can be freed (a sketch of this scheme follows this list).

    • Each BDD node occupies 4 clauses, each of which consists of up to 3 variables. Since BuDDy only uses 32-bit integers, drat-trim should be compiled to use the same (LONGTYPE in drat-trim/lrat-check.c). This means each BDD node requires 48 bytes of clause literals, while each BDD node requires 24 bytes in BuDDy; so, the memory usage is (roughly speaking) at most 2:1.
    • One may want to investigate alternative data structures to hold onto the clauses inside of the proof checker; data structures that are better designed for sparse data.
    • An I/O-efficient LRAT proof checker would be the catch-all solution.
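
A minimal sketch of such a two-level clause index, assuming fixed-size blocks; all names are illustrative, not LRAT-CHECK's actual code:

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical sketch of the two-level indexation scheme: a clause id
// maps to a block and an offset; a block whose clauses have all been
// deleted is freed as a whole.
class clause_index
{
  static constexpr size_t block_size = 1 << 16; // clauses per block

  struct block
  {
    std::vector<const int32_t *> clauses; // clause pointers (or null)
    size_t live = 0;                      // number of live clauses
    block() : clauses(block_size, nullptr) {}
  };

  std::vector<std::unique_ptr<block>> _blocks; // first level

public:
  void insert(uint64_t id, const int32_t *clause)
  {
    const size_t b = id / block_size;
    if (b >= _blocks.size()) { _blocks.resize(b + 1); }
    if (!_blocks[b]) { _blocks[b] = std::make_unique<block>(); }
    _blocks[b]->clauses[id % block_size] = clause;
    _blocks[b]->live++;
  }

  void erase(uint64_t id)
  {
    block *b = _blocks[id / block_size].get();
    b->clauses[id % block_size] = nullptr;
    if (--b->live == 0) { _blocks[id / block_size].reset(); } // free block
  }

  const int32_t *find(uint64_t id) const
  {
    const size_t b = id / block_size;
    if (b >= _blocks.size() || !_blocks[b]) { return nullptr; }
    return _blocks[b]->clauses[id % block_size];
  }
};
```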

References

  • [FMCAD 22] Randal E. Bryant. “TBUDDY: A Proof-Generating BDD Package”. In: Formal Methods in Computer-Aided Design (2022)
