FPGA programming and SAT

I have lately been trying to implement a cryptographic protocol (CryptoGPS) on an FPGA. My first impression is that VHDL is a difficult language to pick up, but once understood, it is not that hard to work with. It took me about two weeks to get a hold of the language, and now it feels relatively easy to program in. I have been looking forward to getting down and dirty with FPGAs, because it’s one of my long-term goals to somehow integrate SAT solvers with FPGAs.

SAT solvers are extremely versatile tools that can solve a number of very complex problems with relative ease. FPGAs are also extremely versatile (essentially, they are reprogrammable hardware), and they are very fast. Making one piece out of these two is like merging Batman with Superman. The trick, of course, is how to merge them. The problem is that FPGAs are typically very good at executing a (preferably small) set of instructions over and over again on a relatively small dataset, while SAT solvers usually deal with large datasets and execute an excruciatingly complex set of instructions on them over and over again.

I think a way to resolve this issue could be to use the FPGA as a sort of oracle for the SAT solver. The solver runs on a normal computer and is at the same time connected to an FPGA that awaits its orders. The SAT solver regularly extracts a small set of constraints from its constraint database and sends it to the FPGA, which tries every possible combination of the variables and sends the result back to the SAT solver. While the FPGA works its wonders, the SAT solver can carry on with the normal SAT procedure and regularly process the incoming data from the FPGA.
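
To give a feel for the shape of this idea, here is a toy, self-contained software model where the “FPGA” is played by an asynchronous task and the main loop keeps doing normal CDCL work, only polling for the oracle’s answer. Every name and body in it is a made-up stand-in; a real system would of course talk to the hardware over a bus rather than to a thread.

#include <chrono>
#include <future>
#include <vector>

struct Subproblem   { std::vector<int> clauseIds; };
struct OracleResult { std::vector<int> forcedLits; bool noSolution = false; };

static Subproblem   extractSubproblem()            { return {}; } // pick a few clauses
static OracleResult askOracle(Subproblem)          { return {}; } // what the FPGA would compute
static void         runNormalSearch(int /*confl*/) {}             // ordinary CDCL search
static void         integrate(const OracleResult&) {}             // e.g. enqueue forced literals

int main() {
    auto pending = std::async(std::launch::async, askOracle, extractSubproblem());
    for (int round = 0; round < 100; round++) {
        runNormalSearch(10000); // the solver never waits for the FPGA
        if (pending.wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
            integrate(pending.get());
            pending = std::async(std::launch::async, askOracle, extractSubproblem());
        }
    }
}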

The FPGA could simply try every single combination of variable settings on the clauses it received, and return compact information such as: “variable 5 must be TRUE” or “there is no solution”. More complex return values, such as “variable 10 is always equal to variable 2”, could also be returned, but they would make for a more complex VHDL design, leading to an overall slowdown. The simpler the logic, the more of it we can fit on a given FPGA, which means more parallelisation and therefore a speedier solution overall. Since the number of variables in the clauses will be limited (checking 30 variables already means at least 2^30 combinations), we could store the clauses as constant-length double-bitfields in the FPGA and use a bitmask for the current variable settings. This could allow for massively parallelised evaluation of clauses.
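
In software, the check the FPGA would perform might look something like the following sketch (my own illustration, assuming at most 30 variables in the extracted subproblem and clauses encoded as positive/negative bitmasks; the real design would be massively parallel VHDL):

#include <cstdint>
#include <vector>

// Each clause is a "double bitfield": one mask for the variables appearing
// positively, one for those appearing negatively.
struct BitClause {
    uint32_t pos; // bit i set: clause contains  x_i
    uint32_t neg; // bit i set: clause contains ~x_i
};

// Tries every assignment. Returns false if there is no solution; otherwise the two
// output masks contain the variables that are TRUE (resp. FALSE) in every solution,
// which is exactly the kind of compact answer the FPGA would send back.
bool bruteForce(const std::vector<BitClause>& clauses, uint32_t numVars,
                uint32_t& mustBeTrue, uint32_t& mustBeFalse) {
    const uint32_t all = (1u << numVars) - 1;
    uint32_t alwaysTrue = all, alwaysFalse = all;
    bool anySolution = false;
    for (uint32_t assign = 0; assign <= all; assign++) {
        bool ok = true;
        for (const BitClause& c : clauses) {
            // a clause is falsified if no positive literal is set and no negative one is unset
            if ((c.pos & assign) == 0 && (c.neg & ~assign) == 0) { ok = false; break; }
        }
        if (ok) {
            anySolution = true;
            alwaysTrue  &= assign;  // variables set in every model found so far
            alwaysFalse &= ~assign; // variables clear in every model found so far
        }
    }
    mustBeTrue  = alwaysTrue;
    mustBeFalse = alwaysFalse;
    return anySolution;
}

In software this is up to 2^30 iterations of the outer loop; on an FPGA the clause checks become simple combinational logic that can be evaluated massively in parallel, which is the whole point.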

On the SAT solver side, we need some guiding principles along which to pick the constraints for the FPGA. The usual suspects would be, for example, the constraints that contain the most active variables. More complex metrics could minimise the number of variables inside the constraints while maximising the number of small constraints they are present in. Other metrics could concentrate on the current restart branch of the SAT solver and ask the FPGA to prove that below a certain decision level, the current branch cannot contain a solution; this way, we could backjump using the data from the FPGA.
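
As one possible, purely hypothetical example of such a metric, the solver could rank clauses by the summed activity of their variables and greedily pick the best ones while staying under the brute-force limit:

#include <algorithm>
#include <cstdlib>
#include <set>
#include <vector>

using Clause = std::vector<int>; // DIMACS-style literals: +v / -v

// Rank clauses by summed variable activity, then greedily take the best ones while
// the number of distinct variables stays within the brute-force limit.
// varActivity is assumed to be indexed by variable number.
std::vector<size_t> pickForFpga(const std::vector<Clause>& clauses,
                                const std::vector<double>& varActivity,
                                size_t maxVars = 30) {
    auto score = [&](size_t i) {
        double s = 0;
        for (int lit : clauses[i]) s += varActivity[std::abs(lit)];
        return s;
    };
    std::vector<size_t> order(clauses.size());
    for (size_t i = 0; i < order.size(); i++) order[i] = i;
    std::sort(order.begin(), order.end(),
              [&](size_t a, size_t b) { return score(a) > score(b); });

    std::set<int> vars;
    std::vector<size_t> picked;
    for (size_t i : order) {
        std::set<int> withThis = vars;
        for (int lit : clauses[i]) withThis.insert(std::abs(lit));
        if (withThis.size() > maxVars) continue; // too many variables to brute-force
        vars.swap(withThis);
        picked.push_back(i);
    }
    return picked; // indices of the clauses to ship to the FPGA
}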

Would this architecture work? I think its performance would depend on whether there is any compact information that the FPGA could extract from the clauses sent to it. This depends on whether there are small “cores” in the problem that contain compact information, and whether we can extract these cores. The existence of such information-rich cores depends on many things. The clause-to-variable ratio is one such metric; another is the number of symmetries in the problem: the more symmetries, the less compact information can be extracted. Cryptographic problems come to mind when seeing these criteria. They usually have a very high clause-to-variable ratio and rarely, if ever, contain symmetries.

Creating a SAT solver from scratch

I have lately been having a small chat with Francois Grieu and it inspired me to think about how difficult it would be to create a new SAT solver from scratch. Let’s take it for granted that the person who embarks on such a difficult task understands the basics well, and wishes to create a (nowadays very popular) Conflict-Driven Clause-Learning (or simply, CDCL) type SAT solver. The main parts of  these solvers are:

  1. Clause parser&database to store the clauses
  2. Propagation engine using watch-lists to carry out Boolean Constraint Propagation (BCP)
  3. Conflict generation code, using the first-UIP scheme
  4. Learnt clause activity calculator&deleter
  5. Variable state database, storing activity and polarity (a minimal sketch follows this list)
  6. Search restart manager
  7. A pre-processor is also recommended; SatELite does the job reasonably well.
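
As a taste of how simple the individual pieces are, here is a minimal sketch of bullet 5: a variable state database with MiniSat/VSIDS-style activities and cached polarities. This is an illustration only, not the code of any particular solver.

#include <cstdint>
#include <vector>

class VarDatabase {
    struct VarState {
        double activity = 0.0;
        bool   polarity = false; // last value the variable was assigned to
    };
    std::vector<VarState> vars;
    double inc   = 1.0;  // amount added on each bump
    double decay = 0.95; // per-conflict decay factor

public:
    explicit VarDatabase(size_t numVars) : vars(numVars) {}

    // Bump every variable appearing in the freshly learnt clause.
    void bump(uint32_t var) {
        if ((vars[var].activity += inc) > 1e100) {
            for (VarState& v : vars) v.activity *= 1e-100; // rescale to avoid overflow
            inc *= 1e-100;
        }
    }

    // Called once per conflict: growing the increment has the same relative effect
    // as decaying every single activity, but is much cheaper.
    void decayActivities() { inc /= decay; }

    void   savePolarity(uint32_t var, bool value) { vars[var].polarity = value; }
    bool   savedPolarity(uint32_t var) const      { return vars[var].polarity; }
    double activity(uint32_t var) const           { return vars[var].activity; }
};

Branching then simply picks the unassigned variable with the highest activity and assigns it its cached polarity.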

None of these are very difficult to implement. However, by implementing them in a minimalistic fashion, one will only end up with a reasonably fast, but not a very fast, solver. Enhancements such as storing binary clauses implicitly in the occurrence lists of the database bring small but nevertheless important benefits. Similarly, having specialised watch-lists to handle binary and tertiary clauses further speeds up the solving. Conflict clause minimisation, on-the-fly subsumption, and on-the-fly self-subsuming resolution improve conflict generation. Polarity calculation and caching come in handy for variable polarities, and dynamic restarts are also useful. Learnt clause activities might need to support both MiniSat-style and Glucose-style activities.
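
For reference, Glucose-style activity is based on the clause’s “glue” (LBD): the number of distinct decision levels among its literals; lower glue generally means a more useful clause. A minimal sketch of computing it, with an assumed levelOfVar array, could be:

#include <set>
#include <vector>

// levelOfVar is assumed to map each variable to the decision level at which it was assigned.
int computeGlue(const std::vector<int>& clauseVars,
                const std::vector<int>& levelOfVar) {
    std::set<int> levels;
    for (int v : clauseVars) levels.insert(levelOfVar[v]);
    return (int)levels.size();
}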

The real challenge, however, comes with bullet number 7: a fast and complete pre- and in-processor is very complex to write. The pre-processing done by SatELite, i.e. self-subsuming resolution, variable elimination and subsumption, is just the icing on the cake for a good pre- and in-processor. Doing these in the middle of the solving (i.e. in-processing), using learnt clauses, doing asymmetric branching, binary clause reasoning, variable replacement, blocked clause elimination, failed literal probing, tautology elimination, clause strengthening with tautological binary clauses, clause cleaning, and a long list of others: that is where the hard part lies.
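
To illustrate the flavour of these techniques, here is a self-contained sketch of plain subsumption and of the self-subsuming resolution (strengthening) check, with clauses stored as sorted DIMACS-style literal vectors. It is an illustration, not SatELite’s actual code.

#include <algorithm>
#include <vector>

// Plain subsumption: clause a subsumes clause b when every literal of a also occurs
// in b, so b can be deleted. Clauses are kept sorted.
bool subsumes(const std::vector<int>& a, const std::vector<int>& b) {
    return a.size() <= b.size() &&
           std::includes(b.begin(), b.end(), a.begin(), a.end());
}

// Self-subsuming resolution: if a, with one literal flipped, subsumes b, then that
// flipped literal can be removed from b. This "strengthening" is the same operation
// that the on-the-fly self-subsuming resolution post below applies to freshly learnt clauses.
bool strengthens(std::vector<int> a, size_t flipIdx, const std::vector<int>& b) {
    a[flipIdx] = -a[flipIdx]; // resolve on this literal
    std::sort(a.begin(), a.end());
    return subsumes(a, b);
}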

With all of these challenges facing SAT solver implementers, is it reasonable to think that anyone can relatively easily participate in SAT Races and SAT Competitions? If one wishes to answer yes to this question, then there are two paths ahead: either an immense amount of time needs to be invested into writing such a solver, or an already written solver has to be taken as a base to build ideas upon. I think the only reasonable choices for a base are: lingeling, MiniSat, PrecoSat and (I would like to think) CryptoMiniSat. None of these solvers is perfect, of course, so one has to put up with their inherent problems.

In the past months, I have been writing many blog entries trying to detail the inner workings of SAT solvers, mostly to help others (and, in the process, myself) struggling with SAT solvers. Maybe I should be spending more time documenting the code of CryptoMiniSat, so that the barrier to entry for creating winning SAT solvers is lowered.

Truth table to ANF conversion

It is sometimes important to be able to quickly convert a binary function, described as a truth table, to its Algebraic Normal Form (or simply, ANF). The truth table simply lists where the function is 1 and where it is 0. For instance, if the function has two variables and its truth table is 0110 (or, in hexadecimal form, 0x6), then the ANF of this function is simply x0 + x1, where x0 is the first variable of the function and x1 is the second.

When it comes to more complex functions, an automatic method for this conversion is helpful. The Sage software can do exactly that:

sage: from sage.crypto.boolean_function import BooleanFunction
sage: B = BooleanFunction("0123456789ABCDEF")
sage: B.algebraic_normal_form()
x0*x1*x2 + x0*x1*x3 + x0*x1*x4 + x0*x1*x5 + x0*x2
+ x0*x3 + x1*x2 + x1*x4 + x2 + 1

Here, we used Sage to convert the function given by the truth table (in hex) 0x0123456789ABCDEF to its ANF equivalent.

I have long searched for a method to do this, but in the end it was Luk Bettale who showed me the way. Many thanks to him for this!
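
For small functions, the same conversion can also be done by hand with the Möbius transform, which XORs halves of the truth table in a butterfly pattern. Here is a minimal C++ sketch of my own (independent of Sage) that reproduces the x0 + x1 example above:

#include <cstdio>
#include <vector>

// Converts a truth table (index i holds f(i), with x0 being the least significant
// bit of i) into its ANF coefficients in place. Afterwards, tt[m] == 1 means the
// monomial given by the set bits of m is present.
void truthTableToANF(std::vector<int>& tt) {
    const size_t n = tt.size(); // must be a power of two
    for (size_t step = 1; step < n; step <<= 1)
        for (size_t block = 0; block < n; block += 2 * step)
            for (size_t j = block; j < block + step; j++)
                tt[j + step] ^= tt[j];
}

int main() {
    std::vector<int> tt = {0, 1, 1, 0}; // the XOR example from above: 0110
    truthTableToANF(tt);
    for (size_t m = 0; m < tt.size(); m++)
        if (tt[m]) std::printf("monomial with variable mask %zu\n", m); // prints 1 and 2, i.e. x0 + x1
}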

CryptoMiniSat 2.6.0 released

Version 2.6.0 of the CryptoMiniSat SAT solver has been released. The new release incorporates a number of important additions and discoveries. Additions include a better memory manager and better watchlists, while the discoveries include an interesting use of asymmetric branching and an on-the-fly self-subsuming resolution algorithm.

The new solver can now solve 221 problems from the SAT Competition of 2009 within the same timing&CPU constraints, while the 2.5.0 (i.e. SAT Race) version could only solve 217. Here is a comparison plot:

The cut-off for the competition on these machines was at approximately 5000 seconds. As can be seen from the graph (which goes up to 6000 seconds), the new solver does even better in the longer run: the additions seem to improve the long-term behaviour.

As for the future, I think there are still a lot of things to do. For example, the solver still doesn’t have blocked clause elimination, which could help, and it is still missing some ideas that others have published. Notably, it doesn’t do on-the-fly subsumption at every step of the conflict generation process, only at the very last step. In case you are interested in adding any of these in a transparent manner, feel free to hack away at the git repository.

Edited to Add (3/09/2010): The performance of CryptoMiniSat on the SAT Race 2010 problems has also changed with the new version. The new CryptoMiniSat can now solve 75 problems (instead of 74) within the time limit. More importantly, the average time to solve these 75 instances has decreased considerably, to around 111 seconds per instance (from 138 s/inst), which is very close to the results of lingeling, an extremely fast and very advanced solver. There is still a lot to be learnt from lingeling, however: its memory footprint is far smaller, and its preprocessing techniques in some areas are much better than those of CryptoMiniSat.

On-the-fly self-subsuming resolution

I have recently been trying a new method of shortening learnt clauses. There is a learnt clause minimisation paper by Sörensson and Biere, and I have recently been trying to do more. The trick I use is that many literals can be removed from learnt clauses if self-subsuming resolution (see my older post) is applied to them.

Under normal circumstances, the clauses used to remove literals during self-subsuming resolution are the short ones: binary and tertiary clauses. The trick I have discovered is that since CryptoMiniSat keeps binary and tertiary clauses natively inside the watchlists (see my previous post), self-subsuming resolution can be carried out extremely fast. We simply need to put the literals of the newly learnt clause into a fast-addressable memory, then go through the watchlists of every literal in the clause and remove the literals that match.

The code used to achieve this is the following:

for (uint32_t i = 0; i < cl.size(); i++) seen[cl[i].toInt()] = 1;
for (Lit *l = cl.getData(), *end = cl.getDataEnd(); l != end; l++) {
    if (seen[l->toInt()] == 0) continue;

    Lit lit = *l;
    //watched is messed: lit is in watched[~lit]
    vec<Watched>& ws = watches[(~lit).toInt()];
    for (Watched* i = ws.getData(), *end = ws.getDataEnd(); i != end; i++) {
        if (i->isBinary()) {
            if (seen[(~i->getOtherLit()).toInt()]) {
                seen[(~i->getOtherLit()).toInt()] = 0;
            }
        }
    }
}

Lit *i = cl.getData();
Lit *j = i;
for (Lit* end = cl.getDataEnd(); i != end; i++) {
    if (seen[i->toInt()]) *j++ = *i;
    seen[i->toInt()] = 0;
}
cl.shrink(i-j);

Essentially, we set the seen vector (which is initially all 0) to 1 at the positions of the literals in the original clause. Then we go through the watchlists and check whether any binary clause could strengthen the clause. If so, we unset the corresponding bit in seen. Finally, we remove from the clause the literals whose seen bit is 0 (and then set all touched entries of seen back to 0).

The actual implementation only differs from the one above by also using tertiary clauses (which are also natively inside the watchlists) to do self-subsuming resolution. The results are very promising: CryptoMiniSat can now solve 2 more problems from the 2009 SAT Competition instances within the original time limit, and seems to scale better in the longer run, too. An example statistics output of CryptoMiniSat (from the UTI-20-10p1 problem):

c OTF cl watch-shrink   : 961191      (0.64      clauses/conflict)
c OTF cl watch-sh-lit   : 6096451     (6.34       lits/clause)

which means that in 64% of the conflicts, on-the-fly self-subsuming resolution was useful, and on average it removed 6.3 literals from each clause where it was useful.