Since CryptoMiniSat has been getting quite a number of awards with parallel SAT solving, it’s about time I talk about how it does that.There is a ton of literature on parallel SAT solving, and I unfortunately I have barely had time to read any of them. The only research within the parallel SAT solving area that I think has truly weathered the test of time is Plingeling and Treengeling — and they are really interesting to play with. The rest most likely has some merit too, but I am usually suspect as the results are often –unintentionally — skewed to show how well the new idea performs and in the end they rarely win too many awards, especially not in the long run (this is where Plingeling and Treengeling truly shine). I personally haven’t published what I do in this scene because I have always found it to be a bit too easy and to hence have little merit for publication — but maybe one day I will.
Note: by unintentionally skewed results I mean that as you change parameters, some will inevitably be better than others because of randomness in the SAT solving. This randomness is easy to mistake for positive results. It has happened to everyone, I’m sure, including myself.
Exploiting CryptoMiniSat-specific features
CryptoMiniSat has many different inprocessing systems and many parameters to turn them on/off or to tune them. It has over 60k lines of code which allows this kind of flexibility. This is unlike the Maple*/Glucose* set of solvers, all coming from MiniSat, which basically can do one thing, and one thing only, really well. That seriously helps in the single-threaded setup, but may be an issue when it comes to multi-threading. They have (almost) no inprocessing (there is now vivification in some Maple* solvers), and no complicated preprocessing techniques other than BVE, subsumption and self-subsuming resolution. So, there is little to turn on and off, and there are very few parameters — and the few parameters that are there are all hard-coded into the solver, making them difficult to change.
CryptoMiniSat in parallel mode
To run in parallel mode, CMS takes advantage of its potential heterogeneity by running N different threads, each with radically different parameter settings, and exchanging nothing but unit and binary clauses(!) with the most rudimentary locking system. No exchange of longer clauses, no lockless exchanges, no complicated multi-lock system. One lock for unit clauses, one for binary, even for 24 threads. Is this inefficient? Yes, but it seems good enough, and I haven’t really had too many people asking for parallel performance. To illustrate, here are the parameter sets of the different threads used and here is the sharing and locking system. It’seriously simple, I suggest you take a peek, especially at the parameter sets.
Note that the literature is full of papers explaining what kind of complicated methods can be used to exchange clauses using different heuristics, with pretty graphs, complicated reasoning, etc. I have to admit that it might be useful to do that, however, just running heterogeneous solvers in parallel and exchanging unit&binary clauses performs really well. In fact, it performs so well that I never, ever, in the entire development history of 7 years of CMS, ran even one full experiment to check parallel performance. I usually concentrate on single-threaded performance because checking parallel performance is really expensive.
Checking the performance of a 24 thread setup is about 15x more expensive than the single-threaded variant. I don’t really want to burn the resources for that, as I think it’s good enough as it is. It’s mostly beating solvers with horrendously complicated systems inside them with many research papers backing them up, etc. I think the current performance is proof enough that making things complicated is not the only way to go. Maybe one day I will implement some more sophisticated clause sharing, e.g. sharing clauses that are longer than binary and then I won’t be able to claim that I am doing something quite simple. I will think about it.
Conclusions
I am kinda proud of the parallel performance of CMS as it can showcase the heterogeneity of the system and the different capabilities of the solver. It’s basically doing a form of acrobatics where the solver can behave like a very agile SAT solver with one set of parameters or like a huge monolith with another set of parameters. Since there are many different parameters, there are many different dimensions, and hence there are many orthogonal parameter sets. It’s sometimes interesting to read through the different parameter settings and wonder why one set works so much better than the other on a particular type of benchmark. Maybe there could be some value in investigating that.