Tag Archives: research

Ganakv2 Released

May 12, 2025Development, Research, SAT, Toolsmodel counting, Release, researchmsoos

Finally, after many months of waiting for our paper to be accepted (preliminary PDF here), Ganak2 is finally released (GitHub code, released binaries) and easily accessible and modifiable. I wish this release had come earlier, but double-blind requirements didn’t allow us to release the code any sooner.

Basically, Ganakv2 has been re-written in many ways. Some of the original ideas by Thurley for SharpSAT has been kept alive, but many-many things are completely different and completely re-written — well over 10’000 lines of change in total. This set of changes has allowed Ganak2 to win every available track at the Model Counting Competition of 2024. In other words, Ganak2 is incredibly fast, making it a state-of-the-art tool.

Chonological Backtracking, Enhanced SAT solving, S- and D-sets

The new Ganak has a pretty powerful SAT solver engine with its own restart strategy, polarity and variable branching strategy. This SAT solver is tightly integrated into Ganak, and reuses the same datastructures a Ganak, hence the counter/SAT transitions are smooth. This relies on the observation that once every variable in the (minimal) independent set has been assigned, there is at most one solution, so we can just run a SAT solver, there is no need to a full-blown model counter. This idea has existed at least since GPMC, but we improved on it in important ways.

A very complicated but very important aspect of Ganak2 is our integration of Chronological Backtracking which we adopted to the model counting setting. This required a lot of attention to detail, and in order to debug this, we took full advantage of the fuzzer written and maintained by Anna Latour. We further added very serious and thorough checking into Ganak2, with varying levels of internal checking and per-node checking of counts. This allows to perform effective debugging: the first node that the count is incorrect will immediately stop and error-out, along with a full human-readable debug log of what had happened. This was necessary to do, as some of the issues that Chonological Backtracking leads to are highly non-trivial, as mentioned in the paper.

Ganak2 now understands something we call a D-set, which is a set of variables that’s potentially much larger than the projection set but when branched on, the count is the same. This allows Ganak to branch on far more variables than any other current model counter. We run an algorithm that takes advantage of Padoa’s theorem to make this D-set as large as possible. Further, Ganak2 takes advantage of projection set minimization, calling he minimal set an S-set, and running a SAT solver as soon as all variables from the S-set have been decided. This allows us to run a SAT solver much earlier than any other model counter.

Component Discovery, Subsumption and Strengthening, and Counting over any Field

Since the paper, I have also added a number of pretty interesting improvements, especially related to component generation, which is now a highly cache-optimized datastructure, using time stamping, a trick I learned from Heule et al’s brilliant tree-based lookahead paper (worth a read). I also changed the hash function used to chibihash, it’s not only faster (on the small inputs we run it on) but also doesn’t rely on any specific instruction set, so it can be compiled to emscripten, and so Ganak2 can run in your browser.

There is also a pretty experimental on-the-fly subsumption and strengthening that I do. It actually rewrites the whole search tree in order to achieve this, which is a lot harder to do than I imagined. It’s the most trivial subsumption & strengthening code you can imagine, but then the rewriting is hell on earth. However, the rewriting is necessary in case we want to be able to continue counting without restarting the whole counting process.

Ganak, unlike many of its counterparts, also manages memory very tightly and tries not to die due to memory-out. The target memory usage of the cache can be given via –maxcache in MB. A cache size of 2500MB is default, which should cap out at about 4GB total memory usage. More cache should lead to better performance, but we actually won (every available track of) the Model Counting Competition 2024 with rather low memory usage, I think I restricted it to 8-12GB — other competitors ran with ~2x that.

Furthermore, Arjun, our CNF minimizer has also been tuned. While this is not specifically Ganak, it gives a nice improvement, and has some cool ideas that I am pretty happy about. In particular, it runs the whole preprocessing twice, and it incorporates Armin Biere’s CadiBack for backbone detection. I wish I could improve this system a bit, because sometimes CadiBack takes >1000s to run, but without it, it’s even slower to count in most cases. But not always, and sometimes backbone is just a waste of time.

The system also now allows to use any field, all you need to do is implement the +/-/*/div operators and the 0 and 1 constants. I have implemented integers, rationals, modulo prime counting, and counting over a the field of polynomials with rational coefficients. All you need to do is override the Field and FieldGen interfaces. I made it work for modulo prime counting in under 20 minutes, it’s like 30 lines of code.

Floating Point Numbers

Notice that floating point numbers don’t form a field, because 0.3+(0.2+0.1) is not equal to (0.3+0.2)+0.1, i.e. it’s not associative (– thanks to Martin Hořeňovský for this correction, previously I said “commutative”). Actually, Ganak2’s weighted model counting doesn’t currently support floating point numbers, because it doesn’t need to — we won literally every available track with infinite rational numbers: 0.3 is simply interpreted as 3/10. This sounds minor, but it also means that it’s actually trivial to check if Ganak runs correctly — it simply needs to be run with a different configuration and it should produce the same solution. Notice that this is not at all true for any other weighted model counter. Literally all of them, except for Ganak, will give different solutions if ran with different seeds. Which is hilarious, since there is obviously only one correct solution. But since 0.3+(0.2+0.1) is not equal to (0.3+0.2)+0.1 in floating point, they can’t do anything… rational.

This whole floating point saga is actually quite annoying, as it also means we can’t check Ganak against any other counter — at least not easily. All other counters give wrong answers, given that floating point is incorrect, and so we can only compare Ganak2 to other counters if we allow a certain deviation. Which is pretty hilarious, given that counters all claim to use “infinite precision”. In fact, the only thing infinite about their results is likely the error. Since (0.3+(0.2+0.1)) – ((0.3+0.2)+0.1) is not 0, it is actually possible to have theoretically infinite error in case of negative weights.

Approximate Counting and Ganak

Ganak2 incorporates ApproxMC, and can seamlessly transition into counting via approximate model counting methods. To do this, simply pass the “–appmct T” flag, where T is the number of seconds after which Ganak will start counting in approximate mode. This can only be done for unweighted counting, as ApproxMC can only do unweighted counting. However, this seamless transition is very interesting to watch, and demonstrates that using a multi-modal counting framework, i.e. exact+approximate is a very viable strategy.

What happens under the hood is currently unpublished, but basically, we stop the exact counting, check how much we have counted so far, subtract that from the formula, and pass it to ApproxMC. We then get the result from ApproxMC and add the partial count that Ganak counted, and display the result to the user. So the final count is partially approximate, and partially exact, so we actually give better approximation (epsilon and delta) guarantees than what we promise.

Compiling Ganak

The new Ganak incorporates 8 libraries: GNU MP, CryptoMiniSat, Arjun, ApproxMC, SBVA, CadiCal, CadiBack, and BreakID. Furthermore, it and compiles in (optinally) BuDDy, Oracle (from Korhonen and Jarvisalo) and Flowcutter. These are all individually mentioned and printed to the console when running. However, this means that building Ganak is not trivial. Hence, with the very generous help of Noa Aarts, Ganak has a flake nix, so you can simply:

git clone https://github.com/meelgroup/ganak
cd ganak
nix-shell

And it will reproducibly build Ganak and make it available in the path. All this needs is for one to have installed Nix, which is a single-liner and works incredibly reliably. Otherwise, the Ganak repository can be cloned together with its GitHub Actions, and then each push to the repository will build Ganak2 for 4 different architectures: Linux x86&ARM, Mac x86&ARM.

Submit your Version of Ganak to the 2025 Competition

Ganak2 is highly extensible/modifiable. I strongly encourage you to extend/change/improve it and submit your improvement to the Model Counting Competition of 2025. Deadline is in 3 weeks, it’s time to get your idea into Ganak and win. I promise I will have all my changes publicly available until then, and you can literally submit the same thing I will, if you like. I’d prefer if you put in some cool change, though.

CryptoMiniSat: 8000 commits later

October 18, 2015Personal, SATfree software, Parallel SAT Solving, programming, researchmsoos

The GitHub repository for CryptoMiniSat just hit 8000 commits. To celebrate this rather weird and crazy fact, let me put together a bit of a history.

The Beginnings

CryptoMiniSat began as a way of trying to prove that a probabilistic cryptographic scheme was not possible to break using SAT solvers. This was the year 2009, and I was working in Grenoble at INRIA. It was a fun time and I was working really hard to prove what I wanted to prove, to the point that I created a probabilistic SAT solver where one could add probability weights to clauses. The propagations and conflict engine would only work if the conflict or propagation was supported by multiple clauses. Thus began a long range of my unpublished work in SAT. This SAT solver, curiously, still works and solves problems quite well. It’s a lot of fun, actually — just add some random clause into the database that bans the only correct solution and the solver will find the correct solution. A bit of a hackery, but hey, it works. Also, it’s so much faster than doing that solving with the “right” systems, it’s not even worth comparing.
Continue reading →

Clause glues are a mystery to me

July 8, 2013Development, SATglucose, glues, research, visualisationmsoos

Note: the work below has been done in collaboration with Vegard Nossum, but the wording and some of the (mis-)conculsions are mine. If you are interested, check out his master thesis, it’s quite incredible

Anyone who has ever tried to really understand clause glues in SAT solvers have probably wondered what they really mean. Their definition is simple: the number of variables in the final conflict clause that come from different decision levels. An explanation of these terms can be found here. On the surface, this sounds very simple and clean: the number of different decision levels should somehow be connected to the number of variables that need to be set before the learnt clause activates itself, i.e. it causes a propagation or a conflict. Unfortunately, this is just the surface, because if you try to draw 2-3 implication graphs, you will see that in fact the gap between the number of variables needed to be set (let’s call this ‘activation number’) and the glue can be virtually anything, making the glue a potentially bad indicator of the activation number.

The original reasoning

The original reasoning behind glues is the following: variables in the same decision level, called ‘blocks of variables’ have a chance to be linked together through direct dependencies, and these dependencies should be expressed somehow in order to reduce the number of decisions needed to reach a conflict (and thus ultimately reduce the search space). To me, this reasoning is less clear than the one above. In fact, there are about as many intuitions about glues as the number of people I have met.

Glucosetrack

With Vegard Nossum we have developed (in exactly one day) something quite fun. On the face of it, it’s just glucose 1.0, plain and simple. However, it has an option, “-track”, which does the following: whenever a learnt clause is about to cause a conflict, it jumps over this learnt clause, saves the state of the solver, and works on until the next conflict in order to measure the amount of work the SAT solver would have had to do if that particular learnt clause had not been there. Then, when the next clause wishes to cause a conflict, it records the amount of propagation and decisions between the original conflict and this new conflict, resets the state to the place saved, and continues on its journey as if nothing had happened. The fun part here is that the state is completely reset, meaning that the solver behaves exactly as glucose, but at the same time it records how much search that particular cause has saved. This is very advantageous because it doesn’t give some magical number like glue, but actually measures the usefulness of the given clause. Here is a typical output:

c This is glucose 1.0  with usefulness tracking by
c Vegard Nossum and Mate Soos. Based on glucose, which
c is in run based on MiniSat, Many thanks to all teams
c
c ============================[ Problem Statistics ]=============================
c |                                                                             |
c |  Number of variables:  138309                                               |
c |  Number of clauses:    942285                                               |
c |  Parsing time:         0.28         s                                       |
============================[ Search Statistics ]==============================
| Conflicts |          ORIGINAL         |          LEARNT          | Progress |
|           |    Vars  Clauses Literals |    Limit  Clauses Lit/Cl |          |
===============================================================================
|         0 |  138309   942285  2636352 |   314095        0   -nan |  0.000 % |
|       620 |  138074   942285  2636352 |   314095      615     82 |  1.559 % |
|       919 |  135307   925938  2596895 |   314095      906     75 |  3.908 % |
|      2714 |  130594   894562  2518954 |   314095     2664     67 |  6.799 % |
|      2814 |  130593   894562  2518954 |   314095     2763     69 |  6.808 % |
|      2930 |  130592   894562  2518954 |   314095     2879     70 |  6.808 % |
|      4042 |  127772   874934  2471045 |   314095     3974     69 |  9.292 % |
|      4142 |  127772   874934  2471045 |   314095     4074     70 |  9.292 % |
...
c Cleaning clauses (clean number 0). Current Clause usefulness stats:
INSERT INTO data(cleanno, idx, size, glue, conflicts, props, bogoprops, decisions) VALUES(0 , 961074 , 107 , 5 , 42 , 185509 , 1301341 , 0);
INSERT INTO data(cleanno, idx, size, glue, conflicts, props, bogoprops, decisions) VALUES(0 , 944729 , 14 , 1 , 7 , 36865 , 268229 , 0);
INSERT INTO data(cleanno, idx, size, glue, conflicts, props, bogoprops, decisions) VALUES(0 , 948275 , 7 , 1 , 15 , 27909 , 220837 , 0);
INSERT INTO data(cleanno, idx, size, glue, conflicts, props, bogoprops, decisions) VALUES(0 , 953762 , 102 , 2 , 2 , 29365 , 197410 , 0);
...
c End of this round of database cleaning
...
|     38778 |  110896   758105  2182915 |   314095    28270     93 | 20.167 % |
|     39488 |  110894   758105  2182915 |   314095    28978     93 | 20.185 % |
c Cleaning clauses (clean number 1). Current Clause usefulness stats:
INSERT INTO data(cleanno, idx, size, glue, conflicts, props, bogoprops, decisions) VALUES(1 , 979236 , 71 , 1 , 8 , 45531 , 280156 , 0);
INSERT INTO data(cleanno, idx, size, glue, conflicts, props, bogoprops, decisions) VALUES(1 , 954908 , 2 , 2 , 7 , 0 , 232760 , 0);
...

The output is in SQL format for easy SQL import. The “size” is the clause size, “glue” is the glue number, “conflicts” is the number of times the clause caused a conflict, “props” is the number of propagations gained by having that clause (i.e. by doing the conflict early), “bogoprops” is an approximation of the amount of time gained based on the number of watchlists and the type of clauses visited during propagation, and “decisions” is the number of decisions gained. The list is sorted according to “bogoprops”, though once the output is imported to MySQL, many sortings are possible. You might notice that the ‘glue’ is 1 for some clauses (e.g. on the second output line) — these are clauses that have caused a propagation at decision level 0, so they will eventually be removed through clause-cleaning, since they are satisfied. Notice that high up, there are some relatively large clauses (of size 102 for example) with glue 2, that gain quite a lot in terms of time of search. The gained conflicts/propagations/etc. are all cleared after every clause-cleaning, though since clauses are uniquely indexed (‘idx’), they can be accumulated in all sorts of ways.

The program is 2-5x slower than normal glucose, but considering that it has to save an extreme amount of state due to the watchlists being so difficult to handle and clauses changing all the time, I think it does the job quite well — as a research tool it’s certainly quite usable. In case you wish to download it, it’s up in GIT here, and you can download a source tarball here. To build, issue “cmake .” and “make”. Note that the tool only measures the amount of search saved by having the clause around when it tries to conflict. It does not measure the usefulness of the propagations that a learnt clause makes. Also, it doesn’t measure the other side of the coin: the (potentially better) conflict generated by using this clause instead of the other one. In other words, it gives a one-sided view (no measure of help through propagation) of a one-sided view (doesn’t measure the quality of difference between the conflict clauses generated). Oh well, it was a one-day hack.

Experiments

I have made very few experiments with glucosetrack, but you might be interested in the following result. I have taken UTI-20-10p0, ran it until completion, imported the output into MySQL, and executed the following two queries. The first one:

SELECT glue, AVG(props), FROM data
WHERE glue >= 2 AND size >= 2
GROUP BY glue
ORDER BY glue

calculates the average number of saved propagations between each round of cleaning for clauses of glue >= 2 (i.e. clauses that didn’t eventually cause a propagation at decision level 0), and of size >= 2, because unitary clauses are of no interest. The second is very similar:

SELECT size, AVG(props), FROM data
WHERE glue >= 2 AND size >= 2
GROUP BY size
ORDER BY size

which calculates the same as above, but for size.

Some explanation is in order here regarding why I didn’t count SUM(), and instead opted for AVG(). In fact I personally did make graphs for SUM(), but Vegard corrected me: there is in fact no point in doing that. If I came up with a new glue calculation function that gave an output of ‘1’ for every clause, then the SUM for that function would look perfect: every clause would be in the same bracket, saving a lot of propagations, but that would not help me make a choice of which clauses to throw out. But the point of glues is exactly that: to help me decide which clauses to throw out. So what we really want is a usefulness metric that tells me that if I keep clauses in that bracket, how much do I gain per clause. The AVG() gives me that.

Here goes the AVG() graph for the last clause cleaning (clause cleaning iteration 33):

Notice that the y axis is in logscale. In case you are interested in a larger graph, here it is. The graph for clause cleaning iteration 22 is:

(Iteration 11 has high fluctuations due to less data, but for the interested, here it is). I think it’s visible that glues are good distinguishers. The graph for glues drops down early and stays low. For sizes, the graph is harder to read. Strangely, short clauses are not that good, and longer clauses are better on average. If I had to make a choice about which clauses to keep based on the size graph, it would be a hard choice to make: I would be having trouble selecting a group that is clearly better than the rest. There are no odd-one-out groups. On the other hand, it’s easier to characterise which clauses are good to have in terms of glues: take the low glue ones, preferably below 10, though we can skip the very low ones if we are really picky. An interesting side-effect of the inverse inclination of the size and glue graphs and the fact that “glue<=size” is that maybe we could choose better clauses to keep if we go for larger clauses that have a low glue.

Conclusions

Unfortunately, there are no real conclusions to this post. I guess running glucosetrack for far more than just one example, and somehow also making it measure the difference between the final conflict clauses’ effectiveness would help to write some partially useful conclusion. Vegard and me have tried to put some time and effort into this, but to not much avail I am afraid.

PS: Notice that glucosetrack allows you to generate many of the graphs here using the right SQL query.

CryptoMinisat 3.1 released

April 14, 2013Research, SATBCP, binary clauses, Optimisation, Release, research, stampingmsoos

CryptoMinisat 3.1 has been released. The short changelog is:

$ git diff cryptoms-3.0 cryptoms-3.1 --shortstat
 84 files changed, 3079 insertions(+), 2751 deletions(-)

The changes made were threefold. First, memory usage has been greatly reduced. This is crucial, because memory usage was over 7GB on certain instances. Secondly, the implication cache wasn’t very well-used and an idea that came to my mind greatly improved performance on most problems. Finally, time limiting of some inprocessing techniques on certain types of problems has been improved.

Memory usage reduction

On instances that produced a lot of long learnt clauses the memory usage was very high. These learnt clauses were all automatically linked in to the occurrence list and consequently took large amounts of memory, sometimes up to 10GB. On other instances, the original clauses were too numerous and too large, so putting even them into the occurrence list was too much. On these instances, variable elimination is not carried out (or carried out only later, when enough original clauses have been removed/shortened). To debug some of these problems, I wrote a fuzzer that generates extremely large problems with many binary and many long clauses, it’s available here as “largefuzzer”. It’s actually quite nice with many-many binary clauses so it also can fuzz the problems encountered with probing of extremely weird and large instances.

Implied literal usage improvement

CryptoMiniSat uses implied literals, i.e. caches what literals were propagated by each literal during probing. It then re-uses this information to subsume and/or strengthen clauses. This is kind of similar to stamping though uses more memory. It is actually useful to have alongside stamping, and I now do both — propagating DFS that stamping requires is expensive though updating cache during DFS is just as easy as during quasi-BFS.

The trick I discovered while playing around with cached implied literals is that if literal L1 propagates L2 and also !L2 then that means there are conceptually two binary clauses in the solver (!L1, L2), (!L1, !L2), so !L1 is TRUE. This is of course trivial, but I never checked for this. The question most would raise is: why would L1 propagate both L2 and !L2 and not fail? The answer is kind of tricky, but very interesting. Let’s say at one point, L1 propagates L2 due to a learnt clause, but that learnt clause is then removed. A new learnt clause is then later learnt, and with that learnt clause in place, L1 propagates !L2. Now, without caching, this would be ignored. Caching memorizes past conceptual binary clauses and re-uses this information.

This is not an optimization that only looks good on paper, it is very good to have. With this one optimization, I gained 5 instances from the SAT Comp’09 instances with a 1000s timeout (196 solved -> 201 solved). I can’t right now imagine how this could be done with stamping effectively, but that doesn’t mean it’s not possible. Though, according to my experience, stamping doesn’t preserve that much information over time as it’s being updated (renumbered) frequently while the cache is only improved over time, never shrunk. A possibility would be to have more than one stamp system and round-robin selecting them. However that would mean that sorting of clauses (for shrinking) would need to be done more than once, and sorting them is already relatively expensive. I sometimes feel that what stamping gains in memory it looses on sorting (i.e. processing time) and lower coverage (re-numbering).

More precise time-limiting

Martin Maurer has been kind enough to file a lot of bug reports about probing and variable elimination taking too much time, sometimes upwards of 150s when they should take around 20-30s maximum. While investigating, it tuned out that the problem was very weird indeed. While trying to eliminate or probe one variable the time for that one variable took upwards of 100s. This was completely unexpected as the code only checked for timeouts on a per-variable basis. In the end, the code had to be improved to track time on an intra-variable basis in both systems. While at it, I also added intra-variable time-tracking to implicit clause subsumption and strengthening too. So, over-times should less prevalent from now on. As an interesting side-note, time-limiting on probing is now so fine-grained that a 32-bit unsigned integer would overflow within 15s if used as the time-tracker.

CCC Camp’11

August 15, 2011Personal, ResearchCollaborative effort, cryptography, presentation, Release, researchmsoos

In case you’ve missed it, the CCC Camp was a great opportunity to meet people both working in security and otherwise. I have even met a very kind Taiwanese researcher who worked on SAT and Gröbner basis: in fact, if you haven’t had the chance to read this paper, I highly recommend it. A set of kind Taiwanese researchers recommended this paper to me, and I think it’s the most interesting SAT paper I have read in the past year.

We at SRLabs have made two releases during this camp, one that breaks GPRS encryption, and one that breaks smart card ROM encryption. I was involved with the first release, essentially working on the crypto part. In case you are interested in the videos, the one on GPRS is uploaded here, and the one on smart card ROM encryption is here. This reminds me of something: the videos from the MIT SAT/SMT Summer School are missing :( Well, given my fail there, maybe that’s a good thing :)

Wonderings of a SAT geek

A blog about SAT solving and cryptography