Tag Archives: model counting

Ganak: The Making of a Versatile, High Performance Model Counter

Ganak (github), our propositional model counter, has won every single model counting competition track for the past two years. Perhaps it’s time to explain how that came about, and what the ingredients of such a tool are. Firstly, a propositional model counter is a tool that counts how many solutions there are to a set of boolean constraints of the form:

a V b
b V -c

This set of constraints has the following possible solutions:

abc
---
000  (not a solution)
001  (not a solution)
010
011
100
101  (not a solution)
110
111

The three assignments marked above are not solutions, because they violate at least one of the constraints. So we are left with 8-3=5 solutions. There is one more twist. Sometimes, we are only interested in the solutions over a set of variables, which we will call the sampling set, say “a” and “b”. Then we are only left with the solutions “01x”, “10x”, and “11x”, a total of 3 solutions, where “x” means we don’t care about the value of “c”, since it’s not in the sampling set.
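For a set of constraints this small, both counts can be checked by brute force. Here is a small Python sketch that enumerates all assignments to the example above and counts both the full and the projected (sampling-set) solutions:

```python
from itertools import product

# Enumerate all assignments to (a, b, c) and keep those satisfying
# both constraints: (a OR b) and (b OR NOT c).
solutions = [(a, b, c) for a, b, c in product([0, 1], repeat=3)
             if (a or b) and (b or not c)]
print(len(solutions))  # total model count: 5

# Projected count over the sampling set {a, b}: collapse the
# don't-care variable c and count distinct (a, b) pairs.
projected = {(a, b) for a, b, c in solutions}
print(len(projected))  # projected model count: 3
```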

The History of the Preprocessor: Arjun

Arjun (github) came about because we saw the B+E tool doing minimisation of the sampling set. Minimising the sampling set is important: if the sampling set is only of size 10, then there are at most 2^10 things to count, but if it’s of size 20, then there are up to 2^20. It’s sometimes possible to minimise the sampling set: for example, when we can prove that a=b, there is no need to have both “a” and “b” in the sampling set. Tricks like this can significantly lower the complexity of counting solutions. We enhanced B+E, and published this as a paper here.

In the end, it turned out that minimising the sampling set was really only the beginning: we also needed to make the input smaller. The fewer the constraints, and the fewer the variables, the easier it is to deal with the formula. This was not new to anyone, but it turns out that this minimisation is hard. Others have tried, too, but my advantage was that I came from the SAT world, with the CryptoMiniSat SAT solver under my belt, so I wrote an entire API (blogpost) for my SAT solver to be used as an input minimiser. This allowed me to try out many heuristics and write code easily, taking advantage of all the datastructures already inside CryptoMiniSat for constraint handling.

The History of the Counter: Ganak

The actual counter, Ganak (github), was something of another story. It started with a hack that my long-time collaborator Kuldeep Meel and I did in 2018, over a long evening and night in Singapore. This led to the original Ganak paper, which essentially added hashing to the system, thereby making it probabilistic, but also making it use a lot less memory, and thereby faster. I personally haven’t touched that project much, and instead focussed all my energies on the preprocessor, Arjun. I was pretty convinced that if I could make the preprocessor good enough, no matter what Ganak looked like, it would be good enough.

This strategy of focussing on Arjun mostly paid off, and we did well in model counting competitions. The counter d4 was the main competitor at the time. Then SharpSAT-TD (paper) by Korhonen and Jarvisalo came along and blew everything out of the water. That made me look at the Ganak source code for the first time in about 5 years, and I was appalled. We were putting a donkey, Ganak, on the rocket, Arjun, that I built, and we were doing OK, but looking at SharpSAT-TD, we were clearly racing a donkey against a cheetah. So in the spring of 2023 I sat down one morning and started replacing things I hated about Ganak, which was mostly everything.

Thus, the Ganak we know today was born. This gradual rewriting took about 2-3 years: it continued during the summer and then the next spring. The most significant parts of this rewrite were that we cleaned up the code, added extensive fuzzing and debugging capabilities to it, added/adapted a lot of well-known SAT tricks for counting, integrated the tree decomposition (“TD”) system and parts of the preprocessor from SharpSAT-TD, and came up with a way to not only minimise the sampling set, but also to extend it, which can help in certain cases.

Ideas in Ganak and Arjun

We wrote a paper about Ganak (paper here) but to be honest, it only covers a small part of what we did. The change to Ganak from the original paper in 2019 to today is about 30’000 lines of code. Obviously, that cannot be explained in detail in just a few pages. Furthermore, the way academic publishing works, it’s not possible to simply list a bunch of ideas that were implemented — you need to explain everything in detail, which cannot be done with 30’000 lines of code. So, below is a bit of an “idea dump” of ideas I have implemented in Ganak, but never published.

I might attempt to publish some of them some day. Currently, I see little reward for doing so besides citations, and citations are not a great indicator in my view: lcamtuf is one of the best IT security professionals, ever, and has basically no citations. Besides, due to the “publish or perish” system in academia, there is a sea of research papers that most work gets lost in. I believe what matters most is performance and usability. Ganak does pretty well in this regard. It compiles to many platforms: Linux and Mac, both ARM and x86, and even to emscripten in your browser. It supports more weight systems than other counters: integers, floats, exact rationals, complex numbers, primes, and multivariate polynomials over the rationals and floats. Ganak also uses less memory than other competing counters, while running faster. Besides, it’s regularly fuzzed and leaks no memory, so its output is rather trustworthy.

Without further ado, here are some ideas never published in Ganak, but which help the system run faster:

  • Using and improving the Oracle SAT solver by Korhonen and Jarvisalo (PDF by the authors). Oracle is a specialised SAT solver that allows one to query with many so-called “assumptions” in the SAT query, i.e. checking if something is satisfiable under conditions such as “a=true, b=false” — but many-many of them. This kind of query is very slow if you use a normal SAT solver, but with Oracle, it’s super-fast. This allows us to remove unneeded constraints, and make the constraints smaller. I improved this tool in a few ways, mostly by integrating a local search solver into it, which can drastically speed it up by finding a satisfying solution much faster. I have also improved its heuristics, e.g. by using a better LBD heuristic (PDF by authors). I have also improved its solution cache, by employing cache performance tracking, pre-filtering, and cache size limitation. Besides, I added step-tracking to it so it can be deterministically limited in time.
  • Improved version of tree decomposition originally by Korhonen and Jarvisalo (PDF here). This system computes the tree decomposition of a CNF via Flow-Cutter by Ben Strasser (PDF, code); however, the original system had a few issues. Firstly, it did a clock-limited run of the executable, which was unacceptable to me, as it makes the system non-deterministic: depending on the CPU it will run faster/slower, thereby computing different values (it’s a local search algorithm, with many restarts). Also, running separate executables is very fragile. Secondly, and more importantly, it computed the tree decomposition of a CNF and then tried to find its centroid. But… what if the CNF was disjoint? What’s the single centroid of a forest? Slight issue. I fixed this by computing the top-level disjoint components and counting them separately. Although disjoint components really sound like an input issue, and not a counter issue, the problem is that our preprocessing is so strong that it can turn a non-disjoint input into a disjoint input, thereby confusing the TD/centroid finding. Oops.
  • Special-casing some inputs. Because we detect disjoint top-level components, sometimes we end up with CNFs that are very weird, for example, with a sampling set size of 2. These CNFs can have at most 4 solutions, so counting them via the incredibly complicated d-DNNF system is overkill, likely to waste a lot of time on start-up and tear-down. Hence, these CNFs are counted one-by-one via the trivial use of a standard SAT solver.
  • Using CCNR by Shaowei Cai (PDF by authors) for speeding up the Oracle query. It’s a cool local search system, and it works really well. I wish I had fixed up its code because it’s a bit clunky and my colleagues gave me a hard time for it. Not research colleagues, of course. In research, quality of code is irrelevant — remember, only the number of citations matters.
  • Adapting vivification by Li et al. (PDF paper) to the model counting setting. The original system stops the SAT solver once, when the solver restarts, rewrites all the clauses to be shorter where possible, and continues happily ever after. This happily-ever-after is impossible in a model counting setting, because we never restart. Slight issue. What I wrote was basically a nightmare-fuelled two-week, roughly 2000-line craziness that rewrites the propagation tree and/or skips the clause improvement, so that all the invariants of the propagation remain intact while the clauses get shorter. This is kind of like changing the wheels on a car while it’s running. I don’t recommend doing this without extremely thorough fuzzing and debug tooling in place, or you’ll run into a lot of trouble. (By the way, this system not only shrinks clauses, but also performs subsumption — if I’ve built the system, I might as well use it all the way.)
  • Using a virtual interface for the Field we are counting over. With this system, the Field we are counting over becomes irrelevant to the counter. Instead, the counter can focus on counting, and the implementation of the interface can focus on the Field-specific operations. This has allowed me to add many-many Fields with very little overhead. The Field implementation parses, prints, and handles the 0 and 1 elements, and the + and * operators. This separation of concerns is, I think, the right way to go. Also, it allows someone to implement their own Field and use it without having to recompile anything.
  • Zero-suppressed Field counting. When counting over a field, there are two special elements one needs to take care of: zero and one. Of these, zero is quite special, because a+0=a, and a*0=0. So, in Ganak, we don’t initialise the zero field element. Instead, we keep it as a null pointer. When manipulating this pointer, e.g. adding to it or multiplying it, we can simply replace it with a copy of what we add to it, or, when multiplying, keep it a null pointer. This saves space and also reduces the number of pointer dereferences we need to perform.
  • Extensive fuzzing infrastructure by Latour et al. (code), with ideas taken from Niemetz et al. (PDF here). Ganak and Arjun both have many options that allow the fuzzer to turn off certain parts of the system, or push them to the extreme, thereby exercising parts of them that would otherwise be impossible to reach. The paper I linked to explains how adding the options to the fuzzer, and setting them to all sorts of values, can help the fuzzer reach very hard-to-reach parts of the code, thereby exposing bugs in them more easily.
  • Extensive debug infrastructure. Finding issues via fuzzing is only part of the deal; one must also be able to debug them. For this, Ganak has a 4-level debug system where progressively slower self-checking can be turned on, so as to find the first place where the bug manifests itself as precisely as possible. The last level of debug checks every single node in the d-DNNF that Ganak builds, and exits at the first node where the count is incorrect. The debug infrastructure also comes with a verbosity option that prints out full, human-readable, structured, coloured logs for each major decision point in the d-DNNF. Overall, just the debug code of Ganak is likely 2-3 thousand lines of code. This may sound excessive, but Ganak can self-check its state at every node, and ensure that there is at least a path forward to counting the correct count, at almost all nodes. Apparently some have attempted to do what we did in our paper on Ganak, but bumped into issues they couldn’t resolve. I can confirm that without the appropriate fuzz and debug infrastructure, it would have been impossible for us to figure out the things we published in that paper.
  • Cache-optimised, lazy component finding. When a decision and propagation happen in the CNF, we must check whether the system has fallen into disjoint components. If so, we can count the components separately, and then multiply their counts — a property of a Field. This greatly helps in doing a lot less work. However, this means we must examine every single variable in the system, and see if it’s connected to the others through clauses, at every node — often millions, or even 100s of millions of times during counting. Normally, d-DNNF counters do this by going through the so-called occurrence lists of all variables, recursively, and seeing if they encounter all variables. The issue with this is that it is extremely expensive, up to 50% of the total runtime, and furthermore, it re-examines parts of the clauses that were not touched by the newly set variables. However, it’s not easy to fix this: doing something very smart to a system that is very fast but dumb can slow down the system, since the smart thing can often take more time to compute than doing the dumb thing fast. So I pulled an old trick out of a hat, one that I learned from a paper by Heule et al.: time stamping. Basically, you keep a time stamp for each data element you touched, and as long as you can decide cheaply, based on the timestamp, that you don’t need to recompute something, you are good. We stamp each variable when its value is changed, and then we know which parts of the clauses need to be re-examined, based on the stamp on the clause and the stamp on the variable. I implemented this in a cache-optimised way, similarly to how SharpSAT does it, laying it all out flat in memory, putting clauses next to each other that will be dereferenced one after another — a trick I learned from MiniSat.
  • Prefetching of watchlists, and using blocked literals. Prefetching of watchlists is one of the very few things in SAT solving that was my idea. Basically, whenever a literal is set, its watchlist will be examined shortly. Hence, we can prefetch the watchlist the moment the literal is assigned, so as to prevent the CPU from stalling when the literal’s watchlist is invariably examined. This prefetching can be extended to the component finding, except there it’s occurrence lists and not watchlists. Secondly, I added blocked literals, another cache-optimising trick, by Chu et al. (paper). Blocked literals are widely used by modern SAT solvers. SharpSAT missed them because it was written before this trick was known, so I added them in.
  • Recursive conflict clause minimisation by Sörensson and Biere (paper). This was a trivial lift-and-shift from the MiniSat source code. There’s nothing much to say here other than it’s a pretty well-known technique, but it was not known at the time of SharpSAT. I believe there is nowadays a much better system by Fleury and Biere that does efficient all-UIP minimisation (paper). If you wanna lift-and-shift that code into Ganak and win the model counting competition, be my guest — Ganak is always open source and online, I have no energy to hide code.
  • Tight monitoring and management of memory usage. One of the first things you will notice with some model counters is that they eat all the RAM you have and then get killed by the kernel for lack of memory. This is unfortunately encouraged by the model counting competition guidelines, which give massive amounts of memory, up to 64GB, for the counters to use. However, when you are testing your counter, almost all cluster systems have approx 4GB of memory per core (or less) in their nodes. The cluster I use has 160 cores per node and each node has 768GB of RAM, giving 4.8GB/core. Hence, if I want to test my counter without wasting resources, it should use at most 4.8GB of memory. Since I wanted to win the model counting competition, and for that I needed to test my system a lot, I optimised the system to use about 4.5GB of memory at most. This is approx 10x less than what many other counters use. The way I did this is by (1) making sure no memory is leaked, ever, (2) precisely accounting for all memory usage, and (3) ruthlessly deleting cache items that occupy too much memory. This required using the valgrind memory leak detector, and a lot of use of valgrind’s massif heap profiler on many different types of inputs. This ensured that Ganak uses only the required amount of memory, and can safely and efficiently run in memory-constrained environments. In fact, Ganak would have won all tracks in both previous model counting competitions with only 1/10th of the allowed memory use (i.e. 6.4GB instead of 64GB).
  • A few more things that I can barely remember. For example, watchlists or clauses(?) were reverse-visited. Binary clauses were in the same watchlists as the normal clauses, but they can use a specialised watchlist and be visited faster. Clauses were removed from the watchlist using the “switch-to-the-end” trick which has been shown to be less efficient than simply overwriting. The original Ganak allocated memory for the component, hashed it, then deallocated the memory, but it could have called the hash in-situ, without the allocation, copy, and de-allocation. The original Ganak also used a hash function that was optimised to hash very large inputs, which made it impossible to compile to emscripten due to the CPU intrinsics used, and besides, it was unnecessary because it only had to hash a few KB at most. So I switched to chibihash64 by nrk.
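To illustrate the zero-suppressed Field trick from the list above, here is a minimal Python sketch. Ganak does this in C++ with actual null pointers; the function names here are mine, with None standing in for the never-allocated zero element:

```python
# Hypothetical sketch of zero-suppressed field arithmetic: the field
# zero is represented as None and never allocated.
def f_add(x, y):
    if x is None:   # 0 + y = y: just hand back y
        return y
    if y is None:   # x + 0 = x
        return x
    return x + y

def f_mul(x, y):
    if x is None or y is None:  # 0 * y = 0: stay a null "pointer"
        return None
    return x * y

# None never costs an allocation or a dereference, and zero
# is a very common value during counting.
print(f_add(None, 7))  # 7
print(f_mul(None, 7))  # None
```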

The above took about 3-4 years of work, on-and-off, since I don’t normally get paid to do research. The total time I worked on Ganak while being paid to do so was about 4-5 months. So it was mostly just passion and having some fun. The code at least is not absolutely unreadable, and there are a lot of seatbelts around: about 5-10% of the code is asserts, and there are entire sets of functions written solely to do self-checking, debug reporting, etc.

I will likely not publish any of the above ideas/improvements. Some are, I think, publishable, for example the vivification, but especially that code is nightmare-inducing to me. The fuzzing and debugging, while important to me, as I am interested in tools that work, are hard to publish and not too novel. Memory management again falls into this weird place where it’s not very novel but necessary for usable tools. Supporting many fields is just a basic requirement for a well-functioning system; besides, it’s super-easy to do, if set up right. The rest is just basic copy-paste with minor adjustments. I think the use of CCNR is actually quite fun, but I hardly think it’s worth a paper. It shaves 30-40% time off of a very slow-running (but necessary) part of the preprocessor, and I was very happy when I discovered it.

I hope you appreciated this somewhat long list. The code can of course be examined for all of them, and you can lift-and-shift some or all of the ideas into other tools and other systems. I left quite a lot of comments, and if you turn on VERBOSE_DEBUG and set verbosity very high (say, “--verb 100”) you should be able to see how all of them work in tandem.

Ganakv2 Released

Finally, after many months of waiting for our paper to be accepted (preliminary PDF here), Ganak2 is finally released (GitHub code, released binaries) and easily accessible and modifiable. I wish this release had come earlier, but double-blind requirements didn’t allow us to release the code any sooner.

Basically, Ganakv2 has been re-written in many ways. Some of the original ideas by Thurley for SharpSAT have been kept alive, but many-many things are completely different and completely re-written — well over 10’000 lines of change in total. This set of changes has allowed Ganak2 to win every available track at the Model Counting Competition of 2024. In other words, Ganak2 is incredibly fast, making it a state-of-the-art tool.

Chronological Backtracking, Enhanced SAT solving, S- and D-sets

The new Ganak has a pretty powerful SAT solver engine with its own restart strategy, polarity strategy, and variable branching strategy. This SAT solver is tightly integrated into Ganak, and reuses the same datastructures as Ganak, hence the counter/SAT transitions are smooth. This relies on the observation that once every variable in the (minimal) independent set has been assigned, there is at most one solution, so we can just run a SAT solver; there is no need for a full-blown model counter. This idea has existed at least since GPMC, but we improved on it in important ways.

A very complicated but very important aspect of Ganak2 is our integration of Chronological Backtracking, which we adapted to the model counting setting. This required a lot of attention to detail, and in order to debug it, we took full advantage of the fuzzer written and maintained by Anna Latour. We further added very serious and thorough checking to Ganak2, with varying levels of internal checking and per-node checking of counts. This allows effective debugging: at the first node where the count is incorrect, Ganak2 will immediately stop and error out, along with a full human-readable debug log of what happened. This was necessary, as some of the issues that Chronological Backtracking leads to are highly non-trivial, as mentioned in the paper.

Ganak2 now understands something we call a D-set, which is a set of variables that’s potentially much larger than the projection set, but when branched on, the count is the same. This allows Ganak to branch on far more variables than any other current model counter. We run an algorithm that takes advantage of Padoa’s theorem to make this D-set as large as possible. Further, Ganak2 takes advantage of projection set minimization, calling the minimal set an S-set, and running a SAT solver as soon as all variables from the S-set have been decided. This allows us to run a SAT solver much earlier than any other model counter.

Component Discovery, Subsumption and Strengthening, and Counting over any Field

Since the paper, I have also added a number of pretty interesting improvements, especially related to component generation, which is now a highly cache-optimized datastructure using time stamping, a trick I learned from Heule et al’s brilliant tree-based lookahead paper (worth a read). I also changed the hash function to chibihash; it’s not only faster (on the small inputs we run it on) but also doesn’t rely on any specific instruction set, so it can be compiled to emscripten, and so Ganak2 can run in your browser.

There is also a pretty experimental on-the-fly subsumption and strengthening system. It actually rewrites the whole search tree in order to achieve this, which is a lot harder to do than I imagined. It’s the most trivial subsumption & strengthening code you can imagine, but the rewriting is hell on earth. However, the rewriting is necessary if we want to be able to continue counting without restarting the whole counting process.

Ganak, unlike many of its counterparts, also manages memory very tightly and tries not to die due to memory-out. The target memory usage of the cache can be given via --maxcache in MB. A cache size of 2500MB is the default, which should cap out at about 4GB total memory usage. More cache should lead to better performance, but we actually won (every available track of) the Model Counting Competition 2024 with rather low memory usage: I think I restricted it to 8-12GB — other competitors ran with ~2x that.

Furthermore, Arjun, our CNF minimizer, has also been tuned. While this is not specifically Ganak, it gives a nice improvement, and has some cool ideas that I am pretty happy about. In particular, it runs the whole preprocessing twice, and it incorporates Armin Biere’s CadiBack for backbone detection. I wish I could improve this system a bit, because sometimes CadiBack takes >1000s to run, but without it, counting is even slower in most cases. But not always: sometimes backbone detection is just a waste of time.

The system now also allows counting over any field: all you need to do is implement the +/-/*/div operators and the 0 and 1 constants. I have implemented integers, rationals, modulo-prime counting, and counting over the field of polynomials with rational coefficients. All you need to do is override the Field and FieldGen interfaces. I made it work for modulo-prime counting in under 20 minutes; it’s like 30 lines of code.
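To give a rough feel for how little is needed to plug in a new field, here is a hypothetical Python rendition of such an interface. The real interfaces are Ganak’s C++ Field and FieldGen; the names and signatures below are illustrative only, covering just the constants and operators the counter core needs:

```python
from abc import ABC, abstractmethod

# Illustrative sketch of a pluggable field interface (names are mine,
# not Ganak's actual C++ API): the counter only ever needs the 0 and 1
# constants plus the arithmetic operators.
class Field(ABC):
    @abstractmethod
    def zero(self): ...
    @abstractmethod
    def one(self): ...
    @abstractmethod
    def add(self, x, y): ...
    @abstractmethod
    def mul(self, x, y): ...

class PrimeField(Field):
    """Counting modulo a prime p."""
    def __init__(self, p):
        self.p = p
    def zero(self): return 0
    def one(self): return 1
    def add(self, x, y): return (x + y) % self.p
    def mul(self, x, y): return (x * y) % self.p

f = PrimeField(7)
print(f.add(5, 4))  # 2
print(f.mul(5, 4))  # 6
```

The counter core would only ever call `zero`/`one`/`add`/`mul`, so swapping in rationals or polynomials means writing one more class like `PrimeField`.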

Floating Point Numbers

Notice that floating point numbers don’t form a field, because 0.3+(0.2+0.1) is not equal to (0.3+0.2)+0.1, i.e. addition is not associative (thanks to Martin Hořeňovský for this correction, previously I said “commutative”). Actually, Ganak2’s weighted model counting doesn’t currently support floating point numbers, because it doesn’t need to — we won literally every available track with infinite-precision rational numbers: 0.3 is simply interpreted as 3/10. This sounds minor, but it also means that it’s actually trivial to check whether Ganak runs correctly — it simply needs to be run with a different configuration, and it should produce the same solution. Notice that this is not at all true for any other weighted model counter. Literally all of them, except for Ganak, will give different solutions if run with different seeds. Which is hilarious, since there is obviously only one correct solution. But since 0.3+(0.2+0.1) is not equal to (0.3+0.2)+0.1 in floating point, they can’t do anything… rational.
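You can verify the non-associativity, and the exact-rational fix, in a few lines of Python:

```python
from fractions import Fraction

# Floating point addition is not associative...
print(0.3 + (0.2 + 0.1) == (0.3 + 0.2) + 0.1)  # False

# ...but exact rationals are: 0.3 is read as 3/10, and the
# order of the additions cannot change the result.
a, b, c = Fraction(3, 10), Fraction(2, 10), Fraction(1, 10)
print(a + (b + c) == (a + b) + c)  # True
```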

This whole floating point saga is actually quite annoying, as it also means we can’t check Ganak against any other counter — at least not easily. All other counters give wrong answers, given that floating point arithmetic is inexact, and so we can only compare Ganak2 to other counters if we allow a certain deviation. Which is pretty hilarious, given that these counters all claim to use “infinite precision”. In fact, the only thing infinite about their results is likely the error. Since (0.3+(0.2+0.1)) - ((0.3+0.2)+0.1) is not 0, it is actually possible to have theoretically infinite error in case of negative weights.

Approximate Counting and Ganak

Ganak2 incorporates ApproxMC, and can seamlessly transition into counting via approximate model counting methods. To do this, simply pass the “--appmct T” flag, where T is the number of seconds after which Ganak will start counting in approximate mode. This can only be done for unweighted counting, as ApproxMC can only do unweighted counting. However, this seamless transition is very interesting to watch, and demonstrates that a multi-modal counting framework, i.e. exact+approximate, is a very viable strategy.

What happens under the hood is currently unpublished, but basically, we stop the exact counting, check how much we have counted so far, subtract that from the formula, and pass the rest to ApproxMC. We then take the result from ApproxMC, add the partial count that Ganak computed, and display the result to the user. The final count is thus partially approximate and partially exact, which means we actually give better approximation guarantees (epsilon and delta) than what we promise.

Compiling Ganak

The new Ganak incorporates 8 libraries: GNU MP, CryptoMiniSat, Arjun, ApproxMC, SBVA, CaDiCaL, CadiBack, and BreakID. Furthermore, it optionally compiles in BuDDy, Oracle (from Korhonen and Jarvisalo), and Flow-Cutter. These are all individually mentioned and printed to the console when running. However, this means that building Ganak is not trivial. Hence, with the very generous help of Noa Aarts, Ganak has a Nix flake, so you can simply:

git clone https://github.com/meelgroup/ganak
cd ganak
nix-shell

And it will reproducibly build Ganak and make it available in the path. All this requires is having Nix installed, which is a one-liner and works incredibly reliably. Otherwise, the Ganak repository can be cloned together with its GitHub Actions, and then each push to the repository will build Ganak2 for 4 different architectures: Linux x86 & ARM, and Mac x86 & ARM.

Submit your Version of Ganak to the 2025 Competition

Ganak2 is highly extensible/modifiable. I strongly encourage you to extend/change/improve it and submit your improvement to the Model Counting Competition of 2025. The deadline is in 3 weeks, so it’s time to get your idea into Ganak and win. I promise I will have all my changes publicly available up until then, and you can literally submit the same thing I will, if you like. I’d prefer if you put in some cool change, though.

On SAT/SMT Model Counting

I have recently had a few discussions with people who wanted to do some form of model counting, and I decided to put my thoughts about this problem into a single blog post, hopefully bringing some light to this area. Firstly, when talking about model counting, it’s very important to think through what one actually needs to achieve. When people talk about a model, they often mean a solution to a problem, described as a set of constraints over either purely boolean variables (so-called propositional model counting) expressed in so-called CNF, or over higher-level constraints, often expressed in SMT-LIB format.

Let’s look at a CNF example:

c this is a comment
c below line means there are 5 boolean variables and 2 constraints
p cnf 5 2
1 -2 0
1 -2 3 0

The above CNF has 5 boolean variables and 2 constraints: “v1 or (not v2) = true”, and “v1 or (not v2) or v3 = true”. In this case, if “v1 or (not v2) = true” is satisfied, the other constraint is also satisfied. So we effectively only have to satisfy the “v1 or -v2” constraint. There are 3 solutions over the variables v1 and v2 to this constraint: (1, 0), (1, 1), and (0,0). For variables v3, v4, and v5 we can choose anything. So there are a total of 3*2^3 = 24 solutions to this CNF.
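This count is small enough to verify by brute force. A quick Python sketch over the DIMACS-style clauses above:

```python
from itertools import product

clauses = [[1, -2], [1, -2, 3]]  # DIMACS-style: 1 means v1, -2 means NOT v2
n_vars = 5

def satisfies(assignment, clause):
    # assignment[i] is the value of variable i+1; a clause is satisfied
    # if any of its literals evaluates to true.
    return any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)

count = sum(all(satisfies(a, cl) for cl in clauses)
            for a in product([False, True], repeat=n_vars))
print(count)  # 24
```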

Let’s look at an SMT problem:

(set-logic QF_BV) ; Set the logic to Quantifier-Free BitVectors
(declare-fun bv () (_ BitVec 8)) ; Declare an 8-bit value
(assert (not (= bv #x00))) ; Assert that 'bv' is not equal to zero

In this SMT-LIB problem, we declared that the logic we will be using is the so-called quantifier-free bitvector logic, which basically gives you access to fixed-width variables such as uint8_t, and of course all the logical and fixed-bit bitvector operators such as modulo addition, division, subtraction, etc. In this particular case, we declared an 8-bit variable and banned it from being 0, so this problem has 2^8-1 solutions, i.e. 255.
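Again, this is small enough to check by brute force in Python:

```python
# Brute-force check of the bitvector count: an 8-bit value ranges over
# 0..255, and the single constraint bans only bv == 0.
count = sum(1 for bv in range(2**8) if bv != 0)
print(count)  # 255
```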

Whichever format is being used, “counting” solutions is something that may sound trivial (it’s just a number!), but once we get a bit more fiddly, it turns out that there are many tasks that are actually easier, or even harder, than just counting, even though they sound very close to, or are even synonymous with, counting:

  • Figuring out if there is at least one solution. This is more appropriately called satisfiability checking. This is significantly easier than any notion of counting listed below. However, even this can be challenging. Typically, in the SMT/SAT competition, this is the so-called SMT single query track or the SAT competition Main Track. All SAT and SMT solvers such as kissat or cvc5 can run this kind of task.
  • Next up in complexity is to figure out if there is exactly one solution. This is actually very close to satisfiability checking, and requires only two so-called “incremental calls” to the SMT/SAT solver. I have seen this question pop up a lot in zero-knowledge (ZK) circuit correctness. What is needed is to run the query as per the above, but then add a constraint that bans the solution that is output by the solver, and ask the system to be solved again. If the solver outputs UNSAT, i.e. unsatisfiable, then we know there are no more solutions, and we are happy with the result. If not, it gives us another solution, which is a witness that there are at least two solutions. These queries are answered by so-called incremental SMT solvers (SMT competition track here), or incremental SAT solvers (incremental SAT competition track here). Almost all SAT and SMT solvers can work incrementally, for example cvc5 or z3. Most SAT solvers work incrementally, too, but e.g. kissat does not, and you need to use CaDiCaL instead.
  • Next up in complexity is to approximately count the solutions. Here, we are not interested in knowing that there are exactly, say, 87231214 solutions; instead, we are happy to know that there are approx 2^26 solutions. This is often good enough, in case one wants to e.g. figure out probabilities. There are very few such systems available; for SMT there is csb, and for SAT there is ApproxMC. For the special case of approximate DNF counting, I’d recommend pepin.
  • Next up in complexity is exactly counting the solutions. Here, we want to know exactly how many solutions there are, and we really-really need the exact number. There are few tools for doing this over SMT, but doing this over SAT is quite developed. For SMT, I would recommend following the csb approach, but running Ganak on the blasted SAT formula instead of ApproxMC; for SAT, there is the annual model counting competition, where Ganak won every single category this year, so I’d recommend Ganak.
  • Next up is model enumeration, where it’s not enough to count the solutions: we need a way to compactly represent and store them all. This of course requires quite a bit of disk space, as there could be e.g. 2^100 solutions, and representing them all, even in a very compact format with lots of smart ideas, can take a lot of space — though not necessarily an exponential amount. Currently, there is no such system for SMT, and my understanding is that only the d4 model counter can do this on SAT/CNF problems, thanks to its proof checking system, due in part to Randal E. Bryant. Once Ganak can output a proof, it will also be able to do this.
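The “exactly one solution” check from the list above can be sketched in a few lines. Here a brute-force enumerator stands in for a real incremental solver such as CaDiCaL or cvc5; the function names are mine, made up for illustration:

```python
import itertools

def solve(clauses, n_vars):
    """Brute-force stand-in for an incremental SAT solver's solve call.
    Returns a model as a dict {var: bool}, or None if UNSAT."""
    for bits in itertools.product([False, True], repeat=n_vars):
        model = {v + 1: bits[v] for v in range(n_vars)}
        if all(any(model[abs(l)] == (l > 0) for l in cl) for cl in clauses):
            return model
    return None

def has_exactly_one_solution(clauses, n_vars):
    first = solve(clauses, n_vars)
    if first is None:
        return False  # no solution at all
    # ban the first model: at least one variable must flip
    blocking = [-v if first[v] else v for v in range(1, n_vars + 1)]
    return solve(clauses + [blocking], n_vars) is None  # UNSAT => unique

# (a or b) and (b or -c) from the introduction: 5 solutions, so not unique
print(has_exactly_one_solution([[1, 2], [2, -3]], 3))  # False
# unit clauses pin every variable, leaving a single model
print(has_exactly_one_solution([[1], [2], [3]], 3))    # True
```

A real incremental solver does exactly this pattern, just without re-enumerating: the blocking clause is added to the solver's state and solving resumes from where it left off.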

All in all, there is a very large gap between “find if there is exactly one solution” and “enumerate all solutions” — even though I have seen these two notions being mixed up a few times. The former might take only a few seconds to solve, while the latter might not even be feasible, because compactly representing all solutions may be too expensive in terms of space (and that’s not even discussing time).

One aspect to keep in mind is that often, one does not need all solutions. For example, if one only wants a good set of example solutions, then it is possible to use uniform-like samplers that give a “somewhat uniform” set of example solutions from the solution space (without guarantees of uniformity). A good tool for this is cmsgen for CNF/SAT problems. If one needs a guaranteed uniform set of samples (to be exact, a probabilistically approximately uniform sample), then unigen is the way to go. Obviously, it’s quite easy to get a uniform sample once we have an enumeration of all solutions — simply pick a random one from the whole set — but that’s way more expensive than running unigen.
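To illustrate why enumeration-based sampling is so expensive: it requires listing every model first. A brute-force sketch on the toy CNF from the introduction (fine for 3 variables, hopeless for 300):

```python
import itertools, random

def all_solutions(clauses, n_vars):
    """Enumerate every model of a tiny CNF. Exponential: illustration only."""
    sols = []
    for bits in itertools.product([0, 1], repeat=n_vars):
        if all(any((bits[abs(l) - 1] == 1) == (l > 0) for l in cl)
               for cl in clauses):
            sols.append(bits)
    return sols

# (a or b) and (b or -c) from the introduction has 5 models
models = all_solutions([[1, 2], [2, -3]], 3)
print(len(models))            # 5
print(random.choice(models))  # one uniform sample from the enumeration
```

Tools like unigen get the same uniformity guarantee (approximately) without ever materializing the full solution set.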

While all of the above is about counting, it’s often the case that one wants to optimize for something within the solution space. Basically, we have some kind of objective function that we want to optimize for (e.g. maximize profit, minimize loss), and so we are looking for a solution that maximizes/minimizes some property. Of course, once we have done model enumeration, it’s quite easy to do this — simply run the objective function on all solutions, and pick the best! But as discussed, enumeration is extremely expensive. Instead, one can formulate the problem as a so-called MaxSAT problem or an SMT problem with an objective function. These systems run some really smart algorithms to significantly cut down search time, and often even keep a “best so far” solution around, so we can early-abort them and still get some kind of “best so far” solution.

All in all, it’s extremely important to figure out what one actually needs, because there are vast gaps in complexity between the problems (and corresponding algorithms) above. Even between something like approximate and exact counting, the gap is gigantic, as there are huge swaths of problems that are almost trivial to count approximately, but essentially impossible to exactly count with current technology:

Here, the green line is a tool that changes its behaviour from exact to approximate model counting after 1800 seconds of solving, and the number of solved instances jumps significantly. Notice that the red line, the original system that runs for the full 1800s, is the best in class here by a wide margin, and still the approximate counting that Ganak+ApproxMC switches into makes a massive difference.

Our tools for solving, counting and sampling

This post is just a bit of a recap of what we have developed over the years as part of our toolset of SAT solvers, counters, and samplers. Many of these tools depend on each other, and have borrowed greatly from other tools, papers, and ideas. These dependencies are too long to list here, but the list is long, probably starting somewhere around the Greek period, and goes all the way to recent work such as SharpSAT-td or B+E. My personal work stretches back to the beginning of CryptoMiniSat in 2009, and the latest addition to our list is Pepin.

Overview

Firstly, when I say “we” I loosely refer to the work of my colleagues and myself, often but not always part of the research group led by Prof Kuldeep Meel. Secondly, almost all these tools depend on CryptoMiniSat, a SAT solver that I have been writing since around 2009. This is because most of these tools use DIMACS CNF as the input format and/or make use of a SAT solver, and CryptoMiniSat is excellent at reading, transforming, and solving CNFs. Thirdly, many of these tools have a Python interface, some connected to PySAT. Finally, all these tools are maintained by me personally, and all have a static Linux executable as part of their release; many have a MacOS binary, and some even a Windows binary. All of them build with open source toolchains using open source libraries, and all of them are either MIT or GPL licensed. There are no stale issues in their respective GitHub repositories, and most of them are fuzzed.

CryptoMiniSat

CryptoMiniSat (research paper) is our SAT solver that can solve and pre- and inprocess CNFs. It is currently approx 30k+ lines of code, with a large part of the codebase dedicated to CNF transformations, also called “inprocessing” steps. These transformations are accessible to the outside via an API that many of the other tools take advantage of. CryptoMiniSat used to be a state-of-the-art SAT solver, and while it’s not too shabby even now, it hasn’t had the chance to shine at a SAT competition since 2020, when it came 3rd place. It’s hard to keep a SAT solver competitive; there are many aspects to such an endeavor, but mostly it takes energy and time, some of which I have lately redirected into other projects, see below. Nevertheless, it’s a cornerstone of many of our tools, and e.g. large portions of ApproxMC and Arjun are in fact implemented in CryptoMiniSat, so that improvements in one tool can benefit all the others.

Arjun

Arjun (research paper) is our tool to make CNFs easier to count with ApproxMC, our approximate counter. Arjun takes a CNF with or without a projection set, and computes a small projection set for it. What this means is that if, say, the question was: “How many solutions does this CNF have if we only count solutions as distinct over variables v4, v5, and v6?”, Arjun can compute that it is in fact sufficient to e.g. compute the solutions over variables v4 and v5, and the count will be the same as over v4, v5, and v6. This can make a huge difference for large CNFs, where e.g. the original projection set can be 100k variables, but Arjun can compute a projection set sometimes as small as a few hundred. Hence, Arjun is used as a preprocessor for our model counters ApproxMC and GANAK.

ApproxMC

ApproxMC (research paper) is our probabilistically approximate model counter for CNFs. This means that when ApproxMC gives a result, it gives it in the form of “The model count is between 0.9*M and 1.1*M, with a probability of 99%, and with a probability of 1%, it can be any value”. This is very often enough for most cases of counting, and is much easier to compute than an exact count. It counts by basically halving the solution space K times and then counting the remaining solutions. Then, the count is estimated to be 2^(how many times we halved)*(how many solutions remained). This halving is done using XOR constraints, which CryptoMiniSat is very efficient at. In fact, no other state-of-the-art SAT solver can currently perform XOR reasoning other than CryptoMiniSat.
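In sketch form, the counting loop looks roughly like the following Python. The XOR constraints are modeled as parity checks over random variable subsets, and a brute-force enumerator plays the role of CryptoMiniSat; the real ApproxMC also randomizes the parity bit and repeats the whole experiment to obtain its probabilistic guarantees:

```python
import itertools, random

def solutions(clauses, xors, n_vars):
    """All assignments satisfying the CNF clauses plus the XOR constraints.
    Brute force, standing in for CryptoMiniSat in this tiny illustration."""
    out = []
    for bits in itertools.product([0, 1], repeat=n_vars):
        cnf_ok = all(any((bits[abs(l) - 1] == 1) == (l > 0) for l in cl)
                     for cl in clauses)
        # each XOR demands odd parity over its variables (the real tool
        # also picks the parity bit at random; fixed to 1 here)
        xor_ok = all(sum(bits[v - 1] for v in xr) % 2 == 1 for xr in xors)
        if cnf_ok and xor_ok:
            out.append(bits)
    return out

def approx_count(clauses, n_vars, threshold=4, seed=1):
    """Halve the solution space with random XORs until few solutions are
    left, then estimate: 2^(number of XORs) * (number of survivors)."""
    rnd = random.Random(seed)
    xors = []
    while len(solutions(clauses, xors, n_vars)) > threshold:
        # each variable joins the new XOR with probability 1/2
        xors.append([v for v in range(1, n_vars + 1) if rnd.random() < 0.5])
    return (2 ** len(xors)) * len(solutions(clauses, xors, n_vars))

# (a or b) and (b or -c) from the introduction has exactly 5 solutions
print(approx_count([[1, 2], [2, -3]], 3))  # a power-of-two-scaled estimate
```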

UniGen

UniGen (research paper) is an approximate probabilistic uniform sample generator for CNFs. Basically, it generates samples that are probabilistically approximately uniform. This can be helpful for example if you want to generate test cases for a problem, and you need the samples to be almost uniform. It uses ApproxMC to first count, and then uses the same idea as ApproxMC to sample: add as many XORs as needed to halve the solution space down to a small set, and then take K random elements from the remaining (small) set of solutions. These will be the samples returned. Notice that UniGen depends on ApproxMC for counting, Arjun for projection minimization, and CryptoMiniSat for the heavy-lifting of solution/UNSAT finding.

GANAK

GANAK (research paper, binary) is our probabilistic exact model counter. In other words, it returns a solution such as “This CNF has 847365 solutions, with a probability of 99.99%, and with 0.01% probability, any other value”. GANAK is based on SharpSAT and some parts of SharpSAT-td and GPMC. In its currently released form, it is in its infancy, and while usable, it needs e.g. Arjun to be run on the CNF beforehand, and while competitive, its ease-of-use could be improved. Vast improvements are in the works, though, and hopefully things will be better for the next Model Counting Competition.

CMSGen

CMSGen (research paper) is our fast, weighted, uniform-like sampler: it tries to give samples that are as uniform as it can, but it provides no guarantees of uniformity. Despite the lack of guarantees, it is surprisingly good at generating uniform samples. While these samples cannot be trusted in scenarios where the samples must be uniform, they are very effective in scenarios where a less-than-uniform sample will only degrade the performance of a system. For example, they are great at refining machine learning models, where samples are taken uniformly at random from the area of input where the ML model performs poorly, to further train (i.e. refine) the model on inputs where it is performing poorly. Here, if the sample is not uniform, it will only slow down the learning, but not make it incorrect. However, generating provably uniform samples in such scenarios may be prohibitively expensive. CMSGen is derived from CryptoMiniSat, but does not import it as a library.

Bosphorus

Bosphorus (research paper) is our ANF solver, where ANF stands for Algebraic Normal Form. It’s a format used widely in cryptography to describe constraints via multivariate polynomials over the field GF(2). Essentially, it’s equations such as “a XOR b XOR (b AND c) XOR true = false” where a,b,c are booleans. These allow some problems to be expressed in a very compact way, and solving them can often be tantamount to breaking a cryptographic primitive such as a symmetric cipher. Bosphorus takes such a set of polynomials as input and tries to simplify them via a set of inprocessing steps and SAT solving, and/or tries to solve them via translation to a SAT problem. It can output an equivalent CNF, too, that can e.g. be counted via GANAK, which will give the count of solutions to the original ANF. In this sense, Bosphorus is a bridge from ANF into our set of CNF tools above, allowing cryptographers to make use of the wide array of tools we have developed for solving, counting, and sampling CNFs.

Pepin

Pepin (research paper) is our probabilistically approximate DNF counter. DNF is basically the reverse of CNF — it’s trivial to ascertain whether there is a solution, but very hard to know whether all possible assignments are solutions. However, it is actually extremely fast to probabilistically approximate how many solutions a DNF has, and Pepin does exactly that. It’s one of the very few tools we have that doesn’t depend on CryptoMiniSat, as it deals with DNFs, not CNFs. It basically blows all other such approximate counters out of the water, and of course its speed is incomparable to that of exact counters. If you need to count a DNF formula, and you don’t need an exact result, Pepin is a great tool of choice.

Conclusions

My personal philosophy has been that if a tool is not easily accessible (e.g. having to email the authors) and has no support, it essentially doesn’t exist. Hence, I try my best to keep the tools I feel responsible for accessible and well-supported. In fact, this runs so deep, that e.g. CryptoMiniSat uses the symmetry breaking tool BreakID, and so I made that tool into a robust library, which is now being packaged by Fedora, because it’s needed by CryptoMiniSat. In other words, I am pulling other people’s tools into the “maintained and supported” list of projects that I work with, because I want to make use of them (e.g. BreakID now builds on Linux, MacOS, and Windows). I did the same with e.g. the Louvain Community library, which had a few oddities/issues I wanted to fix.

Another oddity of mine is that I try my best to make our tools make sense to the user, work as intended, give meaningful (error) messages, and good help pages. For example, none of the tools I develop call subprocesses that make it hard to stop a computation, and none use a random number seed that can lead to reproducibility issues. While I am aware that working tools are sometimes less respected than a highly cited research paper, and so in some sense I am investing my time in a slightly suboptimal way, I still feel obliged to make sure the tax money spent on my academic salary gives something tangible back to the people who pay for it.

How Approximate Model Counting Works

Approximate model counting allows one to count the number of solutions (or “models”) of propositional satisfiability problems. This problem seems trivial at first, given a propositional solver that can find a single solution: find one solution, ban it, ask for another one, ban it, etc. until all solutions are counted. The issue is that sometimes the number of solutions is e.g. 2^50, and counting this way is far too slow. There are about 2^266 atoms in the universe, so counting anywhere near that is impossible using this method.

Exact Counting

Since we cannot count 1-by-1, we are then left with trying to count in some smarter way. There are a bunch of methods to count exactly, the simplest is to cut the problem on a variable, count when the variable is True, count when the variable is False, recursively, and add them all up. Given caching of components that recur while “cutting” away, this can be quite successful, as implemented by sharpSAT (see Marc Thurley’s paper).
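The split-and-recurse idea can be sketched in a few lines of Python. This leaves out exactly the things that make sharpSAT fast — component decomposition and caching — but shows the core recursion:

```python
def count_models(clauses, variables):
    """Exact model count by splitting on a variable and recursing.
    Real counters add component analysis and caching on top of this."""
    if any(len(cl) == 0 for cl in clauses):
        return 0                      # empty clause: conflict, no models
    if not variables:
        return 1                      # all variables assigned, all clauses satisfied
    v, rest = variables[0], variables[1:]
    total = 0
    for lit in (v, -v):               # try v = True, then v = False
        # clauses containing lit become satisfied; -lit drops out of the rest
        reduced = [[l for l in cl if l != -lit]
                   for cl in clauses if lit not in cl]
        total += count_models(reduced, rest)
    return total

# (a or b) and (b or -c) from the introduction
print(count_models([[1, 2], [2, -3]], [1, 2, 3]))  # 5
```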

These counters can scale quite well, but have some downsides. In particular, when the memory runs out, the cache needs to be groomed, sometimes falling back to 1-by-1 counting, which we know will fail, as there is no way 2^200 can be counted 1-by-1 in any reasonable amount of time. The caching systems used are smart, though, retaining the most recently used entries when the cache needs to be groomed. However, sometimes this grooming algorithm can lead to cyclic behaviour that effectively simulates 1-by-1 counting.

Approximate Counting

What Chakraborty, Meel and Vardi did in their paper was to create a counter that counts not exactly, but “probably approximately correctly”. This jumble of terms basically means: there is a certain probability that the count is correct, within a given tolerance. We can improve both the probability and the tolerance by spending more CPU time. In practical terms, the probability can be set to over 99.99% and the tolerance to under 20%, while still beating exact counters. Note that 20% is not that much, as e.g. 2^60*1.2 = 2^(60.3).

A Galton Box

So what’s the trick? How can we approximately count and give guarantees about the count? The trick is in fact quite simple. Let’s say you have to count balls, and there are thousands of them. One way to do it is to count 1-by-1. But if you have a machine that can approximately halve the number of balls you have, it can be done a lot faster: you halve the balls, then check whether you have at least 5 by counting 1-by-1. If you do, you halve them again, check again, etc. Eventually, let’s say you have halved them 11 times and are now left with 3 balls. So approximately, you must have had 3*2^11 = 6144 balls to begin with. In the end you had to execute the 1-by-1 count only 11*5+3+1 = 59 times — a lot fewer than 6144! This is the idea used by ApproxMC.

Approximately Halving Using XORs

The “approximate halving” function used by ApproxMC is the plain XOR function, populated with variables picked with 50% probability. So for example if we have variables v1…v10, we pick each variable with 50% probability and add them into the same XOR. Let’s say we picked v1,v2,v5 and v8. The XOR would then be: v1⊕v2⊕v5⊕v8=1. This XOR is satisfied if an odd number of variables from the set {v1,v2,v5,v8} are 1. This intuitively forbids about half the solutions. The “intuitively” part of course is not enough, and if you read the original paper you will find the rigorous mathematical proof for the approximate halving of solutions this XOR function provides.
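A quick experiment confirms the halving: over the full assignment space, any non-empty XOR of this kind is satisfied by exactly half of the assignments; over an arbitrary solution set it is only roughly half, which is what the rigorous proof in the paper is about. A sketch:

```python
import itertools, random

rnd = random.Random(42)
n_vars = 10
# pick each variable into the XOR with probability 1/2
xor_vars = [v for v in range(n_vars) if rnd.random() < 0.5]

all_assignments = list(itertools.product([0, 1], repeat=n_vars))
# keep assignments where an odd number of the picked variables are 1
kept = [a for a in all_assignments if sum(a[v] for v in xor_vars) % 2 == 1]
print(len(all_assignments), len(kept))  # 1024 512
```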

OK, so all we need to do is add these XORs to our original problem and we are done! This sounds easy, but there is a small hurdle and there is a big hurdle associated with this.

The small hurdle is that the original problem is a CNF, i.e. a conjunction of disjunctions, looking like “(v1 OR v2) AND (v2 OR not v3 OR not v4) AND…”. The XOR obviously does not look like this. The straightforward translation of XOR into CNF is exponential, so we need to add some variables to cut them smaller. It’s not that hard to figure this out and eventually add all the XORs into the CNF.
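The translation can be sketched as follows: a direct XOR-to-CNF conversion needs 2^(k-1) clauses for a k-variable XOR, so long XORs are first chopped into short chained pieces using fresh variables. The function names here are mine, for illustration only:

```python
import itertools

def xor_to_cnf(lits, rhs):
    """Direct CNF for lits[0] ^ ... ^ lits[-1] = rhs. Emits 2^(k-1)
    clauses, i.e. exponential in k: only usable on short XORs."""
    clauses = []
    for signs in itertools.product([1, -1], repeat=len(lits)):
        # a clause over all k variables forbids exactly one assignment
        # (the one falsifying every literal); forbid each wrong-parity one
        if signs.count(-1) % 2 != rhs:
            clauses.append([s * l for s, l in zip(signs, lits)])
    return clauses

def cut_long_xor(lits, rhs, n_vars, max_len=3):
    """Chop a long XOR into chained short XORs via fresh variables:
    x1^x2^t = 0 makes t carry the parity of x1^x2, and so on."""
    pieces, lits = [], list(lits)
    while len(lits) > max_len:
        n_vars += 1  # fresh auxiliary variable
        head, lits = lits[:max_len - 1], lits[max_len - 1:]
        pieces.append((head + [n_vars], 0))
        lits = [n_vars] + lits
    pieces.append((lits, rhs))
    clauses = [cl for p, r in pieces for cl in xor_to_cnf(p, r)]
    return clauses, n_vars

# x1^x2^x3^x4 = 1 becomes two 3-variable XORs via one fresh variable x5
clauses, n_vars = cut_long_xor([1, 2, 3, 4], 1, n_vars=4)
print(n_vars, len(clauses))  # 5 8
```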

XORs, CDCL, and Gauss-Jordan Elimination

The larger hurdle is that once the XORs are in the CNF using the translation, the CNF becomes exponentially hard to solve using standard CDCL as used in most SAT solvers. This is because Gauss-Jordan elimination is exponentially hard for the standard CDCL to perform — but we need Gauss-Jordan elimination because the XORs will interact with each other, as we expect there to be many of them. Without being able to resolve the XORs with each other efficiently and derive information from them, it will be practically impossible to solve the CNF.

The solution is to tightly integrate Gauss-Jordan elimination into the solving process. CryptoMiniSat was the first solver to do this tight integration (albeit only for Gaussian elimination, which is sufficient). Other solvers have followed, in particular the work by Tero Laitinen and work by Cheng-Shen Han and Jie-Hong Roland Jiang. CryptoMiniSat currently uses the code by Han and Jiang after some cleanup and updates.
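To see what Gaussian elimination buys us on XORs, here is a toy GF(2) eliminator over bitmask rows — purely an illustration of the algebra, not CryptoMiniSat’s actual incremental implementation:

```python
def gauss_gf2(rows):
    """Gaussian elimination over GF(2). Each row is (mask, rhs): bit i of
    mask set means variable i is in the XOR; rhs is the parity bit."""
    pivots = {}  # pivot bit -> reduced row owning that pivot
    for mask, rhs in rows:
        while mask:
            p = mask & -mask          # lowest set bit as candidate pivot
            if p not in pivots:
                pivots[p] = (mask, rhs)
                break
            pmask, prhs = pivots[p]   # eliminate with the existing pivot row
            mask, rhs = mask ^ pmask, rhs ^ prhs
        else:
            if rhs:
                return None           # derived 0 = 1: the XORs are unsatisfiable
    return sorted(pivots.values())

# x1^x2^x3 = 1 together with x1^x2 = 0 forces x3 = 1:
print(gauss_gf2([(0b111, 1), (0b011, 0)]))  # [(4, 1), (7, 1)]; (4, 1) means x3 = 1
```

Deriving “x3 = 1” here is exactly the kind of propagation that plain CDCL on the CNF translation cannot do efficiently.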

The latest work on ApproxMC and CryptoMiniSat has added one more thing besides the tight integration into the CDCL cycle: it now allows in- and pre-processing to occur while the XORs are inside the system. This brought some serious speedups, as pre- and inprocessing are important factors in SAT solving. In fact, all modern SAT solvers strongly depend on them being active and working.

Concluding Remarks

We have gone from what model counting is, through how approximate counting works from a high-level perspective, all the way to the detail of running such a system inside a modern SAT solver. In case you want to try it out, you can do it by downloading the pre-built binaries or building it from source.