I am trying out different solvers for a toy clique problem and was surprised to find that ortools using SAT seems much faster than z3. I am wondering if I am doing something wrong given that z3 does so well in the published benchmarks. Here is an MWE:

First create a random graph with 150 vertices:

# First make a random graph and find the largest clique size in it
import igraph as ig
import random
from tqdm import tqdm
random.seed(7)
num_variables = 150  # bigger values widen the running-time gap between z3 and ortools
print("Making graph")
g = ig.Graph.Erdos_Renyi(num_variables, 0.6)

# Make a set of edges. Maybe this isn't necessary
print("Making set of edges")
edges = set()
for edge in tqdm(g.es):
    edges.add((edge.source, edge.target))
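
Aside: the membership tests below assume each edge is stored as (smaller, larger). If igraph's endpoint ordering is ever a concern, a defensive variant of the loop above normalizes the tuple explicitly:

for edge in tqdm(g.es):
    a, b = edge.source, edge.target
    edges.add((min(a, b), max(a, b)))  # normalize: smaller endpoint first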

Now use z3 to find the max clique size:

import z3
from time import time
z3.set_option(verbose=1)
myVars = []
for i in range(num_variables):
    myVars += [z3.Int('v%d' % i)]

opt = z3.Optimize()

for i in range(num_variables):
    opt.add(z3.Or(myVars[i] == 0, myVars[i] == 1))

for i in tqdm(range(num_variables)):
    for j in range(i+1, num_variables):
        if not (i, j) in edges:
            opt.add(myVars[i] + myVars[j] <= 1)

t = time()
h = opt.maximize(sum(myVars))
opt.check()
print(round(time()-t,2))

This takes around 70 seconds on my PC.
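
To read the optimum back (a sketch, not part of the timing above): after check() returns sat, the model carries the optimal assignment, so the objective can be evaluated directly.

m = opt.model()
print("max clique size:", m.evaluate(sum(myVars)))  # evaluate the objective in the model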

Now do the same thing using SAT from ortools.

from ortools.linear_solver import pywraplp
from time import time

solver = pywraplp.Solver.CreateSolver('SAT')
solver.EnableOutput()

myVars = []

for i in range(num_variables):
    myVars += [solver.IntVar(0.0, 1.0, 'v%d' % i)]

for i in tqdm(range(num_variables)):
    for j in range(i+1, num_variables):
        if not (i, j) in edges:
            solver.Add(myVars[i] + myVars[j] <= 1)

print("Solving")
solver.Maximize(sum(myVars))
t = time()
status = solver.Solve()
print(round(time()-t,2))

This takes about 4 seconds on my PC.
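
For completeness, a sketch of reading the result back via the standard pywraplp accessors:

if status == pywraplp.Solver.OPTIMAL:
    print("max clique size:", int(solver.Objective().Value()))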

If you increase num_variables the gap grows even larger.

Am I doing something wrong or is this just a really bad case for the z3 optimizer?


Update

It turns out that ortools using SAT is multi-core by default, using 8 workers, so the timings are unfair. Using @AxelKemper's improvement I now get (on a different machine):

  • Z3 time 88 seconds. If we assume perfect parallelisation that is 11 seconds on 8 cores.
  • ortools 4.3 seconds

So Z3 is only 2.5 times slower than ortools.
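
For a fairer single-core comparison, CP-SAT can be limited to one worker. A sketch, assuming pywraplp forwards CP-SAT's text-format parameters through SetSolverSpecificParametersAsString:

solver = pywraplp.Solver.CreateSolver('SAT')
solver.SetSolverSpecificParametersAsString('num_search_workers:1')  # assumed parameter string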

Update 2

Using @alias's improved code that uses `s.add(z3.PbGe([(x, 1) for x in myVars], k))`, it now takes 30.4 seconds, which when divided by 8 is faster than ortools!

Simd
  • Is `z3.Or(myVars[i]==0, myVars[i] == 1)` really the right way to make an int variable that's either 0 or 1? I note that your other program is able to specify the bounds directly, which may help it infer bounds for the arithmetic constraints. – kaya3 Apr 05 '22 at 10:39
  • @kaya3 I don't know. I tried a version `z3.And(myVars[i] >=0, myVars[i] <=1)` but it made no difference. – Simd Apr 05 '22 at 10:44
  • 1
    Just a comment, roughly speaking, z3 and CP-SAT uses the same basic technology (called LCG in the Constraint Programming community). Now, CP-SAT also works as a MIP solver, and embeds a simplex inside. Maybe this helps. – Laurent Perron Apr 05 '22 at 10:48
  • I disagree with the faster than or-tools. You can say more energy efficient than or-tools, but faster, no. Parallel SAT search is an open problem. So parallel SMT is an open problem. – Laurent Perron Apr 06 '22 at 13:47
  • @LaurentPerron That's a fair point. Also Z3 is sadly much slower again if you just increase 150 to 200. – Simd Apr 06 '22 at 13:58

2 Answers


Replacing the Int variables by Bool variables reduced the z3py runtime by 28% on my machine:

import z3
from time import time
from tqdm import tqdm

z3.set_option(verbose=1)
myVars = []
for i in range(num_variables):
    myVars += [z3.Bool('v%d' % i)]   # Bool rather than Int

opt = z3.Optimize()

for i in tqdm(range(num_variables)):
    for j in range(i+1, num_variables):
        if not (i, j) in edges:
            opt.add(z3.Not(z3.And(myVars[i], myVars[j])))

t = time()
h = opt.maximize(z3.Sum([z3.If(myVars[i], 1, 0) for i in range(num_variables)]))
opt.check()
print(round(time()-t, 2))
Axel Kemper
  • This is a very nice improvement! Do you know how to make z3 multi-core? – Simd Apr 05 '22 at 12:22
  • Sadly, `set_option("parallel.enable", True)` does not make any difference here. – Axel Kemper Apr 05 '22 at 12:45
  • I noticed that too. I assume the solver hasn't been implemented to be multi-core. – Simd Apr 05 '22 at 12:46
  • also CP-SAT does not implement work sharing, so the notion of perfect parallelism is moot :-) – Laurent Perron Apr 05 '22 at 12:50
  • @LaurentPerron Sorry I don't quite get that. CP-SAT is certainly faster if you use more cores. Maybe I don't know what work sharing is? Do you mean I could have done the same experiment using CP-SAT instead and set `solver.parameters.num_search_workers = 1`? – Simd Apr 05 '22 at 12:53
  • @LaurentPerron Or did you mean that CP-SAT always has perfect parallelism? – Simd Apr 05 '22 at 13:08
  • 1
    No. It does not share work. It uses a portfolio of workers with different heuristics. – Laurent Perron Apr 05 '22 at 13:25
  • 1
    Cp-sat is not tuned to work with only 1 worker. There are usually huge gains using 8 workers, some more gain going to 16, and marginally better solutions, faster above, but no faster proof of optimality. – Laurent Perron Apr 05 '22 at 17:08

A custom tool like ortools is usually hard to beat: it understands only a fixed number of domains, and it takes advantage of parallel hardware. An SMT solver shines not at speed, but at what it allows: the combination of many theories (arithmetic, data-structures, booleans, reals, integers, floats, etc.). So it's not a fair comparison.

Having said that, we can make z3 go faster by using three ideas:

  • Use Bool instead of Int as Axel suggested. z3 is much better at dealing with booleans directly, instead of encoding them via integers. See this earlier answer for general advice on coding in z3.

  • z3's regular solver is much more adept at handling many constraints than the optimizer. While the optimizer definitely has its applications, avoid it if you can. If your problem admits an iterative approach (i.e., get a solution and keep improving it by adding new constraints), it's definitely worth trying. Not all optimization problems are amenable to this sort of iteration-based optimization, of course, but whenever it is applicable it can pay off. (One reason is that optimization is a much harder problem, and it can get the solver bogged down in irrelevant details. A second reason is that z3's optimizer hasn't received much attention in recent years compared to the solver engines, so it isn't keeping up with the improvements made to the mainline solver. At least that's my impression.)

  • Use pseudo-boolean constraints, which z3 handles much more efficiently via internal tactics. For pseudo-boolean constraints, see this answer.

Putting all these ideas together, here's how I'd code your problem in z3:

import igraph as ig
import random
from time import time
import z3

# First make a random graph and find the largest clique size in it
from tqdm import tqdm
random.seed(7)
num_variables = 150  # bigger values widen the running-time gap between z3 and ortools
print("Making graph")
g = ig.Graph.Erdos_Renyi(num_variables, 0.6)

# Make a set of edges. Maybe this isn't necessary
print("Making set of edges")
edges = set()
for edge in tqdm(g.es):
    edges.add((edge.source, edge.target))

myVars = []
for i in range(num_variables):
    myVars += [z3.Bool('v%d' % i)]

total = sum([z3.If(i, 1, 0) for i in myVars])

s = z3.Solver()
for i in tqdm(range(num_variables)):
    for j in range(i+1, num_variables):
        if not (i, j) in edges:
            s.add(z3.Not(z3.And(myVars[i], myVars[j])))

t = time()
curTotal = 0
while True:
    s.add(z3.PbGe([(x, 1) for x in myVars], curTotal+1))
    r = s.check()
    if r == z3.unsat:
        break
    curTotal = s.model().evaluate(total, model_completion=True).as_long()
    print(r, curTotal, round(time()-t, 2))

print("DONE total time =", round(time()-t,2))

Note the use of z3.PbGe which introduces the pseudo-boolean constraints. In my experiments this runs faster than the regular constraint version; and if you also assume 8-core full parallelization speed-up, I think it compares favorably to the ortools solution.

Note that the iterative loop above is coded in a linear fashion: we ask for a solution of size curTotal+1, increasing the last bound found by z3 by only one in each call to check. At the expense of some extra programming, you can also implement a binary-search-like routine, i.e., ask for double the last result, and if unsat scale back to find the optimum, using push/pop to manage the assertion stack; a sketch follows below. This trick can perform even better when amortized over many runs.
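
Here's a minimal sketch of that binary-search idea, reusing the s and myVars from above. The max_clique_size helper is illustrative: an exponential doubling phase followed by bisection, with push/pop keeping the assertion stack clean between probes.

def max_clique_size(s, myVars):
    # lo is a clique size known to be achievable; hi is an upper bound.
    lo, hi = 0, len(myVars)

    def feasible(k):
        # Probe for a clique of size at least k without polluting the
        # assertion stack: push, check, pop.
        s.push()
        s.add(z3.PbGe([(x, 1) for x in myVars], k))
        r = s.check()
        s.pop()
        return r == z3.sat

    # Exponential phase: keep doubling k until it becomes infeasible.
    k = 1
    while k <= hi and feasible(k):
        lo = k
        k *= 2
    hi = min(hi, k - 1)

    # Binary phase: bisect between the last sat size and the upper bound.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if feasible(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo

print("max clique size:", max_clique_size(s, myVars))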

Please report the results of your own experiments!

alias
  • Thank you. The 2.5 factor slowdown already takes into account parallelization. Maybe the z3 optimizer can be made to output intermediate solutions to avoid having to search explicitly? – Simd Apr 05 '22 at 16:02
  • 1
    I don't think z3's optimizer generates any sort of "intermediate" result that you can rely on. For some variables, it keeps bound information, but until a `sat/unsat` decision is obtained, they might be unreliable. See the discussion here: https://stackoverflow.com/q/55363925/936310 – alias Apr 05 '22 at 16:35
  • Maybe a good question for their github. – Simd Apr 05 '22 at 16:37
  • @graffe It appears using pseudo-boolean constraints make z3 go much faster. See my updated answer. I'd love to see if you experiment with it and find out what sort of timing results you get compared to ortools. (After factoring for parallelization of course.) – alias Apr 06 '22 at 04:06
  • 1
    30 seconds! That's pretty amazing – Simd Apr 06 '22 at 12:49
  • Sadly if you increase the 150 to 200 z3 grinds to a halt again where ortools find the optimum answer (14) very quickly. Do you see the same thing? – Simd Apr 06 '22 at 13:10
  • I'm not surprised. I've put in a little change which, instead of incrementing by one, tries to go faster by finding a "better" solution from the last time. Alas, in my small test, it also walked the space one by one. I think experimenting with a binary-search-like idea might pay off (i.e., ask for a solution twice as good, then keep doubling till you get `unsat`, then scale back to find the optimum). But with very large graphs, you'll have to concede to custom tools like ortools to come out ahead. I doubt there's much you can do to scale any further out of the box. – alias Apr 06 '22 at 13:19
  • The alternative would be to write a "custom" tactic in z3 to handle these sorts of problems. But finding a maximal clique is an NP-complete problem, and in general you can't expect to beat a custom solver with a generic SMT solver for such a problem. (i.e., at best you'd duplicate the algorithms in ortools in z3, but what's the point of that?) – alias Apr 06 '22 at 13:21
  • 1
    I disagree with the notion of custom solver vs generic solver. The APIs for the z3 model and the or-tools models are very similar. So the difference is just the time doing engineering work on the backend. – Laurent Perron Apr 06 '22 at 13:50
  • 1
    @Laurent Perron I'm sure z3 folks would appreciate contributions to help it handle problems of this sort faster! It's a finite-human-resource issue though; not sure if there's enough ROI to invest in z3 to do these if there are better tools out there to solve them more efficiently. – alias Apr 06 '22 at 13:52
  • Curiously your new code doesn't skip any values at all. https://bpa.st/JW7A – Simd Apr 06 '22 at 15:21
  • yeah I noticed that too. you really want to implement the "binary search" idea to take advantage of skipping values. – alias Apr 06 '22 at 15:56
  • I am not sure it is worth it. The sum of the times for all but the optimal is much less than the time just for the optimal value. Interestingly your previous code before the PbGe improvement did skip values. – Simd Apr 06 '22 at 16:18
  • 1
    Probably not worth it indeed. If you can start from a more reasonable value (instead of 0), that can probably help. This might be possible depending on the problem; perhaps you've some expectation of what the value should be due to prior knowledge of the shape of the graph? But of course, these are tangential aspects of the problem; not directly related to speeding z3 up. – alias Apr 06 '22 at 16:49