0

Following to the question published in How expressive can we be with arrays in Z3(Py)? An example, I expressed the following formula in Z3Py:

Exists i::Integer s.t. (0<=i<|arr|) & (avg(arr)+t<arr[i])

This means: whether there is a position i::0<i<|arr| in the array whose value a[i] is greater than the average of the array avg(arr) plus a given threshold t.

The solution in Z3Py:

t = Int('t')
avg_arr = Int('avg_arr')
len_arr = Int('len_arr')

arr = Array('arr', IntSort(), IntSort())

phi_1 = And(0 <= i, i< len_arr)
phi_2 = (t+avg_arr<arr[i])

phi = Exists(i, And(phi_1, phi_2))

s = Solver()
s.add(phi)
print(s.check())
print(s.model())

Note that, (1) the formula is satisfiable and (2) each time I execute it, I get a different model. For instance, I just got: [avg_a = 0, t = 7718, len_arr = 1, arr = K(Int, 7719)].

I have three questions now:

  1. What does arr = K(Int, 7719)] mean? Does this mean the array contains one Int element with value 7719? In that case, what does the K mean?
  2. Of course, this implementation is wrong in the sense that the average and length values are independent from the array itself. How can I implement simple avg and len functions?
  3. Where is the i index in the model given by the solver?

Also, in which sense would this implementation be different using sequences instead of arrays?

Theo Deep
  • 666
  • 4
  • 15
  • 1
    Two immediate issues: (1) `phi_0` is defined but not used. Intentional? (2) there's reference to `avg_a` but it isn't declared. Is that supposed to be `avg_arr`? Without a clear question, it's hard to give a clear answer! – alias Sep 20 '22 at 16:52
  • 1
    Also note that since you used `Exists(i, ...)` the top-level declaration for `i` is essentially unused. (It's a simple place-holder for the existential. So you don't get a model value for it.) – alias Sep 20 '22 at 16:54
  • You are totally right! (1) I forgot to add `phi_0` to the `s.add`, sorry. I will erase it so that the given model stays the same. (2) `avg_arr` is just a renaming typo, yep. As for the `i`, I thought that it was necessary to declare as part of the whole Integers, I see my fault! Corrected! – Theo Deep Sep 20 '22 at 18:03

1 Answers1

1

(1) arr = K(Int, 7719) means that it's a constant array. That is, at every location it has the value 7719. Note that this is truly "at every location," i.e., at every integer value. There's no "size" of the array in SMTLib parlance. For that, use sequences.

(2) Indeed, your average/length etc are not related at all to the array. There are ways of modeling this using quantifiers, but I'd recommend staying away from that. They are brittle, hard to code and maintain, and furthermore any interesting theorem you want to prove will get an unknown as answer.

(3) The i you declared and the i you used as the existential is completely independent of each other. (Latter is just a trick so z3 can recognize it as a value.) But I guess you removed that now.

The proper way to model such problems is using sequences. (Although, you shouldn't expect much proof performance there either.) Start here: https://microsoft.github.io/z3guide/docs/theories/Sequences/ and see how much you can push it through. Functions like avg will need a recursive definition most likely, for that you can use RecAddDefinition, for an example see: https://stackoverflow.com/a/68457868/936310

Stack-overflow works the best when you try to code these yourself and ask very specific questions about how to proceed, as opposed to overarching questions. (But you already knew that!) Best of luck..

alias
  • 28,120
  • 2
  • 23
  • 40
  • Understood! I will try this same with sequences and disturb here again some day :) There is only one thing I do not understand, the `i` issue: even if I do not declare it (i.e., it is only in the existential), should not I receive the value of the `i` as part of the model? – Theo Deep Sep 20 '22 at 19:02
  • Also, when you say I should not expect much proof performance with sequences, do you mean that, when I start coding interesting properties (not necessarily the one in the example) I will often get `unknown`? How do you know that, is there any theoretical result on "sequence decidability"? – Theo Deep Sep 20 '22 at 19:05
  • 1
    Regarding `i`: You didn't get a model for it because there were no constraints for it. Note that your `phi_1` is under the quantifier, so the top-level `i` had no constraints for it. z3 typically leaves such variables underspecified. You can ask it to assign a value with the `model_completion` parameter. – alias Sep 20 '22 at 19:56
  • 1
    Performance: Yes; you'll get `unknown` quite likely, or the solver will loop forever. Again, this follows from the fact that these are recursive programs and proofs require induction. which SMT solvers aren't particularly good at. (At least not for the time being.) I'm not aware of any detailed decidability results for sequences; that's a PhD thesis waiting to be written. – alias Sep 20 '22 at 19:58
  • 1
    Here's a good paper to read, though it's 10 years old now and the way SMTLib adopted strings and regexp's is slightly different from this paper: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/nbjorner-smtlibstr.pdf – alias Sep 20 '22 at 20:00
  • Perfect. Thanks again! Need to understand better the SMT vs induction problem. For the moment, I just discovered SMT solvers are not particularly effective neither when they encounter a lot of quantifier alternations... – Theo Deep Sep 20 '22 at 20:17
  • 1
    Here's a good paper to read regarding induction and SMT solvers: http://homepage.divms.uiowa.edu/~ajreynol/vmcai15.pdf – alias Sep 20 '22 at 20:24
  • 1
    I found the following paper on decidability of sequences (ATVA, 2010): https://bugcounting.net/pubs/atva10.pdf I take the opportunity to ask: since arrays are some kind of directed graphs, are there any example of undirected graph datatypes within the SMT-Lib world? – Theo Deep Sep 22 '22 at 12:29
  • 1
    Good find! Note that z3 sequences can be over arbitrary types (you can even have sequences of sequences), while the paper seems to focus on sequences over integers only. I'd still expect the results to more or less hold for sequences in general. – alias Sep 22 '22 at 14:27
  • 1
    Undirected graphs: Nothing out of the box that I know of. But SMTLib allows modeling arbitrary recursive data-types so in theory you can model graphs yourself using that machinery. So, the answer is "yes" so far as they can be modeled as a regular data-type. A custom solver engine can of course be more effective, though I'm not aware of anyone developing one. The closest I can think of are CHCs (Constrained Horn Clauses), which can be interpreted as a special kind of a graph. – alias Sep 22 '22 at 14:29
  • Thanks again. I see you point on sequences. As for graphs, I see I can define them in Z3; however, it will be difficult to prove lemmas about them, since those proofs would involve inductivity, am I right? – Theo Deep Sep 22 '22 at 15:04
  • 1
    Most likely. But you never know unless you try. – alias Sep 22 '22 at 15:05