
Ord() Function Or Ascii Character Code Of String With Z3 Solver

How can I convert a z3.String to a sequence of ASCII values? For example, here is some code that I thought would check whether the ASCII values of all the characters in the string

Solution 1:

2022 Update

The answer below, written in 2018, no longer applies: strings in SMTLib have since received a major update, so the code given is outdated. I'm keeping it here for archival purposes, and in case you happen to have a really old z3 that you cannot upgrade for some reason. See the other answer for a variant that works with the new Unicode strings in SMTLib: https://stackoverflow.com/a/70689580/936310

Old Answer from 2018

You're conflating Python strings and Z3 Strings; and unfortunately the two are quite different types.

In Z3py, a String is simply a sequence of 8-bit values. What you can do with a Z3 String is actually quite limited; for instance, you cannot iterate over the characters like you did in your add_ascii_values function. See this page for what the allowed functions are: https://rise4fun.com/z3/tutorialcontent/sequences (This page lists the functions in SMTLib parlance, but the equivalent ones are available from the z3py interface.)

There are a few important restrictions/things that you need to keep in mind when working with Z3 sequences and strings:

  • You have to be very explicit about the lengths. In particular, you cannot sum over strings of arbitrary symbolic length. There are a few things you can do without specifying the length explicitly (regex matches, substring extraction, etc.), but these are limited.

  • You cannot extract a character out of a string. This is an oversight in my opinion, but SMTLib just has no way of doing so for the time being. Instead, you get a list of length 1. This causes a lot of headaches in programming, but there are workarounds. See below.

  • Anytime you loop over a string/sequence, you have to go up to a fixed bound. There are ways to program so you can cover "all strings up to length N" for some constant "N", but they do get hairy.

Keeping all this in mind, I'd go about coding your example like the following; restricting password to be precisely 10 characters long:

from z3 import *

s = Solver()

# Work around the fact that z3 has no way of giving us an element at an index. Sigh.
ordHelperCounter = 0

def OrdAt(inp, i):
    global ordHelperCounter
    v = BitVec("OrdAtHelper_%d_%d" % (i, ordHelperCounter), 8)
    ordHelperCounter += 1
    s.add(Unit(v) == SubString(inp, i, 1))
    return v

# Your original function, but note the addition of len parameter and use of Sum
def add_ascii_values(password, len):
    return Sum([OrdAt(password, i) for i in range(len)])

# We'll have to force a constant length
length = 10
password = String("password")
s.add(Length(password) == 10)
ascii_sum = add_ascii_values(password, length)
s.add(ascii_sum == 100)

# Also require characters to be printable so we can view them:
for i in range(length):
  v = OrdAt(password, i)
  s.add(v >= 0x20)
  s.add(v <= 0x7E)

print(s.check())
print(s.model()[password])

The OrdAt function works around the problem of not being able to extract characters. Also note how we use Sum instead of sum, and how all "loops" are of fixed iteration count. I also added constraints to make all the ascii codes printable for convenience.

When you run this, you get:

sat
":X|@`y}@@@"

Let's check it's indeed good:

>>> len(":X|@`y}@@@")
10
>>> sum(ord(character) for character in ":X|@`y}@@@")
868

So, we did get a length-10 string; but how come the ord values don't sum up to 100? Remember that sequences are composed of 8-bit values, and thus the arithmetic is done modulo 256. So, the sum actually is:

>>> sum(ord(character) for character in ":X|@`y}@@@") % 256
100
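The wraparound can be reproduced in plain Python without z3. The following is only an illustrative sketch: wrapping_sum is a hypothetical helper (it is not part of z3py) that mimics how a fixed-width bit-vector sum discards overflow bits.

```python
def wrapping_sum(values, bits=8):
    """Add values the way a fixed-width bit-vector would, wrapping on overflow.

    Hypothetical helper for illustration only; it is not a z3py function.
    """
    mask = (1 << bits) - 1
    total = 0
    for v in values:
        # Keep only the low `bits` bits after every addition.
        total = (total + v) & mask
    return total

candidate = ":X|@`y}@@@"          # the model printed by the solver above
codes = [ord(c) for c in candidate]

true_sum = sum(codes)                      # unbounded integer sum
bv8_sum = wrapping_sum(codes, bits=8)      # what an 8-bit sum sees
bv16_sum = wrapping_sum(codes, bits=16)    # wide enough to hold this sum

print(true_sum, bv8_sum, bv16_sum)
```

With 8 bits the result collapses to the value the constraint asked for, while a 16-bit width is already enough to hold the true sum of ten printable characters.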

To avoid the overflows, you can either use larger bit-vectors, or more simply use Z3's unbounded Integer type Int. To do so, use the BV2Int function, by simply changing add_ascii_values to:

def add_ascii_values(password, len):
    return Sum([BV2Int(OrdAt(password, i)) for i in range(len)])

Now we'd get:

unsat

That's because each of our characters has at least value 0x20 and we wanted 10 characters; so there's no way to make them all sum up to 100. And z3 is precisely telling us that. If you increase your sum goal to something more reasonable, you'd start getting proper values.
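The unsat result can be sanity-checked with simple interval arithmetic in plain Python. This is a minimal sketch, assuming every character is constrained to the printable range 0x20..0x7E as above; sum_bounds and feasible_lengths are hypothetical helper names introduced only for this illustration.

```python
LOW, HIGH = 0x20, 0x7E  # printable ASCII range used in the constraints

def sum_bounds(length, low=LOW, high=HIGH):
    """Smallest and largest possible sum of `length` codes in [low, high]."""
    return length * low, length * high

def feasible_lengths(target, max_length=20):
    """Lengths (up to an arbitrary illustrative cap) whose sum range contains target."""
    result = []
    for n in range(1, max_length + 1):
        lo, hi = sum_bounds(n)
        if lo <= target <= hi:
            result.append(n)
    return result

lo10, hi10 = sum_bounds(10)
target_reachable = lo10 <= 100 <= hi10

print(lo10, hi10, target_reachable)
print(feasible_lengths(100))
```

Because the allowed codes form a contiguous range, every integer between the two bounds is reachable, so checking the bounds is enough to decide feasibility for a given length.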

Programming with z3py is different than regular programming with Python, and z3 String objects are quite different than those of Python itself. Note that the sequence/string logic isn't even standardized yet by the SMTLib folks, so things can change. (In particular, I'm hoping they'll add functionality for extracting elements at an index!).

Having said all this, going over https://rise4fun.com/z3/tutorialcontent/sequences would be a good start to get familiar with them, and feel free to ask further questions.

Solution 2:

The accepted answer dates back to 2018, and things have changed in the meantime, so the proposed solution no longer works with current z3. In particular:

  • Strings are now formalized by SMTLib. (See https://smtlib.cs.uiowa.edu/theories-UnicodeStrings.shtml)
  • Unlike the previous version (where strings were simply sequences of bit-vectors), strings are now sequences of Unicode characters. So, the coding used in the previous answer no longer applies.

Based on this, the following would be how this problem would be coded, assuming a password of length 3:

from z3 import *

s = Solver()

# Ord of character at position i
def OrdAt(inp, i):
    return StrToCode(SubString(inp, i, 1))

# Adding ascii values for a string of a given length
def add_ascii_values(password, len):
    return Sum([OrdAt(password, i) for i in range(len)])

# We'll have to force a constant length
length = 3
password = String("password")
s.add(Length(password) == length)
ascii_sum = add_ascii_values(password, length)
s.add(ascii_sum == 100)

# Also require characters to be printable so we can view them:
for i in range(length):
  v = OrdAt(password, i)
  s.add(v >= 0x20)
  s.add(v <= 0x7E)

print(s.check())
print(s.model()[password])

Note: Due to https://github.com/Z3Prover/z3/issues/5773, to be able to run the above, you need a version of z3 that you downloaded on Jan 12, 2022 or afterwards! As of this date, none of the released versions of z3 contain the functions used in this answer.

When run, the above prints:

sat
" #!"

You can check that it satisfies the given constraint, i.e., the ord of characters add up to 100:

>>> sum(ord(c) for c in " #!")
100

Note that we no longer have to worry about modular arithmetic, since OrdAt returns an actual integer, not a bit-vector.
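The model z3 returns is just one of several valid answers. As a cross-check that does not need z3 at all, the sketch below brute-forces every printable 3-character string in plain Python and keeps those whose codes sum to 100. The function name printable_strings_with_sum is hypothetical and used only for this illustration; this approach is only practical because the length is tiny.

```python
from itertools import product

LOW, HIGH = 0x20, 0x7E  # same printable range as the solver constraints

def printable_strings_with_sum(length, target):
    """Enumerate printable-ASCII strings of `length` whose codes sum to `target`.

    Brute force: the search space grows as 95**length, so keep length small.
    """
    codes = range(LOW, HIGH + 1)
    return [
        "".join(chr(c) for c in combo)
        for combo in product(codes, repeat=length)
        if sum(combo) == target
    ]

solutions = printable_strings_with_sum(3, 100)

print(len(solutions))
print(" #!" in solutions)
```

Any string in solutions would have been an equally acceptable model; which one the solver picks is up to its internal search.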
