9

I have got two strings in Python,

A m * B s / (A m + C m)

and

C m * B s / (C m + A m)

that are both equivalent functions of the unordered set (A, C) and the unordered set (B). m and s indicate units that can be swapped among the same but not with another unit.

So far, I'm doing permutations of A, B, and C and testing them using eval and SymPy's == operator. This has multiple drawbacks:

  • for more complicated expressions, I have to generate a large number of permutations (in my case 8 nested for loops)
  • I need to define A, B, C as symbols, which is not optimal when I don't know which parameters I will have (so I have to generate all of them -> terribly inefficient and messing up my variable namespace)

Is there a pythonian way to test for this kind of equivalence? It should work an arbitrary expressions.

D A
  • 3,130
  • 4
  • 25
  • 41
Michael Schubert
  • 2,726
  • 4
  • 27
  • 49
  • 3
    "Pythonians" are from the land of Python. We refer to our coding idioms as "Pythonic". – AJ. May 27 '11 at 16:12
  • I don't know SymPy, but couldn't you generate your permutations [without for loops](http://docs.python.org/library/itertools.html#itertools.permutations)? – senderle May 27 '11 at 16:18
  • Do you have any other examples of the strings in question? I'd lean toward suggesting a regex that would match them - I'll add an answer for the two demonstrated. – g.d.d.c May 27 '11 at 16:20
  • I don't really understand your question(probably about my bad English :) ), but there is a nice function named permutations() in itertools module. – utdemir May 27 '11 at 16:24
  • I am using itertools permutations, but I have to loop through all of the ones I generate. – Michael Schubert May 27 '11 at 16:27
  • AJ: I would refer to the comment as pedantIC. no offense :) – Michael Schubert May 27 '11 at 16:31
  • @user538603, I played around with a canonical mathematical form and I think I have something that's not half bad. If you post some of the test cases you are running this against (especially complicated ones) I'll try to verify it against them and then post the more complete solution. Not sure why, but this problem intrigues me. – Amos Joshua May 29 '11 at 20:27
  • The underlying problem is matching enzymatic rate laws in SBML-encoded models to ones defined in SBO. For some examples, see http://www.ebi.ac.uk/sbo/main/tree.do?open=269&nodeId=1#SBO_269_0 - but I went a bit of a different way with my own solution (not ready yet). – Michael Schubert May 29 '11 at 23:44

5 Answers5

3

Instead of iterating over all possible permutations, assume one exists and attempt to construct it. I believe that done in the right way, failure of the algorithm would imply inexistence of the permutation.

Here is the outline of the idea applied to the expressions above:

let:

str1 = "A m * B s / (A m + C m)"
str2 = "C m * B s / (C m + A m)"

We're looking for a permutation of the set (A, C) that would render the expressions identical. Relabel A and C as X1 and X2 according to the order of their first appearance in str2, so:

X1 = C
X2 = A

because C appears before A in str2. Next, create the array Y such that y[i] is the ith symbol A or C in order of first appearance in str1. So:

Y[1] = A
Y[2] = C

Because A appears before C in str1.

Now construct str3 from str2 by replacing A and C with X1 and X2:

str3 = "X1 m * B s / (X1 m + X2 m)"

And then start substituting Xi for Y[i]. First, X1 becomes Y[1]=A:

str3_1 = "A m * Bs / (A m + X2 m)"

At this stage, compare str3_1 and str1 up to the first occurrence of any of the Xi's, in this case X2, so because these two strings are equal:

str3_1[:18] = "A m * B s / (A m + " 
str1[:18] = "A m * B s / (A m + "

You have a chance of constructing the permutation. If they were unequal, you'd have proven no suitable permutation exists (because any permutation would have had to make at least that substitution) and could deduce inequivalence. But they are equal, so you proceed to the next step, substituting X2 for Y[2]=C:

str3_2 = "A m * B s / (A m + C m)"

And this is equal to str1, so you have your permutation (A->C, C->A) and have shown the equivalence of the expressions.

This is only a demonstration of the algorithm to a particular case, but it should generalize. Not sure what the lowest order you could get it down to is, but it should be quicker than the n! of generating all permutations on n variables.

If I understand the significance of the units correctly, they limit which variables may be swapped for which others by the permutations. So that A can be substituted with C in the above expressions because both have 'm' units, but not with B which has 's' units. You can handle this in the following way:

construct expressions str1_m and str2_m from str1 and str2 by removing all symbols that don't have m units, and then carry out the above algorithm for str1_m and str2_m. If construction fails, no permutation exists. If construction succeeds, keep that permutation (call it the m-permutation) and construct str1_s and str2_s from str1 and str2 by removing all symbols that don't have s units, then carry out the algorithm again for str1_s and str2_s. If construction fails, they are not equivalent. If it succeeds, the final permutation will be a combination of the m-permutation and the s-permutation (although you probably don't even need to construct it, you just care that it exists).

Amos Joshua
  • 1,601
  • 18
  • 25
  • Interesting approach. I'll read through it more thoroughly tomorrow. – Michael Schubert May 27 '11 at 23:02
  • Ok, the main issue I see is that it tests for identical expressions, not mathematically equivalent ones. It won't find 2*X = X+X for instance. – Michael Schubert May 28 '11 at 09:28
  • Is there some canonical form to algebraic expressions that is afforded by sympy? For instance, you could run things through sympy.together, and then try the above algorithm on the resulting string. I put together some code that implements a simplified version of the above algorithm, I'll post it in a bit. – Amos Joshua May 28 '11 at 10:00
3

Here is a simplified approach based on my previous answer.

The idea is that if two expressions are equivalent under permutations, the permutation carrying one to the other must map the ith symbol in the first string (ordered by index of first occurrence) to the ith symbol in the second string (again ordered by index of first occurrence). This principle can be used to construct a permutation, apply it to the first string and then check for equality with the second string - if they are equal they are equivalent, otherwise they are not.

Here is one possible implementation:

import re

# Unique-ify list, preserving order
def uniquify(l):
    return reduce(lambda s, e: s + ([] if e in s else [e]), l, [])

# Replace all keys in replacements with corresponding values in str
def replace_all(str, replacements):
    for old, new in replacements.iteritems():
        str = str.replace(old, new)
    return str

class Expression:
    units = ["m", "s"]

    def __init__(self, exp):
        self.exp = exp

    # Returns a list of symbols in the expression that are preceded
    # by the given unit, ordered by first appearance. Assumes the
    # symbol and unit are separated by a space. For example:
    # Expression("A m * B s / (A m + C m)").symbols_for_unit("m")
    # returns ['A', 'C']
    def symbols_for_unit(self, unit):
        sym_re = re.compile("(.) %s" % unit)
        symbols = sym_re.findall(self.exp)
        return uniquify(symbols)

    # Returns a string with all symbols that have units other than
    # unit "muted", that is replaced with the empty string. Example:
    # Expression("A m * B s / (A m + C m)").mute_symbols_for_other_units("m")
    # returns "A m *  s / (A m + C m)"
    def mute_symbols_for_other_units(self, unit):
        other_units = "".join(set(self.units) - set(unit))
        return re.sub("(.) ([%s])" % "".join(other_units), " \g<2>", self.exp)

    # Returns a string with all symbols that have the given unit
    # replaced with tokens of the form $0, $1, ..., by order of their
    # first appearance in the string, and all other symbols muted. 
    # For example:
    # Expression("A m * B s / (A m + C m)").canonical_form("m")
    # returns "$0 m *  s / ($0 m + $1 m)"
    def canonical_form(self, unit):
        symbols = self.symbols_for_unit(unit)
        muted_self = self.mute_symbols_for_other_units(unit)
        for i, sym in enumerate(symbols):
            muted_self = muted_self.replace("%s %s" % (sym, unit), "$%s %s" % (i, unit))
        return muted_self

    # Define a permutation, represented as a dictionary, according to
    # the following rule: replace $i with the ith distinct symbol
    # occurring in the expression with the given unit. For example:
    # Expression("C m * B s / (C m + A m)").permutation("m")
    # returns {'$0':'C', '$1':'A'}
    def permutation(self, unit):
        enum = enumerate(self.symbols_for_unit(unit))
        return dict(("$%s" % i, sym) for i, sym in enum)

    # Return a string produced from the expression by first converting it
    # into canonical form, and then performing the replacements defined
    # by the given permutation. For example:
    # Expression("A m * B s / (A m + C m)").permute("m", {"$0":"C", "$1":"A"})
    # returns "C m *  s / (C m + A m)"
    def permute(self, unit, permutation):
        new_exp = self.canonical_form(unit)
        return replace_all(new_exp, permutation) 

    # Test for equality under permutation and muting of all other symbols 
    # than the unit provided. 
    def eq_under_permutation(self, unit, other_exp):
        muted_self = self.mute_symbols_for_other_units(unit)        
        other_permuted_str = other_exp.permute(unit, self.permutation(unit))
        return muted_self == other_permuted_str    

    # Test for equality under permutation. This is done for each of
    # the possible units using eq_under_permutation
    def __eq__(self, other):
        return all([self.eq_under_permutation(unit, other) for unit in self.units])

e1 = Expression("A m * B s / (A m + C m)")
e2 = Expression("C m * B s / (C m + A m)")
e3 = Expression("A s * B s / (A m + C m)")

f1 = Expression("A s * (B s + D s) / (A m + C m)")
f2 = Expression("A s * (D s + B s) / (C m + A m)")
f3 = Expression("D s")

print "e1 == e2: ", e1 == e2 # True
print "e1 == e3: ", e1 == e3 # False
print "e2 == e3: ", e2 == e3 # False

print "f1 == f2: ", f1 == f2 # True
print "f1 == f3: ", f1 == f3 # False

As you pointed out, this checks for string equivalence under permutations without any regard to mathematical equivalence, but it is half the battle. If you had a canonical form for mathematical expressions, you could use this approach on two expressions in canonical form. Perhaps one of sympy's Simplify could do the trick.

Amos Joshua
  • 1,601
  • 18
  • 25
  • This is a well thought out answer with great ideas - I just feel the need to comment that it is NOT a robust solution which works for ALL arbitrary sympy expressions. – D A Oct 26 '15 at 21:10
2

If you pass a string to SymPy's sympify() function, it will automatically create the Symbols for you (no need to define them all).

>>> from sympy import *
>>> x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>> sympify("x**2 + cos(x)")
x**2 + cos(x)
>>> sympify("diff(x**2 + cos(x), x)")
2*x - sin(x)
asmeurer
  • 86,894
  • 26
  • 169
  • 240
1

I did it once, in one simulater of mathemathics estudies.. Well, in my case, i knew what were the variables that will be used.

So, i tested the result putting values inside the vars.

A = 10
B = 20
C = 30
m = Math.e
s = Math.pi

And so, we solve:

s1 = 'A m * B s / (A m + C m)'
s2 = 'C m * B s / (C m + A m)'

If s1 != s2, was proved there isn't equivalence

With this method is impossible say that two expressions are equivalent, But you can say that both isn't equivalent

if s1 != s2:
   print "Not equivalent"
else:
   print "Try with another sample"

Well.. I hope that this can help you.

JonatasTeixeira
  • 1,474
  • 1
  • 18
  • 24
  • 1
    I thought about that as well but I'm testing a large number of expressions and can't tolerate errors :-/ – Michael Schubert May 27 '11 at 17:49
  • Also, different multiplication order can give different result with floats. – Rosh Oxymoron May 28 '11 at 10:35
  • Rosh Oxymoron, I'm not sure if it happens in python, cause speaking about the hight level language I guess this kind of error isn't problem in python.. I know about this in C, Pascal, Delphi and until java, but i'm not see errors like that in python.. – JonatasTeixeira May 28 '11 at 15:59
  • 1
    @JonatasTeixeira Order of operations is always a problem with *floating point* values, regardless of the language, because they are imprecise. If your language provides or you are using a library to provide infinite precision (such as a symbolic representation, or a rational definition and you have no irrational values) then order wouldn't matter. With *floating* point, though, propagation of errors will haunt you. – Nathan Feb 11 '14 at 21:14
0

This, like all other answers to date is not a robust solution to the problem, but instead contains more helpful information for our future meticulous friend to solve it.

I provide a difficult example using Euler's Formula https://en.wikipedia.org/wiki/Euler%27s_formula

I am certain all other overflow answers to date will not succeed in my example.

I show that all the suggestions on sympy's website also fail on my example. (https://github.com/sympy/sympy/wiki/Faq)

#SOURCE FOR HELPERS:    https://github.com/sympy/sympy/wiki/Faq 
import sympy
import sympy.parsing.sympy_parser
ExampleExpressionString1 = 'exp( i*( (x0 - 1)*(x0 + 2) ) )'
ExampleExpressionSympy1 = sympy.parsing.sympy_parser.parse_expr(ExampleExpressionString1)

ExampleExpressionString2 = 'i*sin( (x0 - 1)*(x0 + 2) ) + cos( (x0 - 1)*(x0 + 2) )'
ExampleExpressionSympy2 = sympy.parsing.sympy_parser.parse_expr(ExampleExpressionString2)

print '(ExampleExpressionSympy1 == ExampleExpressionSympy2):'
print ' ', (ExampleExpressionSympy1 == ExampleExpressionSympy2)

print '(ExampleExpressionSympy1.simplify() == ExampleExpressionSympy2.simplify()):'
print ' ', (ExampleExpressionSympy1.simplify() == ExampleExpressionSympy2.simplify())

print '(ExampleExpressionSympy1.expand() == ExampleExpressionSympy2.expand()):'
print ' ', (ExampleExpressionSympy1.trigsimp() == ExampleExpressionSympy2.trigsimp())

print '(ExampleExpressionSympy1.trigsimp() == ExampleExpressionSympy2.trigsimp()):'
print ' ', (ExampleExpressionSympy1.trigsimp() == ExampleExpressionSympy2.trigsimp())

print '(ExampleExpressionSympy1.simplify().expand().trigsimp() == ExampleExpressionSympy2.simplify().expand().trigsimp()):'
print ' ', (ExampleExpressionSympy1.simplify().expand().trigsimp() == ExampleExpressionSympy2.simplify().expand().trigsimp())

MORE NOTES:

I suspect this is a difficult problem to solve generically, and robustly. To properly check mathematical equivalence, you not only have to try order permutations, but you also have to have a library of mathematical equivalent transformations and try all those permutations as well.

I do however believe this might be a solvable problem, because Wolfram Alpha seems to have 'alternate expression' section, which seems to do the trick of providing all permutations most of the time on arbitrary expressions using these kinds of equivalences.

IN SUMMATION:

I suggest the following with the expectation that it will break:

import sympy
import sympy.parsing.sympy_parser
Expression.simplify().expand().trigsimp()
D A
  • 3,130
  • 4
  • 25
  • 41