-2

Is there a way to evaluate expressions from strings that include human readable number units?

For example:

myformula='1u+1e-6'
result = eval(myformula)

... should be equivalent to 1e-6+1e-6 (where u=micro).

Zero Piraeus
  • 2
    How is `1u` more readable than `1e-6`? Which humans are you referring to? – Joran Beasley Feb 06 '15 at 22:27
  • 2
    Probably a nice learning exercise to create this kind of parser on your own. (surely not required in real life!) – Dr. Jan-Philip Gehrcke Feb 06 '15 at 22:27
  • @JoranBeasley: From the choice of notation, it sounds like that would be scientists and engineers, who are very much human, as are we. ;) – jrd1 Feb 07 '15 at 00:56
  • And they don't understand `1e-6`? Or that is somehow more confusing than `1u`? – Joran Beasley Feb 07 '15 at 00:57
  • @JoranBeasley: yes, of course they do - I'm not arguing that. To me, it's clear that it's lazier and easier to write something in a familiar notation, as opposed to having to type more. And, in the context of programming about being productively lazy, to me his need for that makes sense. – jrd1 Feb 07 '15 at 01:00
  • 1
    speaking as a scientist, I would never just give a 'micro' or 'nano' or whatever unless the units were attached - 'microliter', 'kilogram', etc. I've never seen anyone just using the prefix... – MattDMo Feb 07 '15 at 01:30
  • 1
    Why 1u and not 1µ? :) – PM 2Ring Feb 07 '15 at 05:13
  • Well, it's a preference in certain tools. In this case I'm talking about SPICE circuit simulation; in that world people just prefer the symbolic prefixes over scientific notation. – sam.freeman Feb 07 '15 at 08:29

3 Answers

6

This answer expands somewhat on Joran's to replace SI prefixes (from tera down to pico) with the appropriate exponents:

import re

SI = {
    "T": 12,
    "G": 9,
    "M": 6,
    "k": 3,
    "h": 2,
    "da": 1,
    "d": -1,
    "c": -2,
    "m": -3,
    "u": -6,
    "n": -9,
    "p": -12,
}

SI_REGEX = re.compile(r"(?<=\d)(%s)\b" % "|".join(SI))

def repl_si(match):
    return "e%d" % SI[match.group()]

def defix(formula):
    return re.sub(SI_REGEX, repl_si, formula)

Using the dictionary SI, we create a regular expression that will match any of the keys in SI as long as they're preceded by a digit and followed by a word boundary:

(?<=\d)(T|G|M|k|h|da|d|c|m|u|n|p)\b
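To see what the lookbehind and the word boundary each contribute, here is a small sketch that runs the expanded pattern against a made-up test string. The name pattern and the sample text are just for illustration, not part of the answer's code.

```python
import re

# Same pattern as the one built from the SI dictionary, written out literally.
pattern = re.compile(r"(?<=\d)(T|G|M|k|h|da|d|c|m|u|n|p)\b")

# "3m" and "2da" qualify: the prefix follows a digit and ends at a word boundary.
# "mass" is skipped (no digit before the m) and so is "4max" (no boundary after the m).
sample = "3m + mass + 2da + 4max"
matches = pattern.findall(sample)
print(matches)  # ['m', 'da']
```

Note that "da" is listed before "d" in the alternation, so the two-letter prefix is tried first.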

Next, we define a substitution function repl_si() that looks up the match in SI and replaces it with "e" concatenated with the exponent.

Then, all we have to do is write a function that calls re.sub() appropriately with the regex, substitution function and formula, and voilà:

>>> defix("1T + 2G + 3M + 4k + 5h + 6da + 7d + 8c + 9m + 1u + 2n + 3p")
'1e12 + 2e9 + 3e6 + 4e3 + 5e2 + 6e1 + 7e-1 + 8e-2 + 9e-3 + 1e-6 + 2e-9 + 3e-12'

Now all you need to do is call eval() on the result, which of course you should absolutely never do with user-supplied input.
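If the formula might come from an untrusted source, one alternative to eval() is to walk the parsed expression yourself and allow only plain arithmetic. The following is a minimal sketch, not a hardened implementation: safe_eval is a hypothetical helper name, and it assumes Python 3.8+ (for ast.Constant) and that only +, -, * and / on numbers are needed.

```python
import ast
import operator

# Whitelisted operators; anything else (calls, names, attributes, **) is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression):
    """Evaluate a purely arithmetic expression (hypothetical helper)."""

    def _eval(node):
        if isinstance(node, ast.Constant):
            # Accept only int/float literals (bool is a subclass of int, so exclude it).
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
        elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression element: %s" % type(node).__name__)

    return _eval(ast.parse(expression, mode="eval").body)


result = safe_eval("1e-6 + 1e-6")
```

You would pass it the string returned by defix() instead of calling eval() on it. Anything that isn't a number or one of the four operators raises ValueError.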

Zero Piraeus
  • Thanks. How can I force the conversion only when the prefix directly follows a digit [0-9]? I suspect the change belongs in the SI_REGEX expression. – sam.freeman Feb 07 '15 at 08:29
  • Here's a safe approach to processing user input math expressions in python. http://stackoverflow.com/questions/26505420/evaluate-math-equations-from-unsafe-user-input-in-python – Håken Lid Feb 07 '15 at 09:02
  • @sam.freeman: Yes, that is possible, using a lookbehind assertion in the regular expression. Something like this should work: `SI_REGEX = re.compile(r"(?<=\d)(%s)" % "|".join(SI))` – Håken Lid Feb 07 '15 at 09:11
  • @sam.freeman I've incorporated Håken's suggestion so that substitutions are only made when the affix is preceded by a digit. – Zero Piraeus Feb 07 '15 at 14:38
  • You can also use a word boundary `\b` after the group to make the expression even stricter. Then it will work with a regular dictionary as well. `r"(?<=\d)(%s)\b" % "|".join(SI)` – Håken Lid Feb 07 '15 at 20:46
  • @HåkenLid Hahaha yes, good point. Edited; thanks again! – Zero Piraeus Feb 07 '15 at 21:02
0
import re

myformula = '1u+1e-6'
result = eval(re.sub(r"(\d+)u", r"\1e-6", myformula))

This should work: it replaces any digits followed immediately by u with the same digits followed by e-6 before evaluating.

Joran Beasley
0

Adding an answer, mostly inspired by Joran's. It probably isn't as elegant as Zero's, but I think it does the job. One thing I made sure of is that the unit MUST be preceded by a numeric figure (\d+).

I handled the units by applying a series of substitutions, one per prefix:

import re

myformula = '1E+3P+0.5T-7G-6M+2.5k-1m+3.7u+4n+13p-59f-73a+0.5e-5'
tmp_exp = re.sub(r'(\d+)E', r'\1e18', myformula)
tmp_exp = re.sub(r'(\d+)P', r'\1e15', tmp_exp)
tmp_exp = re.sub(r'(\d+)T', r'\1e12', tmp_exp)
tmp_exp = re.sub(r'(\d+)G', r'\1e9', tmp_exp)
tmp_exp = re.sub(r'(\d+)M', r'\1e6', tmp_exp)
tmp_exp = re.sub(r'(\d+)k', r'\1e3', tmp_exp)
tmp_exp = re.sub(r'(\d+)m', r'\1e-3', tmp_exp)
tmp_exp = re.sub(r'(\d+)u', r'\1e-6', tmp_exp)
tmp_exp = re.sub(r'(\d+)n', r'\1e-9', tmp_exp)
tmp_exp = re.sub(r'(\d+)p', r'\1e-12', tmp_exp)
tmp_exp = re.sub(r'(\d+)f', r'\1e-15', tmp_exp)
tmp_exp = re.sub(r'(\d+)a', r'\1e-18', tmp_exp)

tmp_exp will come out as 1e18+3e15+0.5e12-7e9-6e6+2.5e3-1e-3+3.7e-6+4e-9+13e-12-59e-15-73e-18+0.5e-5

  • The main difference between your solution and mine is that it has to process the formula multiple times, rather than just once. Unless performance is an issue, that's not a problem, but yes, it does feel somewhat inelegant to write nearly the same line of code a dozen times. There's nothing to stop you iterating over a dict instead, though ... – Zero Piraeus Feb 07 '15 at 14:47
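For what it's worth, the "iterate over a dict" idea from the comment above might look roughly like this. It is only a sketch: PREFIXES and expand_prefixes are made-up names, and the formula is still processed once per prefix, exactly as in the answer.

```python
import re

# Prefix -> exponent table, mirroring the substitutions listed in the answer.
PREFIXES = {
    "E": 18, "P": 15, "T": 12, "G": 9, "M": 6, "k": 3,
    "m": -3, "u": -6, "n": -9, "p": -12, "f": -15, "a": -18,
}


def expand_prefixes(formula):
    """Replace <digits><prefix> with <digits>e<exponent>, one prefix at a time."""
    for prefix, exponent in PREFIXES.items():
        # \g<1> is the unambiguous spelling of group 1 in a replacement string.
        formula = re.sub(r"(\d+)%s" % prefix, r"\g<1>e%d" % exponent, formula)
    return formula


myformula = '1E+3P+0.5T-7G-6M+2.5k-1m+3.7u+4n+13p-59f-73a+0.5e-5'
tmp_exp = expand_prefixes(myformula)
print(tmp_exp)
```

For the formula in the answer this gives the same tmp_exp string as the line-by-line version.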