How to evaluate lines of data in Python

Question

I have a data file with 300 lines that contains expressions, each describing a combination of values and their associated uncertainties. For example:

(18.13053098972887 +/- 0.9408208676613867) + (4.198532451786269 +/- 1.006181051488966)
(11.64429613156244 +/- 0.8494858154859093) / (9.363430186640471 +/- 1.426559761587884)
(8.380090296880461 +/- 0.7207204182841811) - (14.66227215207273 +/- 1.695262722671032) (14.89348010703303 +/- 0.5526766418243718) - (11.38945635408163 +/- 0.6236755388832851)
(7.799527276109394 +/- 0.2574050770442082) + (16.72086955979466 +/- 1.110203643562272)
(9.608384727728524 +/- 0.4631992350332439) + (10.37543680415251 +/- 1.163914272905167) (4.18157352367416 +/- 0.6524763186462224) / (10.3161260811612 +/- 1.110916984908603)  (3.776332299345897 +/- 1.075189965311438) - (11.53599019583698 +/- 0.7387217730283737) (14.93653570161044 +/- 1.60794403241016) / (11.83556322752483 +/- 0.466637508245185) 
(13.85353967547417 +/- 0.9246529699786543) / (14.20790420838551 +/- 0.3679686461109668) (20.63305806977545 +/- 1.545379194198176) * (10.95731035336255 +/- 1.434931108092665)  (20.80371993819163 +/- 1.273124703682392) + (15.93093231553212 +/- 0.5784831928649479)
(13.61241819963037 +/- 0.04285690967801981) - (7.682740816076352 +/- 0.4521933933719993)

Each line comprises one expression. The format of each term in the expression is (value +/- uncertainty), including the brackets. They are combined via addition, subtraction, multiplication and division. The usual operator precedence applies, that is: division and multiplication take precedence over addition and subtraction. No parentheses are used beyond those containing single expressions.

I need to evaluate each expression and write the result with its uncertainty to an output file with one result per line. Finally, calculate the sum of all expressions and its associated uncertainty.

Can anyone help me with that?

Some of your lines appear to have more than one expression; is that a copy-and-paste error perhaps? — Martijn Pieters, Feb 06 '14 at 13:08
That is not an error; most of the lines really do contain a different number of expressions — Mac, Feb 06 '14 at 13:17
I guess you will have to implement your own parser (where you will define how +/- and other operators work with each other). Not easy at all. Have a look at [Python Lex-Yacc](http://www.dabeaz.com/ply/). — freakish, Feb 06 '14 at 13:18
Go with what @freakish said: the `ply` library is really easy to use if you "grok" Lex/Yacc, and it will leave you with the most flexible and maintainable solution. — Daren Thomas, Feb 06 '14 at 13:24

Daren Thomas · Accepted Answer · 2014-02-06T14:24:31.653

Here is a starting point:

from ply import lex
from ply import yacc

# lexer
tokens = ('OPEN_PAREN', 'NUMBER', 'CLOSE_PAREN', 'ADD', 'SUB', 'MUL', 'DIV')

t_OPEN_PAREN = r'\('
t_NUMBER = r'\d+\.\d+'
t_CLOSE_PAREN = r'\)'
t_ADD = r'\+'
t_SUB = r'-'
t_MUL = r'\*'
t_DIV = r'/'

t_ignore = ' \t'


# parser
def p_expression_add(p):
    'expression : expression ADD term'
    p[0] = ('+', p[1], p[3])


def p_expression_sub(p):
    'expression : expression SUB term'
    p[0] = ('-', p[1], p[3])


def p_expression_term(p):
    'expression : term'
    p[0] = p[1]


def p_term_mul(p):
    'term : term MUL factor'
    p[0] = ('*', p[1], p[3])


def p_term_div(p):
    'term : term DIV factor'
    p[0] = ('/', p[1], p[3])


def p_term_factor(p):
    'term : factor'
    p[0] = p[1]


def p_factor(p):
    'factor : OPEN_PAREN number ADD DIV SUB number CLOSE_PAREN'
    p[0] = (p[2], p[6])


def p_number(p):
    'number : NUMBER'
    p[0] = float(p[1])


# oh, and handle errors
def p_error(p):
    raise SyntaxError("Syntax error in input on line %d" % lex.lexer.lineno)


def parse(text):
    '''
    Parses a string containing one expression and returns the parse tree
    as nested tuples.
    '''
    lexer = lex.lex(debug=True)
    parser = yacc.yacc()
    # pass the text and the lexer explicitly instead of relying on PLY's
    # module-level default lexer
    result = parser.parse(text, lexer=lexer, debug=True)
    return result

if __name__ == '__main__':
    #result = parse('(18.13053098972887 +/- 0.9408208676613867)')
    result = parse('(18.13053098972887 +/- 0.9408208676613867) + (4.198532451786269 +/- 1.006181051488966)')
    print(result)

This script will produce the following output: (note, I stripped the debugging output)

('+', (18.13053098972887, 0.9408208676613867), (4.198532451786269, 1.006181051488966))

Your next steps would be:

- extend the parser to handle multiple expressions per line
  - you might want to define the semantics a bit better than in your question
- go through the result and interpret it
  - how does adding uncertainty affect the result?
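For the first step, one possible reading of the data is that a closing bracket followed directly by an opening bracket, with no operator in between, starts a new expression. A minimal sketch under that assumption (the helper name `split_expressions` is hypothetical and not part of the answer above):

```python
import re

# Assumption: ")" followed only by whitespace and then "(" marks the
# boundary between two expressions on the same line.
_BOUNDARY = re.compile(r'(?<=\))\s+(?=\()')


def split_expressions(line):
    """Split one input line into its individual expression strings."""
    line = line.strip()
    if not line:
        return []
    return _BOUNDARY.split(line)


if __name__ == '__main__':
    sample = '(1.5 +/- 0.1) + (2.5 +/- 0.2) (3.0 +/- 0.3) / (4.0 +/- 0.4)'
    for expr in split_expressions(sample):
        print(expr)
```

Each string returned this way could then be handed to the parser separately, so the grammar itself would not need to change.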

You can also use the parser to do the math if, for instance, adding two uncertain numbers is the same as adding their values and their uncertainties:

def p_expression_add(p):
    'expression : expression ADD term'
    p[0] = (p[1][0] + p[3][0], p[1][1] + p[3][1])

If you replace p_expression_add like this, then the answer produced will be:

(22.32906344151514, 1.9470019191503527)
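Alternatively, the tuple tree from the first version of the parser can be evaluated after parsing. The following is a minimal sketch, not part of the original answer: it assumes the uncertainties are independent and combines them in quadrature (standard first-order error propagation), which differs from the linear addition used above. `combine` and `evaluate` are hypothetical helper names.

```python
import math


def combine(op, left, right):
    """Combine two (value, uncertainty) pairs.

    Assumption: the two uncertainties are independent, so the partial
    contributions to the error are added in quadrature.
    """
    (a, da), (b, db) = left, right
    if op == '+':
        return (a + b, math.hypot(da, db))
    if op == '-':
        return (a - b, math.hypot(da, db))
    if op == '*':
        return (a * b, math.hypot(b * da, a * db))
    if op == '/':
        return (a / b, math.hypot(da / b, a * db / b ** 2))
    raise ValueError('unknown operator: %r' % (op,))


def evaluate(node):
    """Recursively reduce a parse tree to one (value, uncertainty) pair."""
    if len(node) == 2:       # leaf: (value, uncertainty)
        return node
    op, left, right = node   # inner node: (operator, left, right)
    return combine(op, evaluate(left), evaluate(right))


if __name__ == '__main__':
    tree = ('+', (18.13053098972887, 0.9408208676613867),
                 (4.198532451786269, 1.006181051488966))
    print(evaluate(tree))
```

Under the same independence assumption, the overall sum asked for in the question could be accumulated by starting from `(0.0, 0.0)` and calling `combine('+', total, result)` for each evaluated expression.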

Have fun! Simple parsing like this is an incredibly useful tool to have at your disposal, as a lot of tricky problems get easier when looked at as parsing problems.

score 0 · Answer 2 · edited May 23 '17 at 12:11

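Well, I know eval is considered bad practice, but you could try ast.literal_eval(). Note that ast.literal_eval only evaluates Python literals (numbers, strings, tuples, lists, dicts and so on), so each line would first have to be converted into a valid literal. See also the question "Using python's eval() vs. ast.literal_eval()?".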

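import ast

# 'data.txt' is a placeholder file name
with open('data.txt') as f:
    for line in f:
        # raises ValueError or SyntaxError unless the line is a valid Python literal
        value = ast.literal_eval(line.strip())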

You may have to convert some of the strings to floats.
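As a rough sketch of that string-to-float conversion: the example below swaps in a regular expression rather than ast.literal_eval, and the helper name `extract_terms` is hypothetical. It assumes both numbers are plain non-negative decimals, as in the sample data.

```python
import re

# Matches one "(value +/- uncertainty)" term. Assumption: both numbers are
# plain non-negative decimals without exponents, as in the sample data.
_TERM = re.compile(r'\(\s*(\d+(?:\.\d+)?)\s*\+/-\s*(\d+(?:\.\d+)?)\s*\)')


def extract_terms(line):
    """Return the (value, uncertainty) pairs found in a line, as floats."""
    return [(float(v), float(u)) for v, u in _TERM.findall(line)]


if __name__ == '__main__':
    print(extract_terms('(1.5 +/- 0.25) + (2 +/- 0.5)'))
```

This only recovers the numeric terms; the operators between them are discarded, so they would still need to be handled separately.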


answered Feb 06 '14 at 14:30 by justengel
