How do I add and subtract 2 numbers stored in 1 column?

  bedrooms
0 1 + 1
1 2 - 1

If I use this code:

df['bedrooms'] = pd.eval(df['bedrooms'])

I get this error message:

Traceback (most recent call last):

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-11-d8172a031240>", line 1, in <module>
    df['bedrooms'] = pd.eval(df['bedrooms'])

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/computation/eval.py", line 322, in eval
    parsed_expr = Expr(expr, engine=engine, parser=parser, env=env, truediv=truediv)

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/computation/expr.py", line 830, in __init__
    self.terms = self.parse()

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/computation/expr.py", line 847, in parse
    return self._visitor.visit(self.expr)

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/computation/expr.py", line 437, in visit
    raise e

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/computation/expr.py", line 431, in visit
    node = ast.fix_missing_locations(ast.parse(clean))

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)

  File "<unknown>", line 1
    [4 ,4 +1 ,2 ,0 ,5 ,4 +1 ,2 -3 ,5 ,3 ,7 ,4 ,3 ,3 ,5 +1 ,4 ,2 ,1 +1 ,5 +1 ,3 ,5 ,4 +1 ,3 ,2 ,4 ,4 ,4 ,3 +1 ,1 -2 ,2 ,1 ,6 ,4 ,1 ,6 +1 ,3 -4 ,6 +1 ,2 +1 ,3 ,0 -4 ,2 +2 ,3 +1 ,4 +1 ,6 ,4 ,3 ,3 +1 ,4 ,4 +1 ,3 +1 ,4 ,4 +1 ,1 -3 ,3 ,3 ,3 -4 ,3 ,3 ,2 ,5 ,4 +1 ,3 ,4 ,3 -5 ,4 +1 ,4 +1 ,1 ,4 ,4 ,4 ,4 ,4 +1 ,4 +1 ,4 ,4 ,6 +,1 -5 ,5 ,5 ,4 -5 ,6 +1 ,3 ,4 ,3 ,5 +1 ,6 ,5 +1 ,5 ,5 +1 ,5 +1 ,4 ,4 +1 ,3 ,3 ,4 ,3 +1 ,5 +1 ,4 ,4 +1 ,4 ,3 -5 ,...]
                                                                                                                                                                                                                                                                                                                             ^
SyntaxError: invalid syntax

I just found out that these are the values that can't be parsed:

74      6+
441     7+
459     4+
518     5+
558     5+
610     3+
990     5+
1585    7+
Name: bedrooms, dtype: object
halfer
Nurdin
  • `df['total'] = pd.eval(df['bedrooms'])` ..? – Chris Adams Jan 21 '20 at 14:07
  • might need to strip trailing "+" and "-" first... something like `pd.eval(df['bedrooms'].str.strip('+- '))` – Chris Adams Jan 21 '20 at 14:36
  • still happened... – Nurdin Jan 21 '20 at 14:38
  • Mohammed: a quick reminder that "please help" begging is not necessary here, and will be removed, and that "politeness" added to questions when you know brevity is required is not polite at all. Similarly, "TQ" may be short for "thank you" on Twitter, but it is not a word here, and is best not added. – halfer Jan 23 '20 at 09:21
  • Remember that editors are volunteers here, and when you give them work to do wilfully, you are wasting their valuable time that could be spent improving other material. The amount of woeful material that comes in daily is high, and we do not have enough (good quality) editors to keep up. – halfer Jan 23 '20 at 09:23

1 Answer

I believe you need pandas.eval:

df['new'] = pd.eval(df['bedrooms'])
print (df)
  bedrooms  new
0    1 + 1    2
1    2 - 1    1

EDIT: The problem in the data is the value 6 +. One possible solution, which parses it as 6, is to use Series.str.rstrip:

df = pd.DataFrame({'bedrooms': "4 ,4 +,5 +1, 5+, 6+ ".split(',') * 200})

df['bedrooms'] = pd.eval(df['bedrooms'].str.rstrip('+- '))

Or:

df['bedrooms'] = df['bedrooms'].str.rstrip('+- ').apply(pd.eval)
print (df)
     bedrooms
0           4
1           4
2           6
3           5
4           6
..        ...
995         4
996         4
997         6
998         5
999         6

[1000 rows x 1 columns]
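
The same clean-up can be tried on a few plain strings before touching the DataFrame. This is only a minimal sketch and not the pd.eval approach above: the hypothetical helper eval_bedrooms strips trailing +, - and spaces in the same way, then sums the signed integers it finds. It assumes every cell holds only whole numbers joined by + or -.

```python
import re

# Hypothetical helper, not part of pandas.
# Matches an optional sign, optional spaces, then a run of digits.
SIGNED_INT = re.compile(r"([+-]?)\s*(\d+)")

def eval_bedrooms(text):
    """Evaluate strings such as '1 + 1', '2 - 1' or '6 +' as integer sums."""
    # Drop dangling operators, mirroring Series.str.rstrip('+- ')
    cleaned = text.strip().rstrip("+- ")
    total = 0
    for sign, digits in SIGNED_INT.findall(cleaned):
        value = int(digits)
        total += -value if sign == "-" else value
    return total

samples = ["1 + 1", "2 - 1", "6 +", " 5+", "3 -5"]
results = [eval_bedrooms(s) for s in samples]
print(results)  # [2, 1, 6, 5, -2]
```

On the DataFrame this would be applied as df['bedrooms'].apply(eval_bedrooms).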

EDIT1:

You can find problematic values:

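import numpy as np
import pandas as pd

def f(x):
    # Return NaN for any value that pd.eval cannot evaluate
    try:
        return pd.eval(x)
    except Exception:
        return np.nan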

df['bedrooms1'] = df['bedrooms'].apply(f)

a = df.loc[df['bedrooms1'].isna(), 'bedrooms']
print (a)
74    6 +
Name: bedrooms, dtype: object
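
A narrower check is also possible, because the traceback in the question shows the failure coming from ast.parse. The sketch below assumes the only problem is text that is not a complete Python expression. It uses a hypothetical helper, find_unparsable, on a plain dict of sample values rather than on the real DataFrame, and it catches only SyntaxError instead of every exception.

```python
import ast

def find_unparsable(values):
    """Return the {index: text} pairs that are not complete Python expressions."""
    bad = {}
    for index, text in values.items():
        try:
            # mode="eval" accepts a single expression, e.g. "1 + 1"
            ast.parse(text.strip(), mode="eval")
        except SyntaxError:
            bad[index] = text
    return bad

# Sample values only; index 74 mirrors the problematic row reported above.
sample = {0: "1 + 1", 1: "2 - 1", 74: "6 +", 75: "4"}
bad_rows = find_unparsable(sample)
print(bad_rows)  # {74: '6 +'}
```

Because a Series also provides items(), the same function should accept df['bedrooms'] directly.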
jezrael