11

Is it possible to work with pandas DataFrame as with an Excel spreadsheet: say, by entering a formula in a column so that when variables in other columns change, the values in this column change automatically? Something like:

a  b  c
2  3  =a+b

And so when I update 2 or 3, the column c also updates automatically.

PS: It's clearly possible to write a function to return a+b, but is there any built-in functionality in pandas or in other Python libraries to work with matrices this way?

Anton Tarasenko
  • 8,099
  • 11
  • 66
  • 91
  • 1
    For more information on the `pd.eval()` family of functions, their features and use cases, please visit [Dynamic Expression Evaluation in pandas using pd.eval()](https://stackoverflow.com/questions/53779986/dynamic-expression-evaluation-in-pandas-using-pd-eval). – cs95 Dec 16 '18 at 04:49

3 Answers3

14

This will work in 0.13 (still in development)

In [19]: df = DataFrame(randn(10,2),columns=list('ab'))

In [20]: df
Out[20]: 
          a         b
0  0.958465  0.679193
1 -0.769077  0.497436
2  0.598059  0.457555
3  0.290926 -1.617927
4 -0.248910 -0.947835
5 -1.352096 -0.568631
6  0.009125  0.711511
7 -0.993082 -1.440405
8 -0.593704  0.352468
9  0.523332 -1.544849

This will be possible as 'a + b' (soon)

In [21]: formulas = { 'c' : 'df.a + df.b' }

In [22]: def update(df,formulas):
               for k, v in formulas.items():
                  df[k] = pd.eval(v)


In [23]: update(df,formulas)

In [24]: df
Out[24]: 
          a         b         c
0  0.958465  0.679193  1.637658
1 -0.769077  0.497436 -0.271642
2  0.598059  0.457555  1.055614
3  0.290926 -1.617927 -1.327001
4 -0.248910 -0.947835 -1.196745
5 -1.352096 -0.568631 -1.920726
6  0.009125  0.711511  0.720636
7 -0.993082 -1.440405 -2.433487
8 -0.593704  0.352468 -0.241236
9  0.523332 -1.544849 -1.021517

You could implement a hook into setitem on the data frame to have this type of function called automatically. But pretty tricky. You didn't specify how the frame is updated in the first place. Would probably be easiest to simply call the update function after you change the values

Jeff
  • 125,376
  • 21
  • 220
  • 187
12

I don't know it it is what you want, but I accidentally discovered that you can store xlwt.Formula objects in the DataFrame cells, and then, using DataFrame.to_excel method, export the DataFrame to excel and have your formulas in it:

import pandas
import xlwt

formulae=[]
formulae.append(xlwt.Formula('SUM(F1:F5)'))
formulae.append(xlwt.Formula('SUM(G1:G5)'))
formulae.append(xlwt.Formula('SUM(H1:I5)'))
formulae.append(xlwt.Formula('SUM(I1:I5)'))

df=pandas.DataFrame(formula)
df.to_excel('FormulaTest.xls')

Try it...

jtornero
  • 129
  • 2
  • 4
    It doesn't look like the OP wanted to export something to Excel. The post is about imitating the Excel behavior using pandas DataFrame. – Artemix Sep 27 '13 at 10:24
  • 5
    Though @jtornero answered a different question, he answered mine, for which I'm grateful! – sferencik Feb 01 '18 at 23:00
3

There's currently no way to do this exactly in the way that you describe.

In pandas 0.13 there will be a new DataFrame.eval method that will allow you to evaluate an expression in the "context" of a DataFrame. For example, you'll be able to df['c'] = df.eval('a + b').

Phillip Cloud
  • 24,919
  • 11
  • 68
  • 88
  • does the string in eval support functions , like, df['c'] = df.eval('a.diff() + b') ? – bigbug Sep 29 '13 at 06:33
  • No. Store the result of a function call in a variable. – Phillip Cloud Sep 29 '13 at 16:45
  • Is it possible to add function support in string in next version?Even though not bring any performance upgrade. For that will let dataframe accept the user input expression and caculate the result accordingly. (my current solution is using pyparsing to parse and modify the expression, then exec it, somewhat ugly) – bigbug Oct 03 '13 at 15:34
  • Maybe. I'm concerned about possible security issues with allowing arbitrary function calls. – Phillip Cloud Oct 03 '13 at 16:08
  • Even python offer eval and exec, so the security is depend on how to use it . dataframe.eval only support arithmatic is too limited. – bigbug Oct 03 '13 at 16:21
  • It's not only arithmetic, it supports a fairly comprehensive subset of Python expressions. Read the [documentation](http://pandas.pydata.org/pandas-docs/dev/enhancingperf.html#supported-syntax). – Phillip Cloud Oct 03 '13 at 17:00