I'd like to score very large files using models built in R.
The idea is to extract the actual predictor equation from the R model object and define a Python string containing the equation.
The header of the large predictor file uses the same predictor names as those used to build the model (the predictors for model development and for scoring were generated by the same Python code).
I'd like to score the large predictor file with Python, which avoids having to split the file into chunks for R to process, even though R's predict function is an attractive alternative.
So I've checked "How do I execute a string containing Python code in Python?" and other posts. Since eval and exec are frowned upon in the Python community, I am wondering what the most Pythonic way is to dynamically apply an equation to a set of predictors stored in a CSV file. Thanks.
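import csv
import io

# Small in-memory stand-in for the large predictor file
predfile = io.StringIO(
'''x1,x2
1,2
3,4''')
# Predictor equation extracted from the R model
eq = '1 + 2*x1 + 3*x2'
reader = csv.reader(predfile, delimiter=',')
header = next(reader)
for row in reader:
    # Bind each predictor name to its value, then evaluate the equation
    exec("{0}={1}".format(header[0], row[0]))
    exec("{0}={1}".format(header[1], row[1]))
    exec("yhat={0}".format(eq))
    print(yhat)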
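One possible direction that avoids exec and eval altogether is to parse the equation once with the standard-library ast module and then walk the resulting tree for each row, allowing only numbers, predictor names, and basic arithmetic. The following is a minimal sketch written for Python 3, not a definitive implementation. The names compile_equation and score_rows are hypothetical helpers introduced here for illustration, and the sketch assumes every column in the file is numeric and that the equation uses only +, -, *, / and **.

```python
import ast
import csv
import io
import operator

# Whitelisted operators; any other element in the equation is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def compile_equation(eq):
    """Parse `eq` once; return a function mapping {name: value} to a number.

    Hypothetical helper. Unsupported elements (calls, attributes, ...)
    raise ValueError when the returned function is evaluated.
    """
    tree = ast.parse(eq, mode="eval")

    def evaluate(node, values):
        if isinstance(node, ast.Expression):
            return evaluate(node.body, values)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return values[node.id]  # KeyError if the predictor is missing
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            left = evaluate(node.left, values)
            right = evaluate(node.right, values)
            return _BIN_OPS[type(node.op)](left, right)
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](evaluate(node.operand, values))
        raise ValueError("unsupported element: %s" % type(node).__name__)

    return lambda values: evaluate(tree, values)


def score_rows(lines, eq):
    """Yield one prediction per data row, reading the input lazily."""
    predict = compile_equation(eq)
    # DictReader takes the column names from the header line.
    for row in csv.DictReader(lines):
        # Assumption: every column holds a numeric value.
        yield predict({name: float(value) for name, value in row.items()})


predfile = io.StringIO("x1,x2\n1,2\n3,4\n")
for yhat in score_rows(predfile, "1 + 2*x1 + 3*x2"):
    print(yhat)
```

Because score_rows is a generator and csv.DictReader reads one line at a time, an open file object can be passed in place of the StringIO buffer without loading the whole file into memory. Walking the tree for every row is slow in pure Python, so for very large files a vectorized library may be a better fit.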