How to avoid using 'eval' function to loop through a list of attributes

Question

I have a pandas DataFrame with a column containing instances of a class that has a number of attributes. I want to expand some of those attributes into new columns. I have working code, but it looks a bit nasty and uses eval. What is a more Pythonic way to do this?

import pandas as pd

#Boilerplate for minimal, reproducible example
class cl:
    class inner:
        na1 = "nested atribute one"
        na2 = "nested atribute two"
    def __init__(self, name):
        self.name = name
    a1 = "atribute one"
    a2 = "atribute one"
    inner_atts = inner()


class_object1 = cl("first")
class_object2 = cl("second")
data = [class_object1,class_object2]
data_frame = pd.DataFrame(data,columns=['class object'])
####################
info_to_get = {'name','a1','a2','inner_atts.na1','inner_atts.na2'}

for x in info_to_get:
    sr = 'y.{0}'.format(x)
    data_frame['{0}'.format(x)] = data_frame['class object'].apply(lambda y: eval(sr,{'y':y}))

print(data_frame)

This is a bad way to use a pandas dataframe. – cs95 Jun 07 '19 at 02:32

score 2 · Accepted Answer · answered Jun 07 '19 at 02:29

Use operator.attrgetter:

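import operator

df = data_frame  # df is the question's data_frame
info_to_get = list(info_to_get)  # a set has no order; a list fixes the column order
df[info_to_get] = pd.DataFrame(df['class object'].apply(operator.attrgetter(*info_to_get)).tolist())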

Output:

                             class object       inner_atts.na1  \
0  <__main__.cl object at 0x7f08002d27b8>  nexted atribute one   
1  <__main__.cl object at 0x7f08002d2a90>  nexted atribute one   

        inner_atts.na2            a2   name            a1  
0  nexted atribute two  atribute one  first  atribute one  
1  nexted atribute two  atribute one    two  atribute one
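
For readers wondering why the dotted names work here: attrgetter splits each name on the dots and looks the attributes up one after another, and when given several names it returns a tuple in the order the names were passed. A minimal standard-library sketch of that behaviour follows; Inner and Outer are hypothetical stand-ins for the question's classes, and the reduce line is only a rough manual equivalent for a single path.

```python
from functools import reduce
from operator import attrgetter


class Inner:
    na1 = "nested one"


class Outer:
    inner_atts = Inner()

    def __init__(self, name):
        self.name = name


obj = Outer("first")

# One dotted name: returns the value of the nested attribute.
nested = attrgetter("inner_atts.na1")(obj)

# Several names: returns a tuple, in the order the names were given.
pair = attrgetter("name", "inner_atts.na1")(obj)

# Rough manual equivalent for a single dotted path.
manual = reduce(getattr, "inner_atts.na1".split("."), obj)
```

Because the result order follows the argument order, converting the set to a list first is what keeps the values lined up with the column names.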

cs95 · Answer 2 · 2019-06-07T02:58:04.710

The first thing to understand about pandas is that it is not suited to storing and working with data it can't vectorize, such as arbitrary Python objects. There is a lot of overhead, and you are better off keeping such objects in a list and looping over them.

That said, I would do this using a list comprehension.

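from operator import attrgetter

df = data_frame  # df is the question's data_frame
info_to_get = list(info_to_get)  # columns= needs an ordered list, not a set
f = attrgetter(*info_to_get)
pd.DataFrame([f(c) for c in df['class object']], columns=info_to_get)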

        inner_atts.na2    name            a2       inner_atts.na1            a1
0  nexted atribute two   first  atribute one  nexted atribute one  atribute one
1  nexted atribute two  second  atribute one  nexted atribute one  atribute one

Evidence suggests that, for non-vectorizable data, list comprehensions give you the most speedup.
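
Putting the pieces together, here is a self-contained sketch of the list-comprehension approach that also joins the extracted columns back onto the original frame. The class names Cl and Inner, the attribute strings, and the join step are assumptions added for illustration, not part of the code above.

```python
from operator import attrgetter

import pandas as pd


# Hypothetical stand-ins for the question's classes.
class Inner:
    na1 = "nested attribute one"
    na2 = "nested attribute two"


class Cl:
    a1 = "attribute one"
    inner_atts = Inner()

    def __init__(self, name):
        self.name = name


data_frame = pd.DataFrame({"class object": [Cl("first"), Cl("second")]})

# A list (not a set) keeps the column order fixed.
info_to_get = ["name", "a1", "inner_atts.na1", "inner_atts.na2"]
getter = attrgetter(*info_to_get)

# One tuple per object, built with a plain list comprehension.
extracted = pd.DataFrame(
    [getter(obj) for obj in data_frame["class object"]],
    columns=info_to_get,
    index=data_frame.index,  # reuse the index so the join lines up row by row
)
result = data_frame.join(extracted)
```

Passing index=data_frame.index matters only when the original frame does not have the default 0..n-1 index, but it is cheap insurance against misaligned rows.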
