I am trying to generate a DataFrame from a list of dictionaries. The list of dictionaries is generated via a list comprehension over the objects.
import pandas as pd


class Foo:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    @property
    def rep(self):
        # One dictionary per object; each becomes a row of the DataFrame.
        return {'a': self.a, 'b': self.b}


class Bar:
    def __init__(self):
        self.container = [Foo('1', '2'), Foo('2', '3'), Foo('3', '4')]

    def data(self):
        return [x.rep for x in self.container]


class Base:
    def __init__(self):
        self.all = {'A': [Bar(), Bar(), Bar()], 'B': [Bar(), Bar(), Bar()]}

    def test(self):
        list_of_reps = []
        # Collect the dictionaries from every Bar under every key.
        [list_of_reps.extend(b.data()) for bar in [self.all[x] for x in self.all] for b in bar]
        pd.DataFrame(list_of_reps)


if __name__ == '__main__':
    b = Base()
    b.test()
I then use the Base class to combine all the dictionaries from the Foo objects. There can be several thousand of them, and as the list grows, both the conversion to a DataFrame and the data() method in Bar become slow. Is there a more efficient way to generate this?
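For comparison, here is a minimal sketch of one alternative that could be timed against the code above. It assumes only the columns 'a' and 'b' are needed, flattens the nested containers with itertools.chain.from_iterable, and builds the rows as tuples rather than one dictionary per object. The helper name build_frame is hypothetical, and whether this is actually faster would need to be measured on the real data.

```python
from itertools import chain

import pandas as pd


class Foo:
    def __init__(self, a, b):
        self.a = a
        self.b = b


class Bar:
    def __init__(self):
        self.container = [Foo('1', '2'), Foo('2', '3'), Foo('3', '4')]


def build_frame(groups):
    """Hypothetical helper: flatten {key: [Bar, ...]} into one DataFrame."""
    # Lazily walk every Foo in every Bar without building intermediate lists.
    foos = chain.from_iterable(
        bar.container for bars in groups.values() for bar in bars
    )
    # One tuple per object; the column names are supplied once, below.
    rows = [(foo.a, foo.b) for foo in foos]
    return pd.DataFrame(rows, columns=['a', 'b'])


groups = {'A': [Bar(), Bar(), Bar()], 'B': [Bar(), Bar(), Bar()]}
df = build_frame(groups)
```

Timing both versions with timeit on a realistically sized input would show whether the dictionary creation or the DataFrame construction dominates.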