Hello Stack Overflow folks, I hope this question has not already been answered. After half a day of googling I have resigned myself to asking here. My problem is the following:
I want to create a class which takes some information and processes this information:
import gzip
import pandas as pd

# Class definition for an instance of raw data
class raw_data():
    def __init__(self, filename_rawdata, filename_metadata,
                 file_format, path, category, df_raw, df_meta):
        self.filename_rawdata = filename_rawdata
        self.filename_metadata = filename_metadata
        self.file_format = file_format
        self.path = path
        self.category = category
        self.df_raw = getDF(self.filename_rawdata)
        self.df_meta = getDF(self.filename_metadata)

    # generator that yields one parsed record per line of the gzip file
    def parse(self, path):
        g = gzip.open(path, 'rb')
        for l in g:
            yield eval(l)

    # function that returns a pandas dataframe with the data
    def getDF(self, filename):
        i = 0
        df = {}
        for d in self.parse(filename):
            df[i] = d
            i += 1
        return pd.DataFrame.from_dict(df, orient='index')
Now I have a problem with the __init__ method: I would like the getDF method below to run by default when the class is instantiated, but I cannot get this working. I have seen several other posts here, such as "Calling a class function inside of __init__" and "Python 3: Calling a class function inside of __init__", but I am still not able to do it. The approach from the first post did work for me, but I would like to be able to access the instance variable after the constructor has run.
I tried this:
class raw_data():
    def __init__(self, filename_rawdata, filename_metadata,
                 file_format, path, category):
        self.filename_rawdata = filename_rawdata
        self.filename_metadata = filename_metadata
        self.file_format = file_format
        self.path = path
        self.category = category
        getDF(self.filename_rawdata)
        getDF(self.filename_metadata)

    # generator
    def parse(self, path):
        g = gzip.open(path, 'rb')
        for l in g:
            yield eval(l)

    # function that returns a pandas dataframe with the data
    def getDF(self, filename):
        i = 0
        df = {}
        for d in self.parse(filename):
            df[i] = d
            i += 1
        return pd.DataFrame.from_dict(df, orient='index')
But I get an error because getDF is not defined (obviously). I hope this question is not a silly one. I need to do it this way because afterwards I want to create around 50-60 instances, and I do not want to repeat Instance.getDF() for every instance; I would rather have it called automatically when the instance is created.
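For reference, here is a minimal sketch of the behaviour described above, under a few assumptions. Inside a class body a method has to be called through the instance (`self.method(...)`), because a bare `getDF(...)` is looked up as a module-level name. The names `RawData`, `get_records`, `write_gz` and the file names are hypothetical. To stay dependency-free, the sketch returns the plain index-to-record dict that the original code passes to `pd.DataFrame.from_dict`, and it uses `ast.literal_eval` in place of `eval`, which assumes every line of the file is a plain Python literal.

```python
import ast
import gzip
import os
import tempfile


class RawData:
    """Sketch: load both files as soon as the instance is created."""

    def __init__(self, filename_rawdata, filename_metadata):
        self.filename_rawdata = filename_rawdata
        self.filename_metadata = filename_metadata
        # Call the method through self and keep the result on the instance,
        # so it is still available after the constructor has run.
        self.df_raw = self.get_records(self.filename_rawdata)
        self.df_meta = self.get_records(self.filename_metadata)

    def parse(self, path):
        # Generator: yields one parsed literal per line of the gzip file.
        with gzip.open(path, "rt") as g:
            for line in g:
                # Assumption: each line is a plain literal such as a dict.
                yield ast.literal_eval(line.strip())

    def get_records(self, filename):
        # Builds the same {index: record} dict that the original code
        # hands to pd.DataFrame.from_dict(df, orient='index').
        return dict(enumerate(self.parse(filename)))


def write_gz(path, rows):
    # Hypothetical helper that creates small test files for the sketch.
    with gzip.open(path, "wt") as f:
        for row in rows:
            f.write(repr(row) + "\n")


tmp = tempfile.mkdtemp()
raw_path = os.path.join(tmp, "raw.json.gz")
meta_path = os.path.join(tmp, "meta.json.gz")
write_gz(raw_path, [{"id": 1, "rating": 5}, {"id": 2, "rating": 3}])
write_gz(meta_path, [{"id": 1, "title": "a"}])

data = RawData(raw_path, meta_path)
print(data.df_raw[1]["rating"])  # 3
```

With this layout, creating many instances in a loop or list comprehension loads the data each time without any extra call per instance.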