I would like to define a class whose instances inherit the attributes of another class, but whose initialization is done by calling an existing method from that other class. One can try:

import pandas as pd

class Data():
    def __init__(self, file):
        self.df = pd.read_csv(file)

if __name__ == "__main__":
    abc = Data("xyz.csv")

In this manner, the attribute 'abc.df' is a DataFrame instance initialized by pandas.read_csv(), which reads the "xyz.csv" file. How can the initialization be implemented so that 'abc' itself is the DataFrame instance, instead of 'abc.df'? It would be something like:

import pandas as pd

class Data():
    def __init__(self, file):
        self = pd.read_csv(file) # self has all the attributes of a DataFrame instance

if __name__ == "__main__":
    abc = Data("xyz.csv")

Of course, this won't work.
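
The reason it fails can be shown without pandas. This is a minimal sketch using a hypothetical class named `Rebind`: inside `__init__`, `self` is an ordinary local name, so assigning to it only rebinds that name and leaves the object already created for the caller unchanged.

```python
class Rebind:
    def __init__(self, value):
        # Assigning to self only rebinds the local name inside __init__;
        # the object Python already created for the caller is untouched.
        self = [value]
        self.append("only the local list sees this")


obj = Rebind(1)
print(type(obj).__name__)  # Rebind
print(vars(obj))           # {}
```

The caller still receives an empty `Rebind` instance, and the list built inside `__init__` is discarded when the method returns.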

Edit: I found a relevant discussion, "How to subclass Pandas.DataFrame". However, I am still curious whether there is a way to create an instance by using an existing method of another class.

  • `def data(file): return pd.read_csv(file)`...!? – deceze Jun 15 '15 at 20:06
  • Why do you need to write such a class at all, instead of just calling `read_csv` and using the result? – BrenBarn Jun 15 '15 at 20:12
  • deceze, BrenBarn: The purpose is not restricted to pandas. My intention is to initialize an instance that has all the attributes and methods of an existing instance. On top of that, I could add further attributes and methods for other purposes if needed. – Jing-Qiang Goh Jun 16 '15 at 05:54
  • deceze: How can declaring a separate function help to initialize an instance that meets the requirement mentioned above? – Jing-Qiang Goh Jun 16 '15 at 19:11
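
For the general, non-pandas case raised in these comments, one common alternative to inheritance is delegation. The following is only a sketch under stated assumptions: `Wrapper` and `describe` are hypothetical names, and a plain list stands in for the existing instance.

```python
class Wrapper:
    """Hypothetical wrapper that forwards unknown attributes to an existing object."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup fails, so attributes
        # defined on Wrapper itself take precedence.
        if name == "_wrapped":
            # Guard against infinite recursion if _wrapped is not set yet.
            raise AttributeError(name)
        return getattr(self._wrapped, name)

    def describe(self):
        # An extra method added on top of the wrapped object's interface.
        return f"wrapping a {type(self._wrapped).__name__} with {len(self._wrapped)} items"


w = Wrapper([3, 1, 2])
w.sort()               # forwarded to list.sort
print(w.count(1))      # 1
print(w.describe())    # wrapping a list with 3 items
```

One limitation of this design: implicit special-method lookups such as `len(w)` or `w[0]` bypass `__getattr__`, so they would need to be defined explicitly on the wrapper.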

1 Answer

You could make Data a subclass of pandas.DataFrame and, in the constructor, use from_csv to read the CSV data (not verified):

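import pandas as pd

class Data(pd.DataFrame):
    def __init__(self, file):
        super(Data, self).__init__()
        # Caveat: from_csv is a classmethod that returns a new DataFrame;
        # its result is discarded here, so self is not populated.
        self.from_csv(file)

if __name__ == "__main__":
    abc = Data("xyz.csv")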
gdh
  • Thank you for the suggestion. I may have done it the wrong way when I tried to subclass 'Data' from 'pandas.DataFrame' – Jing-Qiang Goh Jun 16 '15 at 19:47
  • I tried to use `super().from_csv(file)` within `__init__(self, file)`. The program does not complain when I create an instance with `abc = Data("xyz.csv")`. However, it raises `RuntimeError: maximum recursion depth exceeded` when I try to print `abc.shape`. Furthermore, `from_csv` behaves slightly differently from `read_csv`; refer to this discussion. – Jing-Qiang Goh Jun 16 '15 at 20:01
  • Unfortunately, the updated answer does not work: we can't access attributes such as `abc.shape` or `abc.columns` in this manner. – Jing-Qiang Goh Jun 16 '15 at 20:27
  • Updated, see also http://stackoverflow.com/questions/13229750/how-to-add-attributes-to-a-subclass-of-pandas-dataframe . The thing might not be supported: https://github.com/pydata/pandas/issues/2485 – gdh Jun 16 '15 at 21:05
  • Thank you for your suggestion. As mentioned, pandas does not support this kind of subclassing, and we can't access the attributes of a DataFrame by following the suggested answer. Perhaps it is not possible in Python to initialize an instance by using an existing method of another class. – Jing-Qiang Goh Jun 20 '15 at 12:40
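
For readers on a more recent pandas, here is a hedged sketch of one way the subclassing idea can be made to work. It is not the approach from the answer above. It assumes a pandas version where overriding the `_constructor` property is the documented subclassing hook, and it leaves `__init__` untouched so pandas can still build instances internally. The names `from_csv_file` and `n_cells` are hypothetical, and an in-memory buffer stands in for "xyz.csv".

```python
import io

import pandas as pd


class Data(pd.DataFrame):
    @property
    def _constructor(self):
        # Ask pandas to return Data (not plain DataFrame) from its operations.
        return Data

    @classmethod
    def from_csv_file(cls, file, **kwargs):
        # Hypothetical factory: parse with pandas.read_csv, then wrap the result.
        return cls(pd.read_csv(file, **kwargs))

    def n_cells(self):
        # Example of an extra method layered on top of DataFrame.
        rows, cols = self.shape
        return rows * cols


abc = Data.from_csv_file(io.StringIO("a,b\n1,2\n3,4\n"))
print(isinstance(abc, pd.DataFrame))  # True
print(abc.shape)                      # (2, 2)
print(abc.n_cells())                  # 4
```

Using a classmethod factory instead of overriding `__init__(self, file)` keeps the constructor signature that pandas itself relies on when it creates new frames from existing ones.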