I have the following example:

import pandas as pd
from copy import copy, deepcopy

class DataFrameWrapper(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def __eq__(self, other):
        return self.equals(other)


t1 = DataFrameWrapper(pd.DataFrame({'a': [1, 2, 3]}))
t2 = deepcopy(t1)
t3 = copy(t1)

print(type(t1), ' ', type(t2), ' ', type(t3))

Output:

<class 'DataFrameWrapper'>   <class 'pandas.core.frame.DataFrame'>   <class 'pandas.core.frame.DataFrame'>

Can anyone tell me why copy and deepcopy return a plain pandas DataFrame instead of a DataFrameWrapper?

The only purpose of the DataFrameWrapper class is to let me compare pandas DataFrames with ==.

Sagar Gupta
Taran

1 Answer

This happens because pandas defines its own __copy__ and __deepcopy__ methods, and they return the base DataFrame type rather than your subclass. For your purpose, you can override both methods in the wrapper:

class DataFrameWrapper(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def __eq__(self, other):
        return self.equals(other)

    def __copy__(self):
        return DataFrameWrapper(super().__copy__())

    def __deepcopy__(self, memo=None):
        return DataFrameWrapper(super().__deepcopy__(memo))

Or wrap the results of copy and deepcopy explicitly:

t1 = DataFrameWrapper(pd.DataFrame({'a': [1, 2, 3]}))
t2 = DataFrameWrapper(deepcopy(t1))
t3 = DataFrameWrapper(copy(t1))

See "How to override the copy/deepcopy operations for a Python object?" for how defining __copy__() and __deepcopy__() works. pandas does the same thing in pandas/core/generic.py.
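
Another route is the one the pandas documentation describes for subclassing: override the _constructor property so that operations which build a new frame, copying included, return the subclass. The following is only a sketch under that assumption. It reuses the DataFrameWrapper name from the question, drops the pass-through __init__, and has not been checked against every pandas version:

```python
from copy import copy, deepcopy

import pandas as pd


class DataFrameWrapper(pd.DataFrame):
    @property
    def _constructor(self):
        # pandas calls this whenever it needs to build a new frame
        # from an existing one, which includes copy() and deepcopy().
        return DataFrameWrapper

    def __eq__(self, other):
        # Whole-frame comparison that returns a single bool.
        return self.equals(other)


t1 = DataFrameWrapper({'a': [1, 2, 3]})
t2 = deepcopy(t1)
t3 = copy(t1)

print(type(t1).__name__, type(t2).__name__, type(t3).__name__)
```

With this in place all three names should report DataFrameWrapper, and no __copy__ or __deepcopy__ override is needed.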

To test the theory above (and reproduce the pandas behavior without pandas), here is a small test:

from copy import copy, deepcopy

class Base:
    def __copy__(self):
        # Hard-codes Base, so a subclass loses its type when copied
        return Base()

    def __deepcopy__(self, memo=None):
        # Same problem for deep copies
        return Base()


class Inherited(Base):
    pass


B = Base()
I = Inherited()
I_copy = copy(I)
I_deepcopy = deepcopy(I)

print('Type B :', type(B))
print('Type I :', type(I))
print('Type I_copy :', type(I_copy))
print('Type I_deepcopy :', type(I_deepcopy))

Which outputs:

Type B : <class '__main__.Base'>
Type I : <class '__main__.Inherited'>
Type I_copy : <class '__main__.Base'>
Type I_deepcopy : <class '__main__.Base'>
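
For contrast, here is a sketch of the same toy classes with the copy hooks written so that the subclass type survives. It builds the copy with type(self)() instead of naming Base directly. The class names are the same illustrative ones as above, and this is not how pandas itself is written:

```python
from copy import copy, deepcopy


class Base:
    def __copy__(self):
        # type(self) is the runtime class, so subclasses copy as themselves
        return type(self)()

    def __deepcopy__(self, memo):
        # These toy objects hold no state, so a fresh instance is enough
        return type(self)()


class Inherited(Base):
    pass


i = Inherited()
i_copy = copy(i)
i_deepcopy = deepcopy(i)

print('Type i_copy :', type(i_copy).__name__)          # Inherited
print('Type i_deepcopy :', type(i_deepcopy).__name__)  # Inherited
```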
Sagar Gupta