
I have a pandas DataFrame df with columns of different types; some values in df are NaN.

To test an assumption, I create a copy of df and transform the copy to 0/1 values with pandas.isnull():

df_copy = df
for column in df_copy:
    df_copy[column] = df_copy[column].isnull().astype(int)

but after that BOTH df and df_copy consist of 0 and 1. Why does this code transform df to 0 and 1, and is there a way to prevent it?

tima
  • `df_copy = df` **never creates a copy in Python.** This is quite important to understand, generally, so you really should read: https://nedbatchelder.com/text/names.html – juanpa.arrivillaga Jul 08 '18 at 18:49

2 Answers

You can prevent it by declaring:

df_copy = df.copy()

This creates a new object. Before that, df and df_copy were essentially two pointers to the same object, so a change made through one name showed up under the other. You might also want to check this answer, and note that DataFrames are mutable.
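
To see the difference between plain assignment and .copy(), here is a minimal sketch. The sample data and the name alias are made up for illustration and are not from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"a": [1.0, np.nan], "b": ["x", None]})

alias = df            # a second name bound to the same DataFrame
df_copy = df.copy()   # an independent DataFrame (deep copy by default)

print(alias is df)    # same object
print(df_copy is df)  # different object

# Same transformation as in the question, applied to the real copy.
for column in df_copy:
    df_copy[column] = df_copy[column].isnull().astype(int)

# The original still holds its missing values.
print(df.isnull().sum().sum())
```

Because df_copy is a separate object here, the loop leaves df untouched.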

Btw, you could obtain the desired result simply by:

df_copy = df.isnull().astype(int)
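
As a rough check that this one-liner builds a new DataFrame and does not modify df, assuming a small made-up frame (the name flags is only illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"a": [1.0, np.nan], "b": ["x", None]})

# isnull() returns a new boolean DataFrame; astype(int) maps it to 0/1.
flags = df.isnull().astype(int)

print(flags)
# df itself is unchanged: column "a" still contains one NaN.
print(df["a"].isnull().tolist())
```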
Quickbeam2k1

Even better memory-wise, add flag columns to df itself instead of copying it:

for column in df:
    df[column + 'flag'] = df[column].isnull().astype(int)
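
A small sketch of the same idea on made-up data. Taking a snapshot of the column names with list(df.columns) is an extra precaution added here, not part of the answer above, so that the loop does not depend on columns created inside it:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"a": [1.0, np.nan], "b": ["x", None]})

# Iterate over a fixed list of the original column names.
for column in list(df.columns):
    df[column + "flag"] = df[column].isnull().astype(int)

# Original columns keep their values; the flag columns hold 0/1.
print(df.columns.tolist())
print(df)
```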
ℕʘʘḆḽḘ