I have a pandas DataFrame df that I want to reorganize by column names and element values to create a new DataFrame df0. For example:
import numpy as np
import pandas as pd
np.random.seed(10)
df = pd.DataFrame()
df['date'] = pd.date_range(start='2021-01-01', end='2021-04-01', freq='D')
df = df.set_index('date')
df['speciesA'] = np.random.randint(2, size=len(df))
df['speciesB'] = np.random.randint(3, size=len(df))
# new df0, set table elements as columns
df0 = pd.DataFrame()
df0['date'] = df.index
df0 = df0.set_index('date')
for k in df.columns.tolist():  # iterate over columns
    for jj in range(len(df)):  # iterate over each row
        cell = df.iloc[jj]  # cell value
        # iterate over table elements
        newcolumnname = str(k)+'_'+str(cell[k])
        df0[newcolumnname] = 0
        df0.iloc[jj][newcolumnname] = 1
        #df0.iloc[jj][newcolumnname] = df.iloc[jj][str(k)]
print(df0.head())
where the original DataFrame df has the form
            speciesA  speciesB
date
2021-01-01         1         2
2021-01-02         1         0
2021-01-03         0         2
2021-01-04         1         2
2021-01-05         0         0
I want to create a new DataFrame df0:
            speciesA_1  speciesA_0  speciesB_2  speciesB_0  speciesB_1
date
2021-01-01           1           0           1           0           0
2021-01-02           1           0           0           1           0
2021-01-03           0           1           1           0           0
2021-01-04           1           0           1           0           0
2021-01-05           0           1           0           1           0
Note that each df0 column name (e.g. speciesA_1) consists of a df column name (speciesA) and an element value (1), so the corresponding df0 elements indicate True/False: 1 where the df column holds that value in that row, 0 otherwise.
My code above gives the warning "A value is trying to be set on a copy of a slice from a DataFrame". I don't understand why this is happening or how to fix it.
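A likely cause, offered as a sketch rather than a definitive diagnosis: df0.iloc[jj][newcolumnname] = 1 is chained indexing. df0.iloc[jj] first returns a row object that may be a copy, so the assignment may never reach df0. Separately, df0[newcolumnname] = 0 runs on every row, which resets the column and wipes any 1 set earlier. The code below rebuilds the same df (constructed slightly differently, for brevity), then shows two possible fixes: a loop that creates each column only once and assigns with a single .loc call, and pandas.get_dummies, which one-hot encodes the columns directly. The names name and df1 are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Same data as in the question (same seed, same order of randint calls).
np.random.seed(10)
dates = pd.date_range(start='2021-01-01', end='2021-04-01', freq='D', name='date')
df = pd.DataFrame(index=dates)
df['speciesA'] = np.random.randint(2, size=len(df))
df['speciesB'] = np.random.randint(3, size=len(df))

# Option 1: keep the loop, but
#   - create each indicator column only once, so earlier 1s are not reset
#   - assign with one .loc call instead of chained df0.iloc[jj][name]
df0 = pd.DataFrame(index=df.index)
for k in df.columns:
    for jj in range(len(df)):
        name = f"{k}_{df[k].iloc[jj]}"
        if name not in df0.columns:
            df0[name] = 0
        df0.loc[df0.index[jj], name] = 1

# Option 2: let pandas build the indicator columns in one step.
# dtype=int requests 0/1 integers instead of booleans.
df1 = pd.get_dummies(df, columns=list(df.columns), dtype=int)

print(df0.head())
print(df1.head())
```

Both options should hold the same values. The loop creates columns in order of first appearance (speciesA_1, speciesA_0, ...), whereas get_dummies orders them by value within each original column, so reorder with df0[df1.columns] before comparing them.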