I have a df that looks like this:
column1 column2 column3 column4
1 2 nan 4
1 2 3 nan
1 2 nan nan
1 2 nan nan
How do I reshape the dataframe so that every NaN cell is dropped and, where a column repeats the same value, only one instance of it is kept?
The new df should look like this:
column1 column2 column3 column4
1 2 3 4
I have roughly 500 columns with spotty data like this.
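For anyone who wants to reproduce this, here is a minimal sketch that builds the four-column sample shown above. The column names and values are copied from the table; representing the missing cells with `numpy.nan` is an assumption about how the real data stores them.

```python
import numpy as np
import pandas as pd

# Sample frame matching the table above; np.nan marks the missing cells (assumed).
df = pd.DataFrame({
    "column1": [1, 1, 1, 1],
    "column2": [2, 2, 2, 2],
    "column3": [np.nan, 3, np.nan, np.nan],
    "column4": [4, np.nan, np.nan, np.nan],
})

print(df)
```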
Edit:
I used this line of code to move the values with the spotty data into one row.
df = df.apply(lambda x: pd.Series(x.dropna().values))
The new df looks like this:
column1 column2 column3 column4
1 2 3 4
1 2 nan nan
1 2 nan nan
1 2 nan nan
Then I drop the duplicates:
df = df.drop_duplicates()
df looks like this now:
column1 column2 column3 column4
1 2 3 4
1 2 nan nan
I'm not sure why the NaN values are not being dropped after this point with the line below, even though the duplicate rows are being dropped:
pivoted_df = pivoted_df.dropna()
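As a sanity check, here is a self-contained sketch of the three steps on the sample data only. It assumes that `dropna()` is called on the same frame that `drop_duplicates()` returned. The line above works on `pivoted_df`, a different name from `df`, which may or may not matter in the real code. The names `compact`, `deduped` and `result` are made up for illustration. With its default `how="any"`, `dropna()` removes every row that contains at least one NaN.

```python
import numpy as np
import pandas as pd

# Sample frame from the question; np.nan for missing cells is an assumption.
df = pd.DataFrame({
    "column1": [1, 1, 1, 1],
    "column2": [2, 2, 2, 2],
    "column3": [np.nan, 3, np.nan, np.nan],
    "column4": [4, np.nan, np.nan, np.nan],
})

# Step 1: push the non-NaN values of each column to the top.
compact = df.apply(lambda x: pd.Series(x.dropna().values))

# Step 2: drop rows that are exact duplicates of an earlier row.
deduped = compact.drop_duplicates()

# Step 3: drop any row that still contains a NaN (default how="any").
result = deduped.dropna()

print(result)
```

On this small sample the frame ends up with a single row. Whether the same holds for the roughly 500 real columns depends on each column having exactly one distinct non-NaN value, which this sketch does not check.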