I have this dataframe:

       ALPHA  DELTA  BETA  GAMMA
    0      #      1   NaN    NaN
    1      #    NaN     2      5
    2      #    NaN   NaN      3
    3      #      8     4      5

The goal is to add a new column, NEW, that holds the name of each column (other than ALPHA) with a non-NaN value in that row. A row with several non-NaN values should be duplicated, once per such column.

The expected result is this dataframe:

NEW    ALPHA  DELTA  BETA  GAMMA
DELTA      #      1   NaN    NaN
BETA       #    NaN     2      5
GAMMA      #    NaN     2      5
GAMMA      #    NaN   NaN      3
DELTA      #      8     4      5
BETA       #      8     4      5
GAMMA      #      8     4      5

I don't know where to start. Can you help, please? Thanks!

  • Does this answer your question? [Get all the rows with and without NaN in pandas dataframe](https://stackoverflow.com/questions/54061265/get-all-the-rows-with-and-without-nan-in-pandas-dataframe) with [Replicating rows in a pandas data frame by a column value](https://stackoverflow.com/questions/26777832/replicating-rows-in-a-pandas-data-frame-by-a-column-value) – Yevhen Kuzmovych Feb 15 '23 at 15:47
  • Are you trying to add new column with name 'NEW' or you want to replace index column with NEW? – Pruthvi Feb 15 '23 at 15:54
  • Yes, I am trying to add a new column named 'NEW'! @Pruthvi – Adrien Lambert Feb 15 '23 at 15:54
  • Why ALPHA = '#'? and why not NaN? – Corralien Feb 15 '23 at 16:13

2 Answers

First, add the new column together with its values. The insert method places it in the first position:

df.insert(0, "NEW", ['DELTA', 'BETA', 'GAMMA', 'GAMMA'])

Next, use iloc to copy all values of the last row into a new row, then change the value of the NEW column in that new row, since it differs from the original.

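import pandas as pd
new_row = df.iloc[[-1]].copy()  # copy so the assignment below does not modify a slice of df
new_row["NEW"] = "DELTA"
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
df = pd.concat([df, new_row], ignore_index=True)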

Note that the new rows are added manually here. If you have many rows, you can automate this with a loop. You could also do it in reverse, adding the rows first and the column afterwards, but the logic becomes more complicated.
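
As a rough illustration of the loop idea, here is a minimal sketch. It assumes the frame from the question and that every column except ALPHA should be checked; the names `value_cols`, `rows` and `result` are made up for this example.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "ALPHA": ["#", "#", "#", "#"],
    "DELTA": [1, np.nan, np.nan, 8],
    "BETA": [np.nan, 2, np.nan, 4],
    "GAMMA": [np.nan, 5, 3, 5],
})

# Assumption: every column except ALPHA is a candidate for NEW.
value_cols = ["DELTA", "BETA", "GAMMA"]

rows = []
for _, row in df.iterrows():
    for col in value_cols:
        # Emit one copy of the row per non-NaN value, tagged with the column name.
        if pd.notna(row[col]):
            rows.append({"NEW": col, **row.to_dict()})

# Build the result once at the end instead of growing df row by row.
result = pd.DataFrame(rows, columns=["NEW", *df.columns])
print(result)
```

Collecting the rows in a list and building the frame once is usually cheaper than concatenating inside the loop, but for large frames a vectorized approach is likely to be faster still.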

Pruthvi

Your input is not really clear but you can use:

# .iloc[:, 1:] and .columns[1:] to skip ALPHA column
>>> (df.assign(NEW=df.iloc[:, 1:].notna()
                     .dot(df.columns[1:] + ',')
                     .str.split(',').str[:-1])
       .explode('NEW'))

  ALPHA  DELTA  BETA  GAMMA    NEW
0     #    1.0   NaN    NaN  DELTA
1     #    NaN   2.0    5.0   BETA
1     #    NaN   2.0    5.0  GAMMA
2     #    NaN   NaN    3.0  GAMMA
3     #    8.0   4.0    5.0  DELTA
3     #    8.0   4.0    5.0   BETA
3     #    8.0   4.0    5.0  GAMMA
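
If the dot/split chain is hard to read, the same idea can be written in separate steps. This is only a sketch of an alternative, not the one-liner above: it swaps the dot trick for a plain list comprehension over the boolean mask, and the names `value_cols`, `mask`, `names` and `out` are chosen for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "ALPHA": ["#", "#", "#", "#"],
    "DELTA": [1, np.nan, np.nan, 8],
    "BETA": [np.nan, 2, np.nan, 4],
    "GAMMA": [np.nan, 5, 3, 5],
})

value_cols = df.columns[1:]       # skip the ALPHA column
mask = df[value_cols].notna()     # True where a value is present

# For each row, keep the names of the columns whose mask entry is True.
names = pd.Series(
    [list(value_cols[flags]) for flags in mask.to_numpy()],
    index=df.index,
)

# explode turns each list element into its own row, repeating the other columns.
out = df.assign(NEW=names).explode("NEW")
print(out)
```

As in the output above, the original index labels are repeated for duplicated rows; chaining `.reset_index(drop=True)` would renumber them if that is preferred.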
Corralien