I have two dataframes that I concatenate, and right afterwards I rename the columns of the second one to match the first. Is there a way in Python to merge two duplicated columns? (The idea is that, across identically named columns, a NaN is replaced whenever the other column holds a non-null value.)

Note: I know I could rename my columns first and then concatenate, but this leads to an index error that I can't solve.
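For reference, a minimal sketch of a rename-first route that avoids concatenation altogether, assuming both frames share the same row index. The frame names `df1` and `df2` and the original column name `Info_r2_c1` are hypothetical, and the values are taken from the example below. `DataFrame.combine_first` fills the NaNs of the first frame with the values found at the same position in the second one:

```python
import numpy as np
import pandas as pd

# Hypothetical input frames; only the values come from the example in the question.
df1 = pd.DataFrame({"Info_r1_c1": [np.nan, np.nan, 300, np.nan, 600, np.nan, 6.9]})
df2 = pd.DataFrame({"Info_r2_c1": [np.nan, 198, np.nan, np.nan, np.nan, 460, np.nan]})

# Rename the second frame's column so it lines up with the first one.
df2 = df2.rename(columns={"Info_r2_c1": "Info_r1_c1"})

# Keep df1's values and fill its NaNs from df2 wherever df2 is non-null.
merged = df1.combine_first(df2)
print(merged)
```

Because nothing is concatenated, no duplicated column names are ever created, so this may sidestep the index error, although the exact cause of that error is not known here.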

Example:

Info_r1_c1 Info_r1_c1
nan        nan
nan        198
300        nan
nan        nan
600        nan
nan        460
6.9        nan

And I would like this result: a single remaining column in which the missing values are filled in from all the identically named columns.

Info_r1_c1
nan 
198
300
nan
600
460
6.9 

Thanks for any help, because I'm really stuck with these duplicated columns.

Ouhla

1 Answer

A simple apply with a lambda should probably do the job. Try this:

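import numpy as np
import pandas as pd

df = pd.DataFrame({"Info_r1_c1": [np.nan, 20, 30, np.nan],
                   "Info_r1_c2": [10, np.nan, np.nan, 40]})
df.columns = ["Info_r1_c1", "Info_r1_c1"]  # give both columns the same name
dup_col_name = "Info_r1_c1"  # set this to the column name that is duplicated in df
# df[dup_col_name] selects both duplicated columns; per row, keep the second value unless it is NaN
df["Info_r1_c1_Final"] = df[dup_col_name].apply(lambda x: list(x)[0] if pd.isna(list(x)[1]) else list(x)[1], axis=1)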

# Output of df
   Info_r1_c1  Info_r1_c1  Info_r1_c1_Final
0         NaN        10.0              10.0
1        20.0         NaN              20.0
2        30.0         NaN              30.0
3         NaN        40.0              40.0
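As an alternative that is not tied to exactly two columns, here is a minimal sketch that collapses every group of identically named columns in one go. It assumes the duplicated columns never hold two different non-null values in the same row (if they do, the leftmost one wins). The data is the example from the question:

```python
import numpy as np
import pandas as pd

# Example data from the question, with the column name duplicated on purpose.
df = pd.DataFrame(
    [[np.nan, np.nan], [np.nan, 198], [300, np.nan], [np.nan, np.nan],
     [600, np.nan], [np.nan, 460], [6.9, np.nan]],
    columns=["Info_r1_c1", "Info_r1_c1"],
)

# Transpose so the duplicated column names become duplicated row labels,
# group on those labels, keep the first non-null value in each group,
# then transpose back to the original orientation.
merged = df.T.groupby(level=0).first().T
print(merged)
```

The result has a single `Info_r1_c1` column, and a row stays NaN only when every duplicate is NaN in that row.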

Hope this helps.

Sachin Kohli