2

I will like to merge 2 columns into 1 column and remove nan.

I have this data:

     Name       A       B   
    Pikachu   2007    nan
    Pikachu   nan     2008
    Raichu    2007    nan
    Mew       nan     2018

Expected Result:

     Name     Year   
    Pikachu   2007   
    Pikachu   2008   
    Raichu    2007   
    Mew       2008 

Code I tried:

df['Year']= df['A','B'].astype(str).apply(''.join,1)

But my result is this:

 Name     Year   
Pikachu   2007nan   
Pikachu   nan2008   
Raichu    2007nan   
Mew       nan2008

4 Answers4

4

Use Series.fillna with DataFrame.pop for extract columns and last convert to integers:

df['Year']= df.pop('A').fillna(df.pop('B')).astype(int)

#if possible some missing values in Year column
#df['Year']= df.pop('A').fillna(df.pop('B')).astype('Int64')
print (df)
      Name  Year
0  Pikachu  2007
1  Pikachu  2008
2   Raichu  2007
3      Mew  2018
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
2

Could you please try following.

df['Year']=df['A'].combine_first(df['B'])
df

Output will be as follows.

     Name    A       B      Year
0   Pikachu 2007.0  NaN     2007.0
1   Pikachu NaN     2008.0  2008.0
2   Raichu  2007.0  NaN     2007.0
3   Mew     NaN     2018.0  2018.0


To get only Name and year columns in a new data frame try following.

df['Year']=df['A'].combine_first(df['B'])
df1=df[['Name','Year']]
df1
RavinderSingh13
  • 130,504
  • 14
  • 57
  • 93
1
df = df.fillna(0)
df["Year"] = df["A"] + df["B"]
df = df[['Name','Year']]
Pyd
  • 6,017
  • 18
  • 52
  • 109
0

numpy.where could be usefull

df["A"] = np.where(df["A"].isna(), df["B"], df["A"]).astype("int")
df = df.drop("B", axis=1)
print(df)
    Name    Year
0   Pikachu 2007
1   Pikachu 2008
2   Raichu  2007
3   Mew     2018
Iron Hand Odin
  • 410
  • 4
  • 10