3

I have 2 columns, which we'll call x and y. I want to create a new column called xy:

x    y    xy
1         1
2         2

     4    4
     8    8

There shouldn't be any conflicting values, but if there are, y takes precedence. If it makes the solution easier, you can assume that x will always be NaN where y has a value.

Chris Tang
JesusMonroe

3 Answers

4

This can be quite simple if your example is accurate:

df = df.fillna(0)      # needed first if the blanks are NaN; fillna returns a new DataFrame, so assign it back
df['xy'] = df['x'] + df['y']
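
The addition above only matches the question when each row has at most one value; if both x and y were filled in, the sum would mix them rather than letting y win. Here is a minimal sketch of one way to honour the stated precedence, assuming the blanks are NaN (the sample frame, including the conflicting last row, is made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row has a conflict (x=5, y=9).
df = pd.DataFrame({'x': [1, 2, np.nan, np.nan, 5],
                   'y': [np.nan, np.nan, 4, 8, 9]})

# Take y where it is present, otherwise fall back to x.
df['xy'] = df['y'].fillna(df['x'])
```

`df['y'].combine_first(df['x'])` should give the same result here.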
SuperStew
3

Note that your columns are currently strings, not numeric, so convert them first:

df = df.apply(lambda x : pd.to_numeric(x, errors='coerce'))

df['xy'] = df[['x', 'y']].sum(axis=1)
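
One caveat with summing: by default, `sum` returns 0 for a row in which both columns are blank. Here is a small sketch, assuming you would rather keep such rows as NaN, using the `min_count` argument (the string data below is illustrative, with empty strings standing in for the blanks):

```python
import pandas as pd

# Illustrative string data; row 2 is blank in both columns.
df = pd.DataFrame({'x': ['1', '2', '', '', ''],
                   'y': ['', '', '', '4', '8']})

# Convert each column to numeric; values that cannot be parsed become NaN.
df = df.apply(pd.to_numeric, errors='coerce')

# min_count=1 requires at least one non-NaN value per row; otherwise the result is NaN.
df['xy'] = df[['x', 'y']].sum(axis=1, min_count=1)
```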

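Alternatively, if you keep the columns as strings (with empty strings for the blanks), you can concatenate them row by row:

df['xy'] = df[['x', 'y']].astype(str).apply(''.join, axis=1)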

#df[['x','y']].astype(str).apply(''.join,1)
Out[655]: 
0    1.0
1    2.0
2       
3    4.0
4    8.0
dtype: object
BENY
0

You can also use NumPy:

import pandas as pd, numpy as np

df = pd.DataFrame({'x': [1, 2, np.nan, np.nan],
                   'y': [np.nan, np.nan, 4, 8]})

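# Boolean indexing flattens the array row by row and keeps the non-NaN entries.
# This assumes exactly one non-NaN value per row, so the result has one entry per row.
arr = df.values
df['xy'] = arr[~np.isnan(arr)].astype(int)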

print(df)

     x    y  xy
0  1.0  NaN   1
1  2.0  NaN   2
2  NaN  4.0   4
3  NaN  8.0   8
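
The boolean-mask approach relies on every row having exactly one non-NaN value; otherwise the flattened result will not have one entry per row. Here is a hedged alternative using `np.where`, which also lets y take precedence on conflicts (the frame below, including the conflicting last row, is illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row has values in both columns.
df = pd.DataFrame({'x': [1, 2, np.nan, np.nan, 5],
                   'y': [np.nan, np.nan, 4, 8, 9]})

# Pick y where it is not NaN, otherwise x.
df['xy'] = np.where(df['y'].notna(), df['y'], df['x'])
```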
jpp