Data Manipulation - Get data from another column if `nan`

Question

I have a Pandas DataFrame of 23 columns and 1119 rows.

Here is the issue, columns 13, 14, 20 and 21 are of float dtype.

If data in column 13 and 14 is nan, then they are present in 20 and 21, and vice versa.

I want to create a column, if value is missing, get from the other.

Example: column 13 and 14 is nan then get value from 20 and 21.

Here is what I came up with, I created a function and iterated using itertuples

def AP_calc(df):
    for i in df.itertuples():
        if i[20]==np.nan & i[21]==np.nan:
           pool = i[13] + i[14]
        else:
            pool = i[20] + i[21]
        return pool

then used an apply function but this does not work.

df["test"] = df[['AP in %','AP_M in %','FixP in €','FixP C in €']].apply(AP_calc,axis=1)

I have tried other methods too but not working, please help me out, please

share a sample of the dataframe, with only the four affected columns. https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples — sammywemmy, Feb 21 '20 at 06:07
You likely did not read @sammywemmy's link before you posted that, because that is not what was asked for. Please also read [Why not upload images of code on SO when asking a question?](https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-on-so-when-asking-a-question) to see why. — Amadan, Feb 21 '20 at 06:14

jezrael · Answer 1 · 2020-02-21T06:21:28.790

1

Use numpy.where with mask created by Series.isna:

m = df['FixP in €'].isna() & df['FixP C in €'].isna()
df["test"] = np.where(m, df['AP in %'] + df['AP_M in %'], df['FixP in €'] + df['FixP C in €'])

Or:

c1 = ['FixP in €','FixP C in €']
c2 = ['AP in %','AP_M in %']

m = df[c2].isna().all(axis=1)
df["test"] = np.where(m, df[c1].sum(axis=1), df[c2].sum(axis=1))

Alternative with selecting by positions with DataFrame.iloc:

c1 = [20,21]
c2 = [13,14]

m = df.iloc[:, c2].isna().all(axis=1)
df["test"] = np.where(m, df.iloc[:, c1].sum(axis=1), df.iloc[:, c2].sum(axis=1))

edited Feb 21 '20 at 06:21

answered Feb 21 '20 at 06:15

jezrael

822,522
95
1,334
1,252

Is there a way I could contact you? – JA-pythonista Feb 21 '20 at 06:23
@A.JT - I have email in my profile. Only run code for email. – jezrael Feb 21 '20 at 06:24
Just did, check your mailbox – JA-pythonista Feb 21 '20 at 06:27
Is there a way I could add € for the ones coming from df['FixP in €'] + df['FixP C in €'] and % from the ones coming from df['AP in %'] + df['AP_M in %'] – JA-pythonista Feb 21 '20 at 06:49
@A.JT - Not sure if understand, do you need change number e.g. `20` to `20€` ? – jezrael Feb 21 '20 at 06:53
exactly, after adding. Add € to the additions on the columns: 'FixP in €' and FixP C in € and % in the other two. – JA-pythonista Feb 21 '20 at 07:07
@A.JT - there are 3 solutions, what use with real data? – jezrael Feb 21 '20 at 07:08
To the existing data I have. those ones with € are in Euro and the other column is in percentage – JA-pythonista Feb 21 '20 at 07:10
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/208231/discussion-between-jezrael-and-a-j-t). – jezrael Feb 21 '20 at 07:13

score 0 · Answer 2 · answered Feb 21 '20 at 06:15

Just let the columns fill the N/A places of each other:

import pandas as pd
import numpy as np
df = pd.DataFrame({'Bob': [10, 30, np.nan], 
                   'Alice': [13, np.nan, 40]})

df['Bob'].fillna(df['Alice'], inplace=True)
df['Alice'].fillna(df['Bob'], inplace=True)

Data Manipulation - Get data from another column if `nan`

2 Answers2