How to compare two cell in each pandas DataFrame row and set result in new cell in same row

Question

In pandas DataFrame, i need to create result column, that is result of comparison of two columns.

If in column SECOND - NaN, then in result column - FIRST.

If in column SECOND - not NaN, then in result column - SECOND.

	first	second
0	test	Nan
1	test2	test3

->

	result	first	second
0	test	test	Nan
1	test3	test2	test3

df = pandas.read_csv(file)
df["result"] = df["second"] if df["second"] else df["first"]

Please provide enough code so others can better understand or reproduce the problem. — Community, May 26 '22 at 10:08

score 0 · Answer 1 · answered May 26 '22 at 10:22

0

import pandas as pd
import numpy as np
df = pd.DataFrame({
    'first': ['A', 'C'],
    'second': ['B', None]
})
df['result'] = np.where(
    df['second'].isna(),
    df['first'],
    df['second']
)

answered May 26 '22 at 10:22

ThSorn

507
4
7

1

Please don't post code-only answers. Future readers will be grateful to see explained *why* this answers the question instead of having to infer it from the code. – Gert Arnold May 26 '22 at 18:52

Tom B. · Answer 2 · 2022-05-26T10:42:49.627

You can do this using the pandas apply function.

If your dataframe is,

df = pd.DataFrame({'first' : ['test', 'test2'], 'second' : [np.nan, 'test3']})
print(df)

    first   second
0   test    NaN
1   test2   test3

You can then use the apply function to create the 'result' column based on the other two columns:

df['result'] = df.apply(lambda row: row['first'] if pd.isna(row['second']) 
                                    else row['second'],
               axis=1)

The important part is axis=1 as that tells apply to use rows rather than columns.

If you'd prefer to not use a messy looking lambda function, the following code does the same thing:

def func(row):
    if pd.isna(row['second']):
        return row['first']
    else:
        return row['second']
df['result'] = df.apply(lambda row: func(row), axis=1)

Both of these produce the following dataframe:

    first   second  result
0   test    NaN     test
1   test2   test3   test3

How to compare two cell in each pandas DataFrame row and set result in new cell in same row

2 Answers2