How to create a new column from common elements of other multiple columns?

Question

I have been looking for a solution and trying out different approaches to solve the problem in the title but haven't succeeded unfortunately.

Here is an example of what I would like to do. Say you have a dataframe with four columns:

df = pd.DataFrame(
    {
        "c1": ["A", "B", "C", "D", "E", "F"],
        "c2": ["A", "B", "X", "D", "C", "E"],
        "c3": ["B", "C", "A", "X", "E", "G"],
        "c4": ["D", "E", "B", "C", "X", "A"],
    },
    index=[0, 1, 2, 3, 4, 5],
)

Now I would like to create a fifth column that will contain only the common elements of the other four:

df = pd.DataFrame(
{
        "c1": ["A", "B", "C", "D", "E", "F"],
        "c2": ["A", "B", "X", "D", "C", "E"],
        "c3": ["B", "C", "A", "X", "E", "G"],
        "c4": ["D", "E", "B", "C", "X", "A"],
        "c5": ["A", "B", "C", "E", NaN, NaN]
    },
    index=[0, 1, 2, 3, 4, 5],
)

In other words, if an element is in multiple columns from c1 to c4, include it in column c5.

This should be possible by using the merge function, but as I said haven't succeeded.

please explain your logic, what do you mean by *that will contain only the common elements of the other four* — anky, May 13 '21 at 19:25
@ScottBoston the index is not really important in the first step. It can be reset after having the desired column. — Bora, May 13 '21 at 19:51
"*This should be possible by using the merge function, but as I said haven't succeeded.*" This sentence seems out of place/unnecessary. *Why* do you think it should be possible using the `merge` function? — TylerH, May 13 '21 at 20:21
Why do you want to create another column? What is the relationship of this variable with the other ones? Why can't it be on another dataframe? — Braiam, May 13 '21 at 20:42
Possible duplicate https://stackoverflow.com/q/46556169/792066 — Braiam, May 13 '21 at 20:53

score 1 · Accepted Answer · answered May 13 '21 at 19:28

You can use functools reduce:

from functools import reduce
import numpy as np

df['col5'] = pd.Series(reduce(lambda x,y: (np.intersect1d(x, y)), df.T.values))

RESULT:

  c1 c2 c3 c4 col5
0  A  A  B  D    A
1  B  B  C  E    B
2  C  X  A  B    C
3  D  D  X  C    E
4  E  C  E  X  NaN
5  F  E  G  A  NaN

How to create a new column from common elements of other multiple columns?

1 Answers1