2

How to merge multiple column values into one column of same data frame and get new column with unique values.

 Column1  Column2  Column3  Column4  Column5
 0    a        1        2        3        4
 1    a        3        4        5
 2    b        6        7        8
 3    c        7        7        

Output:

Column A
a
a
b
c
1
3
6
7
2
4
5
8
Sociopath
  • 13,068
  • 19
  • 47
  • 75
Ayush
  • 95
  • 2
  • 8
  • Column1 Column2 Column3 Column4 Column5 0 a 1 2 3 4 1 a 3 4 5 2 b 6 7 8 3 c 7 7 – Ayush Sep 21 '18 at 07:04
  • Possible duplicate of [Merge multiple column values into one column in python pandas](https://stackoverflow.com/questions/33098383/merge-multiple-column-values-into-one-column-in-python-pandas) – Rahul Agarwal Sep 21 '18 at 07:06

2 Answers2

3

Use unstack or melt for reshape, remove missinf values by dropna and duplicates by drop_duplicates:

df1 = df.unstack().dropna().drop_duplicates().reset_index(drop=True).to_frame('A')

df1 = df.melt(value_name='A')[['A']].dropna().drop_duplicates().reset_index(drop=True)
print (df1)
    A
0   a
1   b
2   c
3   1
4   3
5   6
6   7
7   2
8   4
9   5
10  8
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
0

Here is another way to do it if you are ok using numpy. This will handle either nans or empty strings in the original dataframe and is a bit faster than unstack or melt.

import pandas as pd
import numpy as np

df = pd.DataFrame({'Column1': ['a', 'a', 'b', 'c'],
                   'Column2': [1, 3, 6, 7],
                   'Column3': [2, 4, 7, 7],
                   'Column4': [3, 5, 8, np.nan],
                   'Column5': [4, '', '', np.nan]})

u = pd.unique(df.values.flatten(order='F'))
u = u[np.where(~np.isin(u, ['']) & ~pd.isnull(u))[0]]
df1 = pd.DataFrame(u, columns=['A'])

print(df1)

    A
0   a
1   b
2   c
3   1
4   3
5   6
6   7
7   2
8   4
9   5
10  8
b2002
  • 914
  • 1
  • 6
  • 10