2

I have a dataframe with four columns: ID, Phone1, Phone2, and Phone3. I would like to create a new dataframe with three columns: ID, Phone, PhoneSource. If I do an append as in this question:

df['Column 1'].append(df['Column 2']).reset_index(drop=True)

I obtain half of what I want: all the Phone numbers are in the same column. But how do I keep the source?

Community
  • 1
  • 1
Mike Atomat
  • 93
  • 3
  • 11
  • 1
    can you post raw data and code to create your 2 dfs and what the desired output looks like to avoid ambiguity – EdChum Nov 01 '16 at 09:26
  • I see your point @EdChum, but I thought that the previous question I linked was giving some context. Lazy on my part though, I admit! – Mike Atomat Nov 01 '16 at 09:48
  • Well the form here is to post code that reproduces your problem and the desired result, linking to a related question doesn't necessarily inform us why that question didn't quite fully resolve your particular example hence why you should do this – EdChum Nov 01 '16 at 09:51

1 Answers1

3

I think you can use melt:

df = pd.DataFrame({'ID':[2,3,4,5],
                   'Phone 1':['A', 'B', 'C', 'D'],
                   'Phone 2':['E', 'F', 'G', 'H'],
                   'Phone 3':['A', 'C', 'G', 'H']})
print (df)
   ID Phone 1 Phone 2 Phone 3
0   2       A       E       A
1   3       B       F       C
2   4       C       G       G
3   5       D       H       H

print (pd.melt(df, id_vars='ID', var_name='PhoneSource', value_name='Phone'))
    ID PhoneSource Phone
0    2     Phone 1     A
1    3     Phone 1     B
2    4     Phone 1     C
3    5     Phone 1     D
4    2     Phone 2     E
5    3     Phone 2     F
6    4     Phone 2     G
7    5     Phone 2     H
8    2     Phone 3     A
9    3     Phone 3     C
10   4     Phone 3     G
11   5     Phone 3     H

Another solution with stack:

df1 = df.set_index('ID').stack().reset_index()
df1.columns = ['ID','PhoneSource','Phone']
print (df1)
    ID PhoneSource Phone
0    2     Phone 1     A
1    2     Phone 2     E
2    2     Phone 3     A
3    3     Phone 1     B
4    3     Phone 2     F
5    3     Phone 3     C
6    4     Phone 1     C
7    4     Phone 2     G
8    4     Phone 3     G
9    5     Phone 1     D
10   5     Phone 2     H
11   5     Phone 3     H
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • Thanks a lot! I had never realised there was a melt function, I only knew it from R. This is amazing :) I'll also look closely to the stacking answer, because I never had to use it. But at the moment the melt works and it's fantastic! – Mike Atomat Nov 01 '16 at 09:47
  • any idea if I could go back to a one liner? I have merged it with another column and would like now to have 3 columns for the phones and this new column. I've tried unstack but it doesn't really work. Thanks! – Mike Atomat Nov 12 '16 at 22:59
  • You can check [this solution](http://stackoverflow.com/a/40529872/2901002) – jezrael Nov 13 '16 at 08:34
  • Wow perfect, that helps a lot @jezrael! Thanks – Mike Atomat Nov 14 '16 at 09:40
  • True! It's nice seeing unstack :) – Mike Atomat Nov 14 '16 at 09:57
  • Thank you. small advice - I think you can add `numpy` tag to http://stackoverflow.com/questions/40585377/pandas-identify-unique-triplets-from-a-df – jezrael Nov 14 '16 at 09:58