I currently have a dataframe that looks like this:

User    Date     FeatureA FeatureB
John    DateA      1        2
John    DateB      3        5

Is there any way to combine the two rows so that it becomes:

  User    Date1    Date2    FeatureA1 FeatureB1 FeatureA2 FeatureB2
  John    DateA    DateB        1        2          3        5
Derrick Peh
  • Do you want to combine all rows that have the same `User`? Can you have more than two rows that have the same `User`? – DYZ Mar 21 '18 at 06:06
  • @DyZ I would like to combine all rows that have the same User such that the dataframe would only have unique Users – Derrick Peh Mar 21 '18 at 06:08
  • Do you have an answer to the second part of my question? – DYZ Mar 21 '18 at 06:12
  • @DyZ For the 2nd part of your question, I do not wish to have more than one row that have the same User – Derrick Peh Mar 21 '18 at 06:14
  • You misunderstood my question. Can the original data have _more than two_ rows that have the same `User`? – DYZ Mar 21 '18 at 06:15
  • Yes, the original data can have more than 2 rows of the same user – Derrick Peh Mar 21 '18 at 06:17
  • Possible duplicate of [Pandas long to wide reshape](https://stackoverflow.com/questions/22798934/pandas-long-to-wide-reshape) – DYZ Mar 21 '18 at 06:21

1 Answer

I think you need:

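# Number each user's rows 0, 1, 2, ... in order of appearance
g = df.groupby('User').cumcount()
# Move that counter into the columns, leaving one row per user
df = df.set_index(['User', g]).unstack()
# Flatten the (name, counter) column pairs into names like 'Date1'
df.columns = ['{}{}'.format(i, j + 1) for i, j in df.columns]
df = df.reset_index()
print(df)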
   User  Date1  Date2  FeatureA1  FeatureA2  FeatureB1  FeatureB2
0  John  DateA  DateB          1          3          2          5

Explanation:

  1. Number the rows within each `User` group with `cumcount`
  2. Create a MultiIndex from `User` and that counter with `set_index`
  3. Reshape from long to wide with `unstack`
  4. Flatten the MultiIndex in the columns
  5. Convert the index back to a column with `reset_index`
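Since the comments mention that a user can have more than two rows, here is a minimal self-contained sketch of the same steps on uneven groups. The sample data (including the second user `Mary`) and the helper name `long_to_wide` are assumptions made for illustration, not part of the original question. The sketch also reorders the columns so that they are grouped by occurrence number, which is closer to the feature layout asked for; users with fewer rows get `NaN` in the extra columns.

```python
import pandas as pd

# Assumed sample data: the question's two rows for John plus a hypothetical
# second user with three rows, to show groups of different sizes.
df = pd.DataFrame({
    "User": ["John", "John", "Mary", "Mary", "Mary"],
    "Date": ["DateA", "DateB", "DateC", "DateD", "DateE"],
    "FeatureA": [1, 3, 7, 8, 9],
    "FeatureB": [2, 5, 4, 6, 0],
})


def long_to_wide(frame, key="User"):
    """Spread repeated rows per key into numbered columns (hypothetical helper)."""
    # 0-based position of each row within its key group
    pos = frame.groupby(key).cumcount()
    # Index by (key, position), then pivot the position level into the columns
    wide = frame.set_index([key, pos]).unstack()
    # Reorder columns by occurrence number, keeping the original column order
    # within each occurrence
    value_cols = [c for c in frame.columns if c != key]
    n = int(pos.max()) + 1
    order = pd.MultiIndex.from_tuples(
        [(c, i) for i in range(n) for c in value_cols]
    )
    wide = wide.reindex(columns=order)
    # Flatten ('Date', 0) -> 'Date1', and so on
    wide.columns = [f"{name}{i + 1}" for name, i in wide.columns]
    # Turn the key index back into an ordinary column
    return wide.reset_index()


result = long_to_wide(df)
print(result)
```

Dropping the `reindex` step gives the column order shown in the answer's output (all `Date` columns first, then all `FeatureA` columns, and so on).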
jezrael