-1

Here is the data frame:

Name| File amount| other amount | other name
A      123         48             a
B      456         48             a
C      789         49             a
A                  48             b
A                  48             c
B                  48             d

What I want is the following:

Name is the ID and each name must have a column 'file amount', and what I want is group all the lines together based on Name, and put 'other name' on the right side if they have more than 1 column, add on the right side. and leave blank if they don't have the 'other name'.

Name| File amount| other_1| other_1 amount| other_2| other_2 amount| other_3| other_3 amount
A     123          a         48             b            48           c         48
B     456          a         48             d            48
C     789          a         49
MoRe
  • 2,296
  • 2
  • 3
  • 23
  • 1
    `df2 = (df.assign(col=df.groupby('Name').cumcount().add(1)) .pivot(index='Name', columns='col') ) ; df2.columns = df2.columns.map(lambda x: f'{x[0]}_{x[1]}')` – mozway Jul 11 '22 at 04:06

2 Answers2

0
df2 = df.groupby('Name').agg({
    'File amount': 'first',
    'other name': list,
    'other amount': list,
})
df3 = pd.concat([df2['File amount'], pd.DataFrame(df2['other name'].tolist(), index=df2.index).rename(columns=lambda x: 'other_' + str(x + 1)),
pd.DataFrame(df2['other amount'].tolist(), index=df2.index).rename(columns=lambda x: 'other_' + str(x + 1) + ' amount')],axis=1)
df3 = df3.reindex(sorted(df3.columns, key=lambda x: x.split(' ')[0]),axis=1)
MoRe
  • 2,296
  • 2
  • 3
  • 23
  • what if add another key 'Name_1', how to do that? groupby(['Name', 'Name_1']) – huo shankou Jul 11 '22 at 03:16
  • If 'other name' 'other amount' they have null values, when you do the transformation, the null value will also occupy one column, that will incur that extra columns are created that are not needed. – huo shankou Jul 11 '22 at 03:18
0

One more solution here!

other_name_cols = sum([[f"other_{i+1}", f"other_{i+1} amount"] for i in range(df.Name.value_counts().max())], list())
df_rs = df[['Name', 'File amount']].dropna().set_index('Name')
df_rs[other_name_cols] = None
for i, name in enumerate(df_rs.index):
    records = [x for y in df[df['Name'] == name][["other amount", "other name"]].values for x in y]
    df_rs.iloc[i, 1: 1+len(records)] = records
Lazyer
  • 917
  • 1
  • 6
  • 16