pandas pivot to limited list of column values

Question

I have a pandas dataframe

Type	Name
Cheese	Mozarella
Cheese	Cheddar
Cheese	Pepper Jack
Bread	Sourdough
Bread	Rye
Pepper	Green

I am trying to group by the Type column, collect the list of Names and split that into at most 2 columns. So the output would be:

Type	Name_primary	Name_alternative
Cheese	Mozarella	Cheddar
Bread	Sourdough	Rye
Pepper	Green

If a type has more than 2 Names, fill the first 2 for Name_primary and Name_alternative
If a type has just one Name, the Name_alternative will be null
Assumption : The first table has no duplicates and is already ordered in some way.

I got to the point of collecting list df.groupby("Type").Name.apply(list).to_frame() or df.pivot_table(index="Type", values="Name", aggfunc=list)

I could do a lambda apply on a user function that checks the length, and gets (val1, val2) or (val1, None) and create Name_Primary and Name_Alternative columns. But is there a simpler way to do this ?

Does this answer your question? Use the linked answer before your current code. [How to keep first two duplicates in a pandas dataframe?](https://stackoverflow.com/questions/32530856/how-to-keep-first-two-duplicates-in-a-pandas-dataframe) — G. Anderson, Mar 01 '23 at 17:33
@G.Anderson Not entirely. I am able to get the top 2 values by doing ```df.groupby("Type").Name.nth[:2].to_frame()``` Its the next part to make those possible 2 values in a group into 2 columns that I am trying to be smart about — newpyspark, Mar 01 '23 at 17:45

score 2 · Answer 1 · answered Mar 01 '23 at 17:50

Lets group the dataframe by type and create records corresponding to top two names inside a comprehension

c = ['Name_primary', 'Name_alternative']
df1 = pd.DataFrame({'Type': k, **dict(zip(c, g['Name']))} 
                   for k, g in df.groupby('type', sort=False))

Alternatively you can reshape the dataframe with pivot after selecting top two rows per Type

# create seq counter to identify unique rows per type
df['idx'] = df.groupby('type').cumcount()

# Filter rows where counter < 2 and pivot to reshape
df1 = df.query('idx < 2').pivot(index='type', columns='idx', values='Name')
df1.columns = ['Name_primary', 'Name_alternative']

Result

     Type Name_primary Name_alternative
0  Cheese    Mozarella          Cheddar
1   Bread    Sourdough              Rye
2  Pepper        Green              NaN

Thanks @Shubham. Comprehension within the group is a great suggestion — newpyspark, Mar 01 '23 at 17:56

pandas pivot to limited list of column values

1 Answers1