How to groupby in pandas where i have column values starting with similar letters

Question

Suppose i have a column with values(not column name) L1 xyy, L2 yyy, L3 abc, now i want to group L1, L2 and L3 as L(or any other name also would do). Similarly i have other values like A1 xxx, A2 xxx, to be grouped form A and so on for other alphabets. How do i achieve this in pandas? I have L1, A1 and so on all in same column, and not different columns.

can you make a sample dataframe and an expected output? also is this in one column or all df columns — anky, Mar 14 '19 at 11:45
cool. for future reference future, please take a look at [this](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) — anky, Mar 14 '19 at 16:38

jezrael · Accepted Answer · 2019-03-14T11:52:39.333

Use indexing by str[0] for return first letter of column and then aggregate some function, e.g. sum:

df = pd.DataFrame({'col':['L1 xyy','L2 yyy','L3 abc','A1 xxx','A2 xxx'],
                   'val':[2,3,5,1,2]})
print (df)
      col  val
0  L1 xyy    2
1  L2 yyy    3
2  L3 abc    5
3  A1 xxx    1
4  A2 xxx    2

df1 = df.groupby(df['col'].str[0])['val'].sum().reset_index(name='new')
print (df1)
  col  new
0   A    3
1   L   10

If need new column by first value:

df['new'] = df['col'].str[0]
print (df)
      col  val new
0  L1 xyy    2   L
1  L2 yyy    3   L
2  L3 abc    5   L
3  A1 xxx    1   A
4  A2 xxx    2   A

How to groupby in pandas where i have column values starting with similar letters

1 Answers1