0

Suppose i have a column with values(not column name) L1 xyy, L2 yyy, L3 abc, now i want to group L1, L2 and L3 as L(or any other name also would do). Similarly i have other values like A1 xxx, A2 xxx, to be grouped form A and so on for other alphabets. How do i achieve this in pandas? I have L1, A1 and so on all in same column, and not different columns.

anky
  • 74,114
  • 11
  • 41
  • 70
Ganesh Jadhav
  • 712
  • 6
  • 22

1 Answers1

2

Use indexing by str[0] for return first letter of column and then aggregate some function, e.g. sum:

df = pd.DataFrame({'col':['L1 xyy','L2 yyy','L3 abc','A1 xxx','A2 xxx'],
                   'val':[2,3,5,1,2]})
print (df)
      col  val
0  L1 xyy    2
1  L2 yyy    3
2  L3 abc    5
3  A1 xxx    1
4  A2 xxx    2

df1 = df.groupby(df['col'].str[0])['val'].sum().reset_index(name='new')
print (df1)
  col  new
0   A    3
1   L   10

If need new column by first value:

df['new'] = df['col'].str[0]
print (df)
      col  val new
0  L1 xyy    2   L
1  L2 yyy    3   L
2  L3 abc    5   L
3  A1 xxx    1   A
4  A2 xxx    2   A
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252