Suppose i have a column with values(not column name) L1 xyy, L2 yyy, L3 abc,
now i want to group L1, L2 and L3
as L(or any other name also would do).
Similarly i have other values like A1 xxx, A2 xxx, to be grouped form A and so on for other alphabets.
How do i achieve this in pandas?
I have L1, A1 and so on all in same column, and not different columns.
Asked
Active
Viewed 728 times
0

anky
- 74,114
- 11
- 41
- 70

Ganesh Jadhav
- 712
- 6
- 22
-
can you make a sample dataframe and an expected output? also is this in one column or all df columns – anky Mar 14 '19 at 11:45
-
Just one column. – Ganesh Jadhav Mar 14 '19 at 16:32
-
check the below answer by jez. – anky Mar 14 '19 at 16:34
-
1Yeah, that's what I wanted. – Ganesh Jadhav Mar 14 '19 at 16:37
-
cool. for future reference future, please take a look at [this](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – anky Mar 14 '19 at 16:38
1 Answers
2
Use indexing by str[0]
for return first letter of column and then aggregate some function, e.g. sum
:
df = pd.DataFrame({'col':['L1 xyy','L2 yyy','L3 abc','A1 xxx','A2 xxx'],
'val':[2,3,5,1,2]})
print (df)
col val
0 L1 xyy 2
1 L2 yyy 3
2 L3 abc 5
3 A1 xxx 1
4 A2 xxx 2
df1 = df.groupby(df['col'].str[0])['val'].sum().reset_index(name='new')
print (df1)
col new
0 A 3
1 L 10
If need new column by first value:
df['new'] = df['col'].str[0]
print (df)
col val new
0 L1 xyy 2 L
1 L2 yyy 3 L
2 L3 abc 5 L
3 A1 xxx 1 A
4 A2 xxx 2 A

jezrael
- 822,522
- 95
- 1,334
- 1,252