I have a column that contains values like these:
'group_activity', 'dairy_revenue_freq', 'dairy_revenue',
'livestock_revenue_freq', 'livestock_revenue',
'poultry_revenue_freq', 'poultry_revenue',
'fruitsandveg_revenue_freq', 'fruitsandveg_revenue',
'cereals_revenue_freq', 'cereals_revenue',
'cashcrops_revenue_freq', 'cashcrops_revenue',
'aquaculture_revenue_freq', 'aquaculture_revenue',
'roots_revenue_freq', 'roots_revenue', 'other_farming',
'other_agric_revenue_freq', 'other_agric_revenue',
'dairy_trading_revenue_freq', 'dairy_trading_revenue',
'livestock_trading_revenue_freq', 'livestock_trading_revenue',
'poultry_trading_revenue_freq', 'poultry_trading_revenue',
'horticulture_trading_revenue_freq',
'horticulture_trading_revenue', 'cereals_trading_revenue_freq',
'cereals_trading_revenue', 'cashcrops_trading_revenue_freq',
'cashcrops_trading_revenue', 'fish_trading_revenue_freq',
'fish_trading_revenue', 'other_trading_business',
'other_trading_revenue_freq', 'other_trading_revenue',
'retailing_footwear_revenue_freq', 'retailing_footwear_revenue',
'retailing_clothes_revenue_freq', 'retailing_clothes_revenue',
'retail_kiosk_revenue_freq', 'retail_kiosk_revenue',
'agrovet_revenue_freq', 'agrovet_revenue',
'handicraft_revenue_freq', 'handicraft_revenue',
'furniture_revenue_freq', 'furniture_revenue',
'other_retailing_business', 'other_retailing_revenue_freq'...
I want to split these strings on _ and keep only the last word, i.e. activity, freq, revenue.
Here is what I'm doing:
df1['a'] = df1['c'].str.split('_', 2, expand=True)[1]
But it does not give the right result for every value, because the number of times _ appears varies from one value to the next.
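One direction that might work is to split from the right rather than the left, so the number of underscores no longer matters. The sketch below shows the idea in plain Python on a few values copied from the list above. The helper name last_token and the sample list are hypothetical and only serve as illustration.

```python
def last_token(name: str, sep: str = "_") -> str:
    """Return the text after the last separator (the whole string if there is none)."""
    # rsplit with maxsplit=1 cuts only at the right-most separator,
    # so it does not matter how many separators come before it.
    return name.rsplit(sep, 1)[-1]


# A few sample values taken from the column shown in the question.
sample = [
    "group_activity",
    "dairy_revenue_freq",
    "dairy_trading_revenue",
    "other_farming",
]

suffixes = [last_token(value) for value in sample]
print(suffixes)  # ['activity', 'freq', 'revenue', 'farming']
```

On the DataFrame itself, the vectorised counterpart would presumably be df1['a'] = df1['c'].str.rsplit('_', n=1).str[-1], assuming column c holds strings. Note that values such as other_farming end in something other than activity, freq, or revenue, so they may need separate handling.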