I have a dataframe combined_data that looks like this (this is just an example):
Year state_name VoS_thousUSD industry
2008 Alabama 100 Shipping
2009 Alabama 100 Shipping
2008 Alabama 200 Shipping
2010 Alabama 100 Shipping
2010 Alabama 50 Shipping
2010 Alabama 100 Shipping
2008 Alabama 100 Shipping
There are multiple Year, state_name, and industry values, each row with an associated VoS_thousUSD value, as well as other columns I no longer need.
I am trying to produce this:
Year state_name VoS_thousUSD industry
2008 Alabama 400 Shipping
2009 Alabama 100 Shipping
2010 Alabama 250 Shipping
Here the dataframe is grouped by Year, state_name, and industry, and VoS_thousUSD is the sum within each of those groups.
So far I have:
combined_data %>%
  group_by(Year, state_name, GCAM_industry) %>%
  summarise() -> VoS_thousUSD_state_ind
But I am not sure how or where to add the sum for VoS_thousUSD. I would like to use a dplyr pipeline.
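For reference, here is a minimal sketch of the kind of pipeline I am after. It assumes the sum belongs inside summarise() as a named argument, that the grouping column is called industry as in the example (my attempt uses GCAM_industry, so the real column name would need to be substituted), and that dplyr 1.0.0 or later is installed so the .groups argument is available. The data frame is rebuilt from the example rows only so the snippet runs on its own.

```r
library(dplyr)

# Example rows copied from the table above; the real combined_data
# has additional columns that are simply dropped by summarise().
combined_data <- data.frame(
  Year         = c(2008, 2009, 2008, 2010, 2010, 2010, 2008),
  state_name   = "Alabama",
  VoS_thousUSD = c(100, 100, 200, 100, 50, 100, 100),
  industry     = "Shipping"
)

VoS_thousUSD_state_ind <- combined_data %>%
  # One group per Year / state / industry combination
  group_by(Year, state_name, industry) %>%
  # Collapse each group to a single row holding the summed value;
  # .groups = "drop" returns an ungrouped result
  summarise(VoS_thousUSD = sum(VoS_thousUSD), .groups = "drop") %>%
  # Restore the column order used in the desired output
  select(Year, state_name, VoS_thousUSD, industry)
```

If the real VoS_thousUSD column contains missing values, sum(VoS_thousUSD, na.rm = TRUE) would ignore them instead of returning NA for that group.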