I have dataframe tech_change_supply
that looks like this:
supplysector tech_change_all tech_change_elec tech_change_no_elec
aluminum NA .26 .77
cement NA .86 .92
chemicals NA 1.32 NA
food processing NA .39 .70
iron and steel NA .02 .58
other manufacturing NA 1.11 1.04
other nonmetallic NA .41 .75
pulp paper wood NA .96 NA
and dataframe tech_change_service
that looks like this:
subsector tech_change_all tech_change_elec tech_change_no_elec
boilers 0 0.0 1.0
boilers_CHP 0 0.0 1.0
electrochemical 0 1.0 0.0
feeedstocks 0 0.0 0.0
machine drive 0 1.0 0.0
process heat 0 0.0 1.0
other uses 0 0.5 0.5
I need to combine these two dataframes in a way so that there is a supplysector column from tech_change_supply
and a subsector column from tech_change_service
, where each supplysector has every subsector (i.e. "aluminum" is repeated 7 times with each subsector matched to it once). Additionally, I need to maintain the tech_change_all, tech_change_elec, and tech_change_no_elec, and multiply these using the values from each supplysector, with the exception being "other uses" where I need the average of tech_change_elec and tech_change_no_elec. For example:
supplysector subsector tech_change_all tech_change_elec tech_change_no_elec
aluminum boilers 0 0 .77
aluminum boilers_CHP 0 0 .77
aluminum electrochemical 0 .26 0
aluminum feedstocks 0 0 0
aluminum machine drive 0 .26 0
aluminum process heat 0 0 .77
aluminum other uses 0 .515 .515
this also needs to be done using base R or dplyr 0.8.1, tidyr 0.8.3. Thank you!
Data:
> dput(tech_change_supply)
structure(list(supplysector = structure(c(2L, 3L, 4L, 6L, 7L,
9L, 10L, 11L), .Label = c("agriculture", "aluminum and nonferrous metals",
"cement energy processes", "chemicals", "construction", "food processing",
"iron and steel", "mining", "other manufacturing", "other nonmetallic minerals",
"pulp paper and wood"), class = "factor"), tech_change_all = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_
), tech_change_elec = c(0.265280686, 0.856866327, 1.316191449,
0.390516499, 0.02304637, 1.114989892, 0.410722674, 0.956948382
), tech_change_no_elec = c(0.768341707, 0.915245624, NA, 0.6994352,
0.575833299, 1.040612635, 0.752058291, NA)), row.names = c(NA,
-8L), class = "data.frame")
> dput(tech_change_service)
structure(list(subsector = structure(c(2L, 3L, 5L, 6L, 7L, 10L,
9L), .Label = c("agriculture energy use", "boilers", "boilers_CHP",
"construction energy use", "electrochemical", "feedstocks", "machine drive",
"mining energy use", "other uses", "process heat"), class = "factor"),
tech_change_all = c(0L, 0L, 0L, 0L, 0L, 0L, 0L), tech_change_elec = c(0,
0, 1, 0, 1, 0, 0.5), tech_change_no_elec = c(1, 1, 0, 0,
0, 1, 0.5)), row.names = c(NA, -7L), class = "data.frame")