R: Dataframe Manipulation (group_by/aggregate)

Question

I have the following data frame as shown below

Code	Type	Fee
AA	JSON	10
AA	R	20
AA	R	10
DD	PYTHON	30
DD	JSON	15
ZZ	R	50
ZZ	PYTHON	60
ZZ	PYTHON	10

code = c(rep("AA",3), rep("DD",2), rep("ZZ",3))
Type = c("JSON", "R", "R", "PYTHON", "JSON", "R", "PYTHON", "PYTHON")
Fee = c(10,20,10,30,15,50,60,10)
Price = as.data.frame(cbind(code, Type, Fee)); Price$Fee = as.integer(as.character(Price$Fee))

I would like to change the data frame to the one shown below with the Fee summed up accordingly

	JSON	R	PYTHON
AA	10	30	0
DD	15	0	30
ZZ	0	50	70

Thanks in advance!

If you use `Price = data.frame(code, Type, Fee)` to construct the dataframe, you would not need `Price$Fee = as.integer(as.character(Price$Fee))` — Ronak Shah, Jul 25 '21 at 06:18
FYI there are mechanisms in place for when questions have already been asked other than deleting the content of the question, which is why I've rolled back your edit — camille, Jul 26 '21 at 20:13

score 1 · Accepted Answer · answered Jul 25 '21 at 06:17

We turn the data.frame from long to wide, the id_cols is the code (since it is in the rows), the names of columns are taken from the Type column, and finally values are taken from Fee.

Since there are duplicates we need a method to sum values (for example ZZ python appears twice), we use the sum function.

Finally, we remove NAs and replace them with 0.

 library(tidyvesre) 

> new_dat <- Price %>% pivot_wider(id_cols = code, names_from = Type, 
                                   values_from = Fee, values_fn = sum, values_fill = 0)

# A tibble: 3 x 4
  code   JSON     R PYTHON
  <chr> <int> <int>  <int>
1 AA       10    30      0
2 DD       15     0     30
3 ZZ        0    50     70

score -1 · Answer 2 · answered Jul 25 '21 at 06:12

You can use pivot_wider from tidyr -

tidyr::pivot_wider(Price, names_from = Type, values_from = Fee, 
                   values_fn = sum, values_fill = 0)

#  code   JSON     R PYTHON
#  <chr> <int> <int>  <int>
#1 AA       10    30      0
#2 DD       15     0     30
#3 ZZ        0    50     70

And with data.table dcast -

library(data.table)

dcast(setDT(Price), Type ~ code, value.var = 'Fee', fun.aggregate = sum)

R: Dataframe Manipulation (group_by/aggregate)

2 Answers2