I have a very wide dataframe: >80 columns.
I would like to group by the first few columns on the left and apply paste0
to all the other columns:
prov_solicitud  expediente         Puntos  AR16_09  BA16_09  BA11_08  BA17_09  BA22_08
Vigo            BS607A 2014/1-5    65      <NA>     <NA>     <NA>     <NA>     <NA>
A Coruña        BS607A 2014/10-1   42      <NA>     1        <NA>     <NA>     <NA>
Lugo            BS607A 2014/10-2   10      <NA>     <NA>     -        <NA>     O
Lugo            BS607A 2014/10-2   10      <NA>     2        <NA>     <NA>     <NA>
Vigo            BS607A 2014/10-5   34      <NA>     E        <NA>     <NA>     <NA>
Lugo            BS607A 2014/100-2  29      <NA>     <NA>     <NA>     <NA>     <NA>
dim(tbl)
> [1] 491 81
With fewer columns, I would do it with dplyr:
(in this example there are only 5 data columns to paste)
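library(dplyr)
# %>% replaces the older %.% operator; collapse = "" gives one string per group
tbl %>% group_by(prov_solicitud, expediente, Puntos) %>%
  summarise(AR16_09 = paste0(AR16_09, collapse = ""),
            BA16_09 = paste0(BA16_09, collapse = ""),
            BA11_08 = paste0(BA11_08, collapse = ""),
            BA17_09 = paste0(BA17_09, collapse = ""),
            BA22_08 = paste0(BA22_08, collapse = ""))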
How could I do it without having to type all the column names?
Maybe using by or aggregate, and a formula like
prov_solicitud + expediente + Puntos ~ .
Would it be useful to use as.formula? Is there a simpler way?
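For the base-R route, one possible sketch skips the formula entirely: aggregate() also accepts a data frame of value columns plus a list of grouping columns, so the column names can be computed with setdiff() instead of typed. The small tbl below is a made-up stand-in for the real table, and id_cols and data_cols are hypothetical helper names. Missing values are dropped inside the pasting function here, which is an assumption about the desired result.

```r
# Made-up stand-in for the real 491 x 81 table (illustrative values only)
tbl <- data.frame(
  prov_solicitud = c("Lugo", "Lugo", "Vigo"),
  expediente     = c("BS607A 2014/10-2", "BS607A 2014/10-2", "BS607A 2014/1-5"),
  Puntos         = c(10, 10, 65),
  BA16_09        = c(NA, "2", NA),
  BA11_08        = c("-", NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper names: grouping columns and "everything else"
id_cols   <- c("prov_solicitud", "expediente", "Puntos")
data_cols <- setdiff(names(tbl), id_cols)

# Paste the non-missing values of each data column within each group
res <- aggregate(tbl[data_cols], by = tbl[id_cols],
                 FUN = function(x) paste0(x[!is.na(x)], collapse = ""))
print(res)
```

The result has one row per combination of the grouping columns and keeps the original column names.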
It would probably be necessary to convert all NA to "" in the data columns.
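A minimal sketch of that NA conversion, assuming the data columns are character or factor columns (the <NA> in the printout suggests so, but that is a guess). The tbl here is a made-up two-row example and id_cols / data_cols are hypothetical helper names. Factors are converted to character first, because "" cannot be assigned to a factor unless it is already one of its levels.

```r
# Made-up example; one data column is a factor, the other a character vector
tbl <- data.frame(
  prov_solicitud = c("Lugo", "Vigo"),
  expediente     = c("BS607A 2014/10-2", "BS607A 2014/10-5"),
  Puntos         = c(10, 34),
  BA16_09        = factor(c(NA, "E")),
  BA11_08        = c("-", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper names: grouping columns and "everything else"
id_cols   <- c("prov_solicitud", "expediente", "Puntos")
data_cols <- setdiff(names(tbl), id_cols)

# Replace NA with "" in every data column, leaving the id columns untouched
tbl[data_cols] <- lapply(tbl[data_cols], function(x) {
  x <- as.character(x)  # make sure "" is an allowed value
  x[is.na(x)] <- ""
  x
})
print(tbl)
```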
And I would like to maintain the same column names.
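On the dplyr side, a possible sketch that avoids listing the columns, assuming a recent dplyr (>= 1.0.0, where across() exists; this is not the version the %.% operator came from). Inside a grouped summarise(), across(everything(), ...) applies the function to every non-grouping column and keeps their names. The tbl below is made up for illustration and assumes the NA values were already replaced by "".

```r
library(dplyr)  # assumes dplyr >= 1.0.0 for across() and .groups

# Made-up stand-in for the real table, with NA already replaced by ""
tbl <- data.frame(
  prov_solicitud = c("Lugo", "Lugo", "Vigo"),
  expediente     = c("BS607A 2014/10-2", "BS607A 2014/10-2", "BS607A 2014/1-5"),
  Puntos         = c(10, 10, 65),
  BA16_09        = c("", "2", ""),
  BA11_08        = c("-", "", ""),
  stringsAsFactors = FALSE
)

# Collapse every non-grouping column to one string per group
res <- tbl %>%
  group_by(prov_solicitud, expediente, Puntos) %>%
  summarise(across(everything(), ~ paste0(.x, collapse = "")),
            .groups = "drop")
print(res)
```

Only the three grouping columns are named explicitly, so the same call should work unchanged for 5 or 78 data columns.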