I have a dataframe made up of 5 attributes (e.g. Plot, weight, date etc.) plus 2000 spectral values (different wavelengths). Thus, there are around 2005 columns.

For these columns, there are 120 measurements (24 objects, each measured 5 times). I would like to average the reflectance values (the 2000 wavelength columns) based on an attribute (i.e. Plot).

I am thinking about using dplyr package and a pipeline:

DF %>%
   group_by(Plot) %>%
   aggregate(... *I am stuck here*

The end goal is a dataframe with 2005 columns and 24 rows, each row consisting of the original metadata plus the average value for each wavelength, based on the plot number.

Thanks

Gustavo TA
    Welcome to SO! Please provide a [minimal, reproducible example](https://stackoverflow.com/a/5963610/5892059), for example with `dput()`. You don't have to include all your columns, but the structure should be visible. – kath Sep 03 '18 at 14:14

1 Answer

For a similar task, I usually use the summarize_all function from dplyr. You first need to drop all columns that cannot be averaged (for example Date, if your replicates were measured on different dates), then group_by the remaining metadata (here Plot). Every other column is then averaged within each group. Something like:

DF %>%
  select(-Date) %>% # remove those metadata columns that cannot be averaged
  group_by(Plot) %>%
  summarize_all(mean)
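
If you would rather keep a metadata column than drop it, a variant with across() (available in dplyr 1.0.0 and later, where it supersedes summarize_all) might look like the sketch below. The toy data and the wl_ column prefix are assumptions made up for illustration, since the real column names were not posted, and taking the first Date per plot is only one possible choice.

```r
library(dplyr)

# Hypothetical toy data: 2 plots x 2 replicates, with 3 made-up
# wavelength columns standing in for the real 2000.
DF <- tibble(
  Plot   = c(1, 1, 2, 2),
  Date   = as.Date(c("2018-09-01", "2018-09-02", "2018-09-01", "2018-09-02")),
  wl_400 = c(0.10, 0.20, 0.30, 0.50),
  wl_401 = c(0.20, 0.40, 0.60, 0.80),
  wl_402 = c(1, 3, 5, 7)
)

plot_means <- DF %>%
  group_by(Plot) %>%
  summarise(
    Date = first(Date),                # assumption: keep the first date per plot
    across(starts_with("wl_"), mean),  # average every wavelength column
    .groups = "drop"                   # return an ungrouped data frame
  )

print(plot_means)
```

This gives one row per Plot with the metadata column kept next to the averaged wavelengths. If your spectral columns do not share a prefix, across(where(is.numeric), mean) should work as long as the remaining metadata columns are not numeric.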
Soeren D.