I have a huge data set that contains 30000 columns of data. I want to take one row and plot the means of consecutive sets of 100 entries (the first 100 entries, the second 100 entries, and so on), giving 300 means in total. I have the script for the plot ready, but I can't figure out how to divide my data into sets of 100.

Can anybody help? Thank you.

qwerty
  • Share code that you have tried. – qwerty Nov 27 '17 at 11:56
  • The function I want to apply is CV <- function(x, ...){(sd(x, ...)/mean(x, ...))*100} and I've tried something like byapply(DataSet$column., rep(1:30000, each = 100), rowMeans) but this totally did not work – Anya Drake Nov 28 '17 at 10:39

1 Answer

It may be easier to melt the data, add a column identifier (1:300, repeated 100 times each), and then summarize by that column.

So something like:

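library(dplyr)
library(tidyr)    # gather() comes from tidyr, not dplyr
library(ggplot2)  # needed for ggplot()

# Assumes df holds only the single row of interest (1 row x 30000 columns)
df <- df %>%
   gather(Key, Value) %>%                     # long format: one row per original column
   mutate(ID = rep(1:300, each = 100)) %>%    # block number for each run of 100 entries
   group_by(ID) %>%                           # group by block only; adding Key would give groups of size 1
   summarize(Mean = mean(Value))

ggplot(df) + 
   geom_point(aes(x = ID, y = Mean))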

You'll have to customize the code, since I don't have the data structure...
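
If you would rather stay in base R, here is a minimal sketch of the same idea. It reshapes the row into a matrix with one column per block of 100 and also applies the CV function from your comment. `row_values` is made-up stand-in data, and the names `blocks`, `block_means` and `block_cv` are just illustrative. With the real data you would presumably build `row_values` with something like `as.numeric(unlist(DataSet[1, ]))`, but that is a guess since I haven't seen the data.

```r
# CV function as posted in the question's comment
CV <- function(x, ...) {
  (sd(x, ...) / mean(x, ...)) * 100
}

# Hypothetical stand-in for one row of the real data (30000 values)
set.seed(1)
row_values <- rnorm(30000, mean = 10, sd = 2)

# matrix() fills column by column, so column j holds entries
# (100 * (j - 1) + 1) through (100 * j) of the row
blocks <- matrix(row_values, nrow = 100)

block_means <- colMeans(blocks)       # 300 block means
block_cv    <- apply(blocks, 2, CV)   # 300 block CVs, one per column
```

`block_means` (or `block_cv`) is then a plain numeric vector of length 300 that can go straight into the plotting script you already have.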

user2602640