I tried to replicate the approach from "Means multiple columns by multiple groups" to find the means for different groups in my dataset, using the following code:

newtest %>%
  group_by(aligntool, paired) %>%
  summarise(vars("read_per_length"), mean)

However, I get the following error message:

In format.data.frame(x, digits = digits, na.encode = FALSE) :
  corrupt data frame: columns will be truncated or padded with NAs

I tested to see if this was a problem with zero values, so I removed those and got the same problem. I also made the dataset smaller to see if this was a memory issue. For reference, my dataframe looks like this:

str(newtest)
'data.frame':   100 obs. of  4 variables:
 $ Run_Sample     : Factor w/ 6 levels "Run_1768_Sample_77304",..: 5 6 3 3 4 6 2 1 6 6 ...
 $ paired         : Factor w/ 2 levels "N","Y": 2 2 1 1 1 1 1 2 2 1 ...
 $ aligntool      : Factor w/ 2 levels "bbmap","kallisto": 2 1 1 2 1 1 2 2 1 1 ...
 $ read_per_length: num  2.60e-10 1.87e-09 3.28e-09 7.63e-10 1.38e-09 ...

Is there a problem in how my dataframe is formatted somehow? How do I resolve this issue?

nghauran
Emily
  • Use `summarize_at()` instead of `summarize()`. – OTStats Dec 18 '18 at 16:07
  • @OTStats - I suspect that others may make this same mistake and could benefit from it being answered here. Perhaps consider making your comment into an answer? – dww Dec 18 '18 at 16:19
  • @dww I'm not sure it works, hoping will try it. – OTStats Dec 18 '18 at 16:44
  • My other problem was how I installed packages. I had installed dplyr and tidyr individually instead of running `install.packages("tidyverse")`. Installing the tidyverse solved some dependency issues. – Emily Jan 23 '19 at 20:09

1 Answer

This should work:

newtest %>%
  group_by(aligntool, paired) %>%
  summarise_at(vars("read_per_length"), mean)
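
As far as I can tell, the likely cause is that plain `summarise()` expects expressions such as `new_name = mean(column)`. Passing `vars(...)` and `mean` to it does not apply the function to the column, and the result cannot be printed as a normal data frame. Here is a minimal sketch for checking this on a small example. The `toy` data frame and its values are made up for illustration; only the column names come from the question.

```r
library(dplyr)

# Hypothetical toy data reusing the column names from the question
set.seed(1)
toy <- data.frame(
  paired          = factor(rep(c("N", "Y"), each = 4)),
  aligntool       = factor(rep(c("bbmap", "kallisto"), times = 4)),
  read_per_length = runif(8) * 1e-9
)

# Plain summarise(): name the output column and call mean() on the column
by_name <- toy %>%
  group_by(aligntool, paired) %>%
  summarise(read_per_length = mean(read_per_length)) %>%
  ungroup()

# Scoped variant: summarise_at() takes the vars() selection plus a function
by_at <- toy %>%
  group_by(aligntool, paired) %>%
  summarise_at(vars("read_per_length"), mean) %>%
  ungroup()

# Both approaches should give the same group means
all.equal(by_name$read_per_length, by_at$read_per_length)
```

The named form is probably the simpler choice for a single column; `summarise_at()` is more useful when the same function has to be applied to several columns at once.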
dmca