
Within my dataset there are many rows, each a code marker, and many columns. I believe we're only interested in columns 2-96, as column 1 holds the code marker names. Columns 2-50 are the c3 repetitions and 50-96 are c4. For each code marker I am meant to produce a mean and sd from its c3 and c4 repetitions, so the output is a mean and sd of c3 and c4 for every code marker (row name). The column names are not just c3 for all the c3 repetitions; they go c3_1, c3_1.1, etc. I would like one function that computes both sd and mean for both c3 and c4. I was thinking that probably means selecting by column number would work better (as mentioned earlier).

I managed to write this code. It does what I need, but across columns rather than rows. Is this a simple fix?

df1[,lapply(.SD, function(x) return(c(mean(x, na.rm = TRUE), sd(x, na.rm = TRUE)))), .SDcols = colnames(df1)[2:6]]

ash7on
  • Can you try to follow these guidelines to help us answer your question: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example - cheers – user438383 Apr 19 '20 at 19:37
  • Can you show the 'df1'? Also, what are those x1.0 and x1.1? Based on your description, I guess you want to split into two groups, 'c3' and 'c4' – akrun Apr 19 '20 at 20:12
  • Yeah, I want to split into c3 and c4, but I was demonstrating that all the column names are different, and `df1 <- data.table(df)` – ash7on Apr 19 '20 at 20:18
  • How did you read the dataset? Perhaps you have used `header = FALSE` in `read.csv/read.table`? Because the column names are the first row now, and it created `factor` columns – akrun Apr 19 '20 at 20:54
  • You can check my updated code – akrun Apr 19 '20 at 21:00
  • The `str` output you added suggests it may be a `data.table` with column names as numbers? – akrun Apr 19 '20 at 22:04
  • Did you convert to `character` class while setting the column names with `as.character(unlist(df1[1, -1]))`? – akrun Apr 19 '20 at 22:05
  • Please try from your original dataset without the conversion. I think you made a lot of changes by converting to data.table etc. Please start on a fresh R session. Load the data as data.frame (no data.table conversion) and apply the code. Also, when you use `read.csv`, use `stringsAsFactors = FALSE` and `skip = 1` – akrun Apr 19 '20 at 22:13

1 Answer


We could do

library(dplyr)
iris %>%
 summarise_at(vars(Sepal.Length), list(mean = ~mean(., na.rm = TRUE),
     sd = ~sd(., na.rm = TRUE)))
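
The `iris` example above summarises down a column, whereas the question asks for a mean and sd per row across two blocks of columns selected by position. A minimal base R sketch of that row-wise variant follows. The data frame `df`, the helper `row_stats`, and the positions `2:3` and `4:5` are hypothetical stand-ins for illustration; the real c3 and c4 column ranges would go in their place.

```r
# Hypothetical toy data: 3 markers, 2 c3 repetitions and 2 c4 repetitions.
df <- data.frame(
  marker = c("m1", "m2", "m3"),
  c3_1   = c(1, 2, NA),
  c3_1.1 = c(3, 4, 5),
  c4_1   = c(10, 20, 30),
  c4_1.1 = c(14, 20, 34)
)

# Hypothetical helper: row-wise mean and sd over a set of column positions.
# `prefix` is only used to label the two output columns.
row_stats <- function(data, cols, prefix) {
  m <- as.matrix(data[, cols, drop = FALSE])
  out <- data.frame(
    mean = rowMeans(m, na.rm = TRUE),
    sd   = apply(m, 1, sd, na.rm = TRUE)
  )
  names(out) <- paste(prefix, names(out), sep = "_")
  out
}

# One row per marker, with mean and sd for each block of columns.
result <- data.frame(
  marker = df$marker,
  row_stats(df, 2:3, "c3"),
  row_stats(df, 4:5, "c4")
)
print(result)
```

Selecting by position avoids depending on the irregular column names (c3_1, c3_1.1, ...). Note that `sd` returns `NA` for a row with only one non-missing value in a block.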
akrun
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/212050/discussion-on-answer-by-akrun-can-someone-help-me-calculate-mean-and-sd-for-all). – Samuel Liew Apr 20 '20 at 05:09