I have a data frame with 3 columns and 16 rows. Each cell holds a vector of three values; for example, row 1 contains (0.9, 0.9, 1.0), (0.7, 0.9, 1.0), (0.9, 0.9, 1.0). I want the element-wise mean across the columns, e.g. (0.9+0.7+0.9)/3, (0.9+0.9+0.9)/3, (1.0+1.0+1.0)/3, stored as a new column. Any suggestions?

      SHO1          SHO2             SHO3
1  0.7, 0.9, 1.0   0.9, 0.9, 1.0   0.7, 0.9, 1.0
2  0.7, 0.9, 1.0   0.9, 0.9, 1.0   0.7, 0.9, 1.0
3  0.0, 0.0, 0.1   0.9, 0.9, 1.0   0.0, 0.0, 0.1

Expected output for row 1:

(0.7+0.9+0.7)/3, (0.9+0.9+0.9)/3, (1.0+1.0+1.0)/3
David Arenburg
  • Can you show an example of your data in a proper format? See [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – David Arenburg Aug 05 '15 at 11:20
  • We will need a `dput` of your data. Also, I think you are missing a parenthesis in each of these statements such as `(0.7+0.9+0.7)/3` – David Arenburg Aug 05 '15 at 11:33
  • It's giving an error: "Error in strsplit(x, ", ") : non-character argument" – user5189602 Aug 05 '15 at 11:42
  • It is because you have a factor variable. Try `strsplit(as.character(x), ', ')` – akrun Aug 05 '15 at 11:45
  • My dear akrun, thanks for your help, and sorry if I offended you. I think I am unable to explain my question, but I really appreciate your help. Sorry for the inconvenience. – user5189602 Aug 05 '15 at 12:14
  • As I already said, please provide `dput` of your data and your exact desired output. – David Arenburg Aug 05 '15 at 12:16
  • It's okay. Just now I checked your previous questions. All of them were not so clear. In [here](http://stackoverflow.com/questions/31809839/reading-columns-and-assigning-values-within-r) I suggested to update your question with the example I created and the expected result. But, it seems that you were not interested in doing that. So, I am thinking that you were asking questions just for fun. – akrun Aug 05 '15 at 12:17
  • @DavidArenburg: here is dput of first four rows... structure(list(SHO1 = structure(list(VH = c(0.9, 0.9, 1), VH = c(0.9, 0.9, 1), M = c(0.3, 0.5, 0.7), H = c(0.7, 0.9, 1)), .Names = c("VH", "VH", "M", "H")), SHO2 = structure(list(H = c(0.7, 0.9, 1), H = c(0.7, 0.9, 1), H = c(0.7, 0.9, 1), VH = c(0.9, 0.9, 1)), .Names = c("H", "H", "H", "VH")), SHO3 = structure(list(VH = c(0.9, 0.9, 1), VH = c(0.9, 0.9, 1), M = c(0.3, 0.5, 0.7), VH = c(0.9, 0.9, 1)), .Names = c("VH", "VH", "M", "VH"))), .Names = c("SHO1", "SHO2", "SHO3"), row.names = c(NA, 4L), class = "data.frame") – user5189602 Aug 05 '15 at 13:14
  • The first row contains c(0.9, 0.9, 1.0) from SHO1 as the first element, c(0.7, 0.9, 1.0) from SHO2 as the second element and c(0.9, 0.9, 1.0) from SHO3 as the third element. My expected output is a new column with (0.9+0.7+0.9)/3, (0.9+0.9+0.9)/3, (1.0+1.0+1.0)/3, i.e. the mean of these vectors taken element by element. I hope I have explained the question well. – user5189602 Aug 05 '15 at 13:20
  • Based on the dput output, you have `list` as each column of the dataset. `m1 <- Reduce('+', lapply(df1, function(x) do.call(rbind, x)))/ncol(df1);df1$newCol <- do.call(paste, c(as.data.frame(m1), sep=", "))` or you can have a `list` as the new column i.e. `df1$newCol <- split(m1, row(m1))` – akrun Aug 05 '15 at 13:35
  • @akrun Thanks very much for the help. – user5189602 Aug 05 '15 at 22:43

1 Answer

Based on the dput output posted by the OP (in the comments), the columns of 'df1' are not strings. In fact, each column is a list whose elements are numeric vectors. So, instead of using strsplit (as I suggested earlier), we loop through the columns with lapply and rbind the list elements of each column (do.call(rbind, x)). The result is a list with one matrix per column.
We can then use Reduce to take the elementwise sum of these matrices (Reduce('+', ..)) and divide by the length of the list, i.e. 3 (ncol(df1)).
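
The two steps can be inspected one at a time. What follows is only a minimal sketch on a two-row subset (rows 1 and 3 of the data given in this answer); the names `mats` and `total` are introduced here purely for illustration.

```r
# Two-row stand-in for 'df1' (rows 1 and 3 of the data in this answer);
# every column is a list of numeric vectors of length 3.
df1 <- structure(list(
  SHO1 = list(c(0.9, 0.9, 1), c(0.3, 0.5, 0.7)),
  SHO2 = list(c(0.7, 0.9, 1), c(0.7, 0.9, 1)),
  SHO3 = list(c(0.9, 0.9, 1), c(0.3, 0.5, 0.7))
), row.names = c(NA, -2L), class = "data.frame")

# Step 1: turn each list column into a 2 x 3 numeric matrix
# (one row per data frame row, one column per vector position).
mats <- lapply(df1, function(x) do.call(rbind, x))

# Step 2: add the three matrices position by position ...
total <- Reduce("+", mats)

# ... and divide by the number of columns to get the elementwise mean.
m1 <- total / ncol(df1)
m1
```

Because every matrix has the same shape, `+` lines up the first, second and third values of each cell automatically, which is why no explicit loop over vector positions is needed.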

The resulting matrix ('m1') can be pasted together rowwise (do.call(paste, ..)) after converting it to a 'data.frame', and the result is assigned as a new column in the original dataset ('df1').

m1 <- Reduce('+', lapply(df1, function(x) do.call(rbind, x)))/ncol(df1)
df1$newCol <- do.call(paste, c(as.data.frame(m1), sep=", "))
df1
#           SHO1          SHO2          SHO3
#1 0.9, 0.9, 1.0 0.7, 0.9, 1.0 0.9, 0.9, 1.0
#2 0.9, 0.9, 1.0 0.7, 0.9, 1.0 0.9, 0.9, 1.0
#3 0.3, 0.5, 0.7 0.7, 0.9, 1.0 0.3, 0.5, 0.7
#4 0.7, 0.9, 1.0 0.9, 0.9, 1.0 0.9, 0.9, 1.0
#                                     newCol
#1                 0.833333333333333, 0.9, 1
#2                 0.833333333333333, 0.9, 1
#3 0.433333333333333, 0.633333333333333, 0.8
#4                 0.833333333333333, 0.9, 1
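
If the means are needed for further computation, a pasted string is awkward to work with. As mentioned in the comments, the rows of 'm1' can instead be kept as a list column via `split(m1, row(m1))`. A small sketch of that variant, again on a two-row stand-in for 'df1' (an assumption made only to keep the example self-contained):

```r
# Two-row stand-in for 'df1' with list columns of numeric vectors.
df1 <- structure(list(
  SHO1 = list(c(0.9, 0.9, 1), c(0.3, 0.5, 0.7)),
  SHO2 = list(c(0.7, 0.9, 1), c(0.7, 0.9, 1)),
  SHO3 = list(c(0.9, 0.9, 1), c(0.3, 0.5, 0.7))
), row.names = c(NA, -2L), class = "data.frame")

# Elementwise mean across the three columns, as in the answer.
m1 <- Reduce("+", lapply(df1, function(x) do.call(rbind, x))) / ncol(df1)

# Store each row of 'm1' as a numeric vector inside a list column.
df1$newCol <- split(m1, row(m1))

# The values stay numeric, so they can be rounded or reused directly.
round(df1$newCol[[1]], 3)
```

The new column then has the same structure as SHO1 to SHO3, so the same code could be applied to it again if needed.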

data

df1 <-  structure(list(SHO1 = structure(list(VH = c(0.9, 0.9, 1), 
VH = c(0.9, 
0.9, 1), M = c(0.3, 0.5, 0.7), H = c(0.7, 0.9, 1)), .Names = c("VH", 
"VH", "M", "H")), SHO2 = structure(list(H = c(0.7, 0.9, 1), H = c(0.7, 
0.9, 1), H = c(0.7, 0.9, 1), VH = c(0.9, 0.9, 1)), .Names = c("H", 
"H", "H", "VH")), SHO3 = structure(list(VH = c(0.9, 0.9, 1), 
VH = c(0.9, 0.9, 1), M = c(0.3, 0.5, 0.7), VH = c(0.9, 0.9, 
1)), .Names = c("VH", "VH", "M", "VH"))), .Names = c("SHO1", 
"SHO2", "SHO3"), row.names = c(NA, 4L), class = "data.frame")
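
For comparison, if the columns had held comma-separated strings or factors such as "0.7, 0.9, 1.0" rather than lists, the strsplit route mentioned earlier would apply. A hedged sketch, assuming every cell contains exactly three numbers separated by commas; `df2`, `m2` and `to_matrix` are hypothetical names, not part of the OP's data:

```r
# Hypothetical data frame whose cells are factors of comma-separated numbers.
df2 <- data.frame(
  SHO1 = c("0.9, 0.9, 1.0", "0.3, 0.5, 0.7"),
  SHO2 = c("0.7, 0.9, 1.0", "0.7, 0.9, 1.0"),
  SHO3 = c("0.9, 0.9, 1.0", "0.3, 0.5, 0.7"),
  stringsAsFactors = TRUE
)

# Split each cell on commas and convert the pieces to numbers;
# as.character() is needed because strsplit() rejects factors.
to_matrix <- function(x) {
  parts <- strsplit(as.character(x), ",\\s*")
  do.call(rbind, lapply(parts, as.numeric))
}

# Same Reduce step as in the answer, applied to the parsed matrices.
m2 <- Reduce("+", lapply(df2, to_matrix)) / ncol(df2)
df2$newCol <- do.call(paste, c(as.data.frame(m2), sep = ", "))
df2
```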
akrun
  • Thanks for the help, but my output looks like this:
          SHO1            SHO2            SHO3            TFN
      1   0.7, 0.9, 1.0   0.9, 0.9, 1.0   0.7, 0.9, 1.0   NA, 0.9, NA
      2   0.7, 0.9, 1.0   0.9, 0.9, 1.0   0.7, 0.9, 1.0   NA, 0.9, NA
      3   0.0, 0.0, 0.1   0.9, 0.9, 1.0   0.0, 0.0, 0.1   NA, 0.3, NA
      4   0.9, 0.9, 1.0   0.7, 0.9, 1.0   0.7, 0.9, 1.0   NA, 0.9, NA
      5   0.9, 0.9, 1.0   0.7, 0.9, 1.0   0.7, 0.9, 1.0   NA, 0.9, NA
    For the new column, elements 1 and 3 come out as NA, while element 2 is right. I am trying to figure this out. – user5189602 Aug 05 '15 at 11:58