
Please excuse my imprecise terminology. For example, I split a data frame into the subsets I wanted, but according to RStudio my result is a list. I am confused by the terms, so I am having trouble searching SO for answers.

My question is: how do you apply a function that removes outliers to each subset in a list? My data, which ends up in a data frame (see2):

Id <- c(5,34,55,84,105,134,155,5,34,55,84,5,34,55,84,105,134,155,184,5,34,55,84,105,134,155,184)
A <- c(230185,1472449,870581,269359,527566,937805,1361685,209868,282024,244880,228502,129072,143122,89994,106535,108124,97962,75841,97366,96382,64324,66834,60787,79516,92829,120894,62763)

I used this code to split the data into subsets, starting a new subset at each row where Id is 5.

df <- data.frame(Id, A)
see2 <- df[c(1, seq(3, nrow(df), 3)),]
see2[,1] == "5"
result <- split(see2, cumsum(see2[,1]=="5"))
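
For reference, `split()` applied to a data frame returns a named list of data frames, which is why RStudio reports `result` as a list. Below is a minimal sketch for inspecting that list, reusing the code above unchanged. Note that `see2` here contains only rows 1, 3, 6, 9, ... of `df`, so the subsets may be smaller than expected.

```r
Id <- c(5,34,55,84,105,134,155,5,34,55,84,5,34,55,84,105,134,155,184,
        5,34,55,84,105,134,155,184)
A <- c(230185,1472449,870581,269359,527566,937805,1361685,209868,282024,
       244880,228502,129072,143122,89994,106535,108124,97962,75841,97366,
       96382,64324,66834,60787,79516,92829,120894,62763)

df <- data.frame(Id, A)
see2 <- df[c(1, seq(3, nrow(df), 3)), ]   # rows 1, 3, 6, 9, ..., 27 of df

# cumsum() increases by one at every row where Id is 5,
# so it yields one group label per row of see2.
result <- split(see2, cumsum(see2[, 1] == "5"))

class(result)          # "list"
names(result)          # "1" "2"
sapply(result, nrow)   # number of rows in each subset
result[["1"]]          # the first subset, itself a data frame
```

Each element can be reached with `result[["1"]]` or `result$'1'`, and functions such as `lapply()` operate on the elements one at a time.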

Using `result$'1'` as an example, I would like to test each subset `result$'#'` for outliers. How do I do that? Thank you very much for your help.
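
One possible approach, sketched under assumptions: apply a function to every element of the list with `lapply()`. The helper `flag_outliers` and its `k` parameter are hypothetical names, and the 1.5 x IQR rule is only one common convention, not necessarily the right criterion for this data. The sketch flags rows rather than deleting them, so the decision to drop anything stays explicit.

```r
Id <- c(5,34,55,84,105,134,155,5,34,55,84,5,34,55,84,105,134,155,184,
        5,34,55,84,105,134,155,184)
A <- c(230185,1472449,870581,269359,527566,937805,1361685,209868,282024,
       244880,228502,129072,143122,89994,106535,108124,97962,75841,97366,
       96382,64324,66834,60787,79516,92829,120894,62763)
df <- data.frame(Id, A)
see2 <- df[c(1, seq(3, nrow(df), 3)), ]
result <- split(see2, cumsum(see2[, 1] == "5"))

# Hypothetical helper: adds a logical column `outlier` to one subset,
# TRUE where A lies more than k * IQR outside the quartiles.
flag_outliers <- function(d, k = 1.5) {
  q <- unname(quantile(d$A, c(0.25, 0.75), na.rm = TRUE))
  fence <- k * (q[2] - q[1])
  d$outlier <- d$A < q[1] - fence | d$A > q[2] + fence
  d
}

# lapply() calls flag_outliers() once per subset and returns a list again.
flagged <- lapply(result, flag_outliers)

# If dropping the flagged rows is really what is wanted:
kept <- lapply(flagged, function(d) d[!d$outlier, c("Id", "A")])
```

`flagged[["1"]]` then shows the first subset with its extra `outlier` column, and `do.call(rbind, kept)` would stack the filtered subsets back into a single data frame.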

  • I don't know what you're trying to do here exactly, but I'm pretty confident that the code you've posted is not doing it... What are you expecting `see2[,1] == "5"` to achieve? – alexwhan Sep 13 '13 at 03:05
  • I think it may be related to this previous question: http://stackoverflow.com/questions/18752049/subsetting-matrix-with-id-from-another-matrix This question is missing the splitting code though - maybe something like: `split(see2,cumsum(see2[,1]==5))` ? – thelatemail Sep 13 '13 at 03:36
  • Most experienced users of R have a reverence for data that inhibits them from responding to this request. It's not a difficult maneuver, but we want to hear a justification for what most would consider statistical malpractice. – IRTFM Sep 13 '13 at 03:37
  • @DWin These are standards, so theoretically they should be within +/- 5% of the known values. I need to know whether there is contamination in the vial or whether the instrument was malfunctioning. – user2770184 Sep 13 '13 at 21:29
  • @user2770184: Removing data is most definitely NOT the purpose of standards. – IRTFM Sep 14 '13 at 16:29
  • @DWin: Pardon me for being unclear. I am not using standards to remove data. All the data I am working with right now is just standards. Sometimes, the instrument hits the tray rather than the sample and gives a 0 as the output. Other times, the vial is dirty so the resulting Area for the standard is much higher than it should be. – user2770184 Sep 15 '13 at 01:24
  • I do not think you are being unclear, only unwise. Those are not conditions that are likely to be properly determined or assessed by the use of limits "+/- 5% of known values". – IRTFM Sep 15 '13 at 02:32
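
The two conditions described in the comments (a reading of 0, and an area outside +/- 5% of the known value) could be labelled separately instead of being treated as generic outliers. The following is only a sketch: `check_standard`, `known`, and `tol` are hypothetical names, and the numbers are made up for illustration, not taken from the question's data. Whether a 5% limit is an appropriate criterion is exactly what the comments dispute, so this shows the mechanics only.

```r
# Hypothetical helper: labels each reading of a standard.
#   area  - measured areas
#   known - expected areas for the same standards (assumed to be available)
#   tol   - relative tolerance, 0.05 meaning +/- 5%
check_standard <- function(area, known, tol = 0.05) {
  status <- rep("ok", length(area))
  status[abs(area - known) > tol * known] <- "outside tolerance"
  status[area == 0] <- "zero reading"   # takes precedence over the tolerance label
  status
}

# Made-up illustrative values, NOT the data from the question.
area  <- c(100, 0, 104, 130)
known <- c(100, 100, 100, 100)

check_standard(area, known)
# "ok" "zero reading" "ok" "outside tolerance"
```

Keeping the labels as a column next to the data, rather than deleting rows, leaves a record of why a reading was set aside.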

0 Answers