I have the feeling I am close but I cannot get it to work and your help would be greatly appreciated.

My goal is to subset data in a list based on the value of one of the factors. The data concern subjects who were exposed to stimuli at varying timepoints. I want to subset the data of all variables, for all subjects, per stimulus. The stimulus variable is called 'Stimulus', and a stimulus has a name such as "Happy 8". So an example path would be SubjList$Subject1$Stimulus["Happy 8"] (although this doesn't work either).

My dataframe has the following structure:

Subjdf Large list (38 elements)

Each element is a data.frame with around 4000 observations (the number varies) and 26 variables (including "Stimulus").

Now I can subset one column over all subjects (elements) by doing the following:

 ColSub <- (lapply(SubjList,'[[','Stimulus'))

But when I try to implement a condition it does not work.

Happy8 <- (lapply(SubjList,'[[','Stimulus'=='Happy 8'))

Nor do simple selection methods like:

Happy8 <- SubjList$Subject1$Stimulus["Happy 8", ]

So, is there a way to subset only the rows that meet the condition "Stimulus"=="Happy 8", and then create a list of the same subjects with the same variables but only the observations for Stimulus Happy 8?

Thank you in advance!

J.Jansen
  • Reproducible example would help: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963379 – Bulat May 26 '16 at 21:20
  • I tried to recreate the issue, but with the examples your answer works. I don't know why but the main differences are: 1) the data I use is loaded by the following formula: Subjdf <- setNames(lapply(paste0(nm, ".txt"), read.delim), nm) 2) the structure of the list has an extra name in my data: my data: $ P1a:'data.frame': 3720 obs. of 26 variables: created list : $ :'data.frame': 20 obs. of 3 variables: There is no name if I create a list manually, whereas the way I loaded it, there is. Probably due to the setNames function. – J.Jansen May 30 '16 at 17:04

1 Answer

Here is what you can do. The names of the data frames and columns in this example differ from yours:

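# Two example data frames with a year column and a random value column C
DF1 <- data.frame(year = seq(2000, 2012, by = 1),
                  C = runif(13, 0, 1))
DF2 <- data.frame(year = seq(2000, 2012, by = 1),
                  C = runif(13, 0, 1))

# Collect the data frames in a list
DL <- list(DF1, DF2)

# For each data frame, keep column C for the rows where year >= 2005
ColSub <- lapply(DL, function(DF) DF[DF$year >= 2005, "C"])
ColSub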

This should give you an idea of how to adapt your code.
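Applied to the situation in the question, the same pattern might look like the sketch below. The data are made up: make_subject() is a hypothetical helper, the Score column and the stimulus names other than "Happy 8" are invented, and the element names P1a and P1b are borrowed from the comments. Only the Stimulus column and the list-of-data-frames layout come from the question. Leaving the column position empty in the index keeps all variables rather than a single column.

```r
# Hypothetical stand-in for the asker's data: a named list of data frames,
# each with a Stimulus column (all values here are made up).
set.seed(1)
make_subject <- function(n) {
  data.frame(
    Stimulus = sample(c("Happy 8", "Sad 3", "Neutral 1"), n, replace = TRUE),
    Score    = runif(n),
    stringsAsFactors = FALSE
  )
}
SubjList <- list(P1a = make_subject(20), P1b = make_subject(20))

# Keep every column, but only the rows where Stimulus is "Happy 8".
# which() drops NA comparisons, so missing Stimulus values do not
# produce all-NA rows in the result.
Happy8 <- lapply(SubjList, function(df) {
  df[which(df$Stimulus == "Happy 8"), , drop = FALSE]
})

names(Happy8)        # element names are carried over from SubjList
sapply(Happy8, nrow) # number of "Happy 8" rows per subject
```

The function argument (df here) stands for one data frame at a time, not for the vector of names; lapply passes each list element to the function in turn and keeps the element names on the result.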

Bulat
  • Thank you for your answer! I can't seem to get it working on my dataset: ColSubS <- (lapply(Subjdf, function(nm) {nm[nm$Stimulus == "Happy 8", "Video Time"]})) where nm is a vector with the data.frame names ("P1a", "P1b", "P2a", "P2b", "P3a", etc.). However, when I run it I do not get the Video Time scores for the rows in which $Stimulus == "Happy 8", but rather empty values like this: $P1a NULL $P1b NULL $P2a NULL $P2b NULL. I have the feeling it is because I use a vector with the data.frame names where you use DF, but my equivalent (P) does not work either. – J.Jansen May 27 '16 at 18:55