I would appreciate some help on the following problem:
I have multiple huge logfiles (more than 1,000,000 entries each) that contain some lines (rows) of particular interest to me. I want to build a subset containing just these lines, and to write the result into a matrix that holds the information for more than one logfile/participant. So I wrote a short piece of code to 1. create the subset and 2. run it within a loop, so that it handles all of the logfiles rather than just one.
Result <- subset(df, df$columnOfInterest== "interestingCondition1" | df$columnOfInterest== "interestingCondition2" | df$columnOfInterest== "interestingCondition3")$columnOfInterest
View(Result)
1   interestingCondition1
2   interestingCondition1
3   interestingCondition2
4   interestingCondition1
5   interestingCondition1
6   interestingCondition3
7   interestingCondition2
8   interestingCondition1
9   interestingCondition1
10  interestingCondition1
Embedded in a loop:
WrongResult <- matrix(data=NA,nrow=TrialNumber, ncol=length(ListOfFiles))
vpncount <- 1
for (v in ListOfFiles){
df<- read.delim(v, header = TRUE, sep='\t')
WrongResult[,vpncount] <- subset(df, df$columnOfInterest== "interestingCondition1" | df$columnOfInterest== "interestingCondition2" | df$columnOfInterest== "interestingCondition3")$columnOfInterest
vpncount <- vpncount+1
}
When I run the code on one logfile I get the result I want, but when I run it through the loop, it creates a matrix of the appropriate size that is filled with seemingly "random" numbers instead of the conditions I filtered for.
Does anyone know why that happens and how to fix it? Any help is much appreciated!
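One possible explanation for the numbers, sketched here under an assumption I have not verified on the real logfiles: if columnOfInterest is read in as a factor (the read.delim default before R 4.0), assigning it into a matrix stores the factor's internal integer codes rather than its labels. The vector `conditions` and the matrices `m` and `m_chr` below are made up for illustration only.

```r
# Assumption: columnOfInterest is a factor, as read.delim produced by default before R 4.0.
conditions <- factor(c("interestingCondition1",
                       "interestingCondition2",
                       "interestingCondition1"))

# Assigning a factor into a matrix drops the labels and keeps the integer codes.
m <- matrix(NA, nrow = 3, ncol = 1)
m[, 1] <- conditions
print(m)

# Converting to character first keeps the labels.
m_chr <- matrix(NA_character_, nrow = 3, ncol = 1)
m_chr[, 1] <- as.character(conditions)
print(m_chr)
```

If this is what happens in the loop, wrapping the subset in as.character(), or reading the files with stringsAsFactors = FALSE, might be enough to get the condition names into the matrix.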
EDIT:
I tried to create an example data frame. The first line of code (the one assigning the variable Result) works just as I want it to: it filters my data frame on the rows of my columnOfInterest and stores them in Result. But if I try to run it within a loop for more than one data frame I keep running into errors:
df <- data.frame(
X = sample(1:10),
columnOfInterest= sample(c("interestingCondition1", "interestingCondition2", "interestingCondition3", "NotinterestingCondition1"), 10, replace = TRUE)
)
View(df)
Result <- subset(df, df$columnOfInterest== "interestingCondition1" | df$columnOfInterest== "interestingCondition2" | df$columnOfInterest== "interestingCondition3")$columnOfInterest
View(Result)
WrongResult <- matrix(data=NA,nrow=280, ncol=20)
vpncount <- 1
for (v in 1:20){
df<- read.delim(v, header = TRUE, sep='\t')
WrongResult[,vpncount] <- subset(df, df$columnOfInterest== "interestingCondition1" | df$columnOfInterest== "interestingCondition2" | df$columnOfInterest== "interestingCondition3")$columnOfInterest
vpncount <- vpncount+1
}
View(WrongResult)
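A self-contained version of the loop might look like the sketch below. Since read.delim() expects a file path or connection rather than a number from 1:20, the sketch replaces the files with simulated data frames. The helper `make_df`, the vector `wanted`, the list `dfs`, and the row count `n_rows` are hypothetical names introduced for illustration, and the factor column is forced with stringsAsFactors = TRUE to mimic the older default. Because the number of matching rows can differ between participants, each column is filled only as far as there are matches and the rest stays NA.

```r
set.seed(1)
wanted <- c("interestingCondition1", "interestingCondition2", "interestingCondition3")

# Hypothetical stand-in for read.delim(): builds one simulated logfile.
make_df <- function() {
  data.frame(
    X = sample(1:10),
    columnOfInterest = sample(c(wanted, "NotinterestingCondition1"), 10, replace = TRUE),
    stringsAsFactors = TRUE  # assumption: mimic the pre-R-4.0 default
  )
}

dfs <- lapply(1:3, function(i) make_df())

n_rows <- 10  # upper bound on matching rows per simulated file
Result <- matrix(NA_character_, nrow = n_rows, ncol = length(dfs))

for (i in seq_along(dfs)) {
  df <- dfs[[i]]
  # Keep only the rows of interest and convert the factor to its labels.
  hits <- as.character(df$columnOfInterest[df$columnOfInterest %in% wanted])
  # Fill as many rows as there are matches; remaining rows stay NA.
  Result[seq_along(hits), i] <- hits
}

print(Result)
```

With the real logfiles, the loop would presumably iterate over ListOfFiles and call read.delim() on each path, with n_rows set to TrialNumber.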