I wrote a function to reshape a log file (600 MB) into a data frame, but I can't get the expected result. The function works when I run it on a log file of about 10 MB, but it fails on the full file. The failure seems to be caused by memory allocation.
How do I fix it?
Reshape_data <- function(X) {
  # X is expected to hold one raw log line per row, in its first column
  n <- nrow(X)
  # Preallocate character vectors for the fields of interest
  Date <- character(n)
  Time_Temp <- character(n)
  status_Temp <- character(n)
  m <- 0
  i <- 1
  # Stop at the end of the data or at the first NA line
  while (i <= n && !is.na(X[i, 1])) {
    line <- as.character(X[i, 1])
    # Skip comment lines, which start with "#"
    if (substr(line, 1, 1) != "#") {
      m <- m + 1
      # Split the line once and reuse the pieces
      fields <- strsplit(line, " ")[[1]]
      Date[m] <- fields[1]
      Time_Temp[m] <- fields[2]
      status_Temp[m] <- fields[11]
    }
    i <- i + 1
  }
  if (m > 0) {
    # Keep only the entries that were actually filled
    Time_Temp <- Time_Temp[1:m]
    status_Temp <- status_Temp[1:m]
  } else {
    Time_Temp <- NULL
    status_Temp <- NULL
  }
  mydf <- data.frame(Time_Temp, status_Temp)
  return(mydf)
}
AA_log <- read.table("ex130828.log", sep = "\t", stringsAsFactors = FALSE, encoding = "utf-8")
AA_Tidy <- Reshape_data(AA_log)
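
For comparison, here is a minimal sketch of a vectorized version of the same reshaping. It avoids the row-by-row loop by filtering and splitting all lines at once. It assumes the log can be handled as a plain character vector of lines. `reshape_lines` and the sample lines are hypothetical names and data used only for illustration, and unlike the loop above it drops NA lines instead of stopping at the first one.

```r
# Hypothetical helper (not part of the original code): takes a character
# vector of raw log lines and returns the time and status fields.
reshape_lines <- function(lines) {
  # Drop NA lines and comment lines that start with "#"
  lines <- lines[!is.na(lines) & substr(lines, 1, 1) != "#"]
  # Split every remaining line on single spaces in one call
  fields <- strsplit(lines, " ", fixed = TRUE)
  # Pick the 2nd and 11th field of each line; missing fields become NA
  data.frame(
    Time_Temp = vapply(fields, function(f) f[2], character(1)),
    status_Temp = vapply(fields, function(f) f[11], character(1)),
    stringsAsFactors = FALSE
  )
}

# Small made-up sample standing in for the real log file
sample_lines <- c(
  "#Fields: date time status",
  "2013-08-28 00:00:01 a b c d e f g h 200",
  "2013-08-28 00:00:02 a b c d e f g h 404"
)
result <- reshape_lines(sample_lines)
```

On the real file this could presumably be called as `reshape_lines(readLines("ex130828.log"))`, which reads the raw lines without `read.table` interpreting quote or comment characters. Whether this fits in memory for the 600 MB file is not something the sketch guarantees.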