I have a CSV file with 1.2M rows and need to clean up seven of its columns. Basically, I want to remove everything after an underscore in each of those columns. The following code works but takes 3 days to complete. I have a Perl script that does the same thing in 30 seconds, but I am trying to keep everything within R if possible. Any suggestions?
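v <- c(11:14, 16:18)   # positions of the 7 columns to clean
systime <- Sys.time()  # start time, for timing the loop
for (m in seq_along(v)) {
  for (i in seq_len(nrow(shots))) {
    # keep only the text before the first underscore
    shots[i, v[m]] <- unlist(strsplit(shots[i, v[m]][[1]], split = '_', fixed = TRUE))[1]
  }
}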
I need to remove everything after the first underscore in each value.
The following sample row shows some of the data, with what needs to be kept and what needs to be removed in particular columns.
1 42.30000 586.39276 Ground Name Name1 KEEP_remove KEEP_remove_remove No Mount KEEP_remove_remove 1 1
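For comparison, the per-cell loop could probably be replaced by one vectorized sub() call per column. This is only a minimal sketch: it assumes shots is a data frame (or tibble) and that the columns to clean hold text. The small shots data frame and its column names below are made up for illustration, and v is shortened to match it.

```r
# Hypothetical stand-in for the real data; only the shape matters here.
shots <- data.frame(
  id    = 1:3,
  mount = c("KEEP_remove", "KEEP_remove_remove", "No Mount"),
  name  = c("A_b", "C", "D_e_f"),
  stringsAsFactors = FALSE
)
v <- c(2, 3)  # column positions to clean (c(11:14, 16:18) in the question)

# sub() works on a whole column at once. The pattern "_.*$" matches the first
# underscore and everything after it, so values without an underscore are
# left unchanged.
shots[v] <- lapply(shots[v], function(x) sub("_.*$", "", as.character(x)))
```

Column-wise calls like this avoid assigning into the data frame once per cell, which is usually where a row-by-row loop spends most of its time. The sketch has not been timed on the real 1.2M-row file.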