R experts,
I have a large text file with a specific pattern and format.
My text.txt contains:
x1 `xx`nkkna`yy`taktnaknvcaklrhkahnktn, altlkhakthakd`xx`nmm cataitha`yy`knkcnaktnhakt
x2 `xx`ngkna`yy`taktnaknvcaklrhkahnktn, altlkhakthakdnmm cataithaknkcnaktnhakt
x3 `xx`nkg,kna`yy`taktnaknvcaklrhkahnktn, altlkhakthakdnmm cataithaknk`xx`cna`yy`ktnhakt
x4 nkkndataktnaknvcaklrhkahnktn, altlkhakthakdnmm cataithaknkcnaktnhakt
I want R to find a list of keys, in this case x1, x2, x3 and x4, and for each key to return every substring that sits between "xx" and "yy" on that line.
So the result should be four lists:
x1 = c("nkkna", "nmm cataitha")
x2 = c("ngkna")
x3 = c("nkg,kna", "cna")
x4 = c("NA")
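To make the target concrete, here is a minimal sketch of that mapping applied to the four sample lines. It assumes the backticks around xx and yy are literally present in the file (drop them from `open_mark` and `close_mark` if they are not), that the key is the first whitespace-delimited token on each line, and it returns a real `NA` rather than the string "NA" when nothing is found. The variable names are only illustrative.

```r
# Sample lines copied from the question.
lines <- c(
  "x1 `xx`nkkna`yy`taktnaknvcaklrhkahnktn, altlkhakthakd`xx`nmm cataitha`yy`knkcnaktnhakt",
  "x2 `xx`ngkna`yy`taktnaknvcaklrhkahnktn, altlkhakthakdnmm cataithaknkcnaktnhakt",
  "x3 `xx`nkg,kna`yy`taktnaknvcaklrhkahnktn, altlkhakthakdnmm cataithaknk`xx`cna`yy`ktnhakt",
  "x4 nkkndataktnaknvcaklrhkahnktn, altlkhakthakdnmm cataithaknkcnaktnhakt"
)

# Assumed markers: change to "xx" / "yy" if the file has no backticks.
open_mark  <- "`xx`"
close_mark <- "`yy`"

# Key = everything before the first whitespace character on the line.
keys <- sub("\\s.*$", "", lines)

# Lazy match of the text between the markers, using Perl lookarounds so the
# markers themselves are not part of the match.
pattern <- paste0("(?<=", open_mark, ").*?(?=", close_mark, ")")
hits <- regmatches(lines, gregexpr(pattern, lines, perl = TRUE))

# Lines with no marker pair come back as character(0); turn those into NA.
hits <- lapply(hits, function(h) if (length(h) == 0) NA_character_ else h)
names(hits) <- keys

hits$x1  # "nkkna" "nmm cataitha"
hits$x3  # "nkg,kna" "cna"
```

The result is a named list, so `hits[["x2"]]` gives the vector for a single key.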
However, I am facing two problems and would like to ask for your help.
- How do I read a large text file into R? I learned from Stack Overflow that the command
x <- read.csv(textConnection("xxx")) may help, but my file is too large to copy and paste, and the file would have to be read in as CSV. Is there a better way to load my text file into R as an object that I can search and grep afterwards?
- How do I write code to extract all of this information?
I have read that strsplit may be useful; it seems to work on material scraped with RCurl. Does it work here too? If so, could you show me how?
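On the first point, one common option is `readLines`, which returns a character vector with one element per line that can be passed to `grep` and friends. The sketch below is only an illustration: it writes a tiny temporary file as a stand-in for the real text.txt, and the chunk size of 1000 lines is an arbitrary assumption for the case where the file does not fit comfortably in memory.

```r
# Hypothetical stand-in for the real text.txt so the sketch runs anywhere.
path <- tempfile(fileext = ".txt")
writeLines(c("x1 `xx`nkkna`yy`taktnaknv", "x4 nkkndataktnaknv"), path)

# Read the whole file at once: one string per line.
txt <- readLines(path)

# The result is an ordinary character vector, so it can be searched directly.
has_marker <- grepl("`xx`", txt, fixed = TRUE)

# For a very large file, read it in chunks through an open connection instead.
n_lines <- 0
con <- file(path, open = "r")
while (length(chunk <- readLines(con, n = 1000)) > 0) {
  # Process each chunk here; this sketch just counts the lines.
  n_lines <- n_lines + length(chunk)
}
close(con)
```

With the real file, `path` would simply be "text.txt" (or its full path) and the `writeLines` call would be dropped.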
Thank you very much.