
I have a file with multiple lines, each one representing a document in my scenario. I searched for how to create a corpus from it and found the tm package function readPlain, but that loads the whole text file as one document. I also found a way to load documents in "R text file and text mining...how to load data", but that method takes a folder path and creates one document for each file in it.

How can I create a separate document for each line (sentence) of the file?

Community
Bit Manipulator

1 Answer


Try readLines("/path/to/yourfile.txt"). Each line becomes a separate element of a character vector whose length equals the number of lines in your file. Otherwise, see scan(), which has a skip option if you need it and an nlines option if you want to read the file in chunks; readLines() has an n argument that limits how many lines are read.
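A minimal sketch of the approach, under some assumptions: the temporary file and its three sample lines are made up for illustration, and the final step uses tm's VectorSource, which the answer above does not mention but which treats each element of a character vector as its own document. The tm step is skipped if the package is not installed.

```r
# Hypothetical sample file standing in for /path/to/yourfile.txt
path <- tempfile(fileext = ".txt")
writeLines(c("first document text",
             "second document text",
             "third document text"), path)

# readLines() returns a character vector with one element per line
docs <- readLines(path)

# scan() can read part of the file: skip the first line, then read two lines.
# sep = "\n" keeps each line as a single string instead of splitting on spaces.
chunk <- scan(path, what = character(), sep = "\n",
              skip = 1, nlines = 2, quiet = TRUE)

# Build a corpus in which every vector element is a separate document
# (assumes the tm package is available)
if (requireNamespace("tm", quietly = TRUE)) {
  corpus <- tm::VCorpus(tm::VectorSource(docs))
  print(length(corpus))
}
```

For a large file, the same idea should work chunk by chunk: read a block of lines, build or extend the corpus, and move on to the next block.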

Serban Tanasa
  • Also, you can skip the path part if you use setwd("/path/tofile/") to set your working directory to wherever the file is located. If you do that, all you need to pass to readLines() is the filename and extension. – Serban Tanasa Dec 05 '14 at 19:32