3

My large_file.txt uses <tag> markers to represent data such as parents, children and so on. The file is not in XML format.

I want to read large_file.txt (about 100 MB) as a single string, then use string matching to extract my data as objects.

Any help please?

Thanks in advance

Behrang
Idunk

3 Answers

5

I want to read large_file.txt (about 100 MB) as a single string

You really don't want to do that. You want to do anything you can to avoid it. You should always aim to process files a piece at a time. What happens when the file gets 10 times the size? 100? 1000?
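
To make "a piece at a time" concrete, here is a minimal sketch. It assumes each record fits on one line and looks like `<name>value</name>`; that format, the class name `LineByLineTags` and the `extract` helper are all made up for illustration, since the question doesn't show the real layout. Only one line is held in memory at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineByLineTags {
    // Hypothetical record format: <name>value</name>, never split across lines.
    private static final Pattern RECORD = Pattern.compile("<(\\w+)>(.*?)</\\1>");

    /** Reads the input one line at a time and collects "name=value" per match. */
    static List<String> extract(Reader source) throws IOException {
        List<String> found = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = RECORD.matcher(line);
                while (m.find()) {
                    found.add(m.group(1) + "=" + m.group(2));
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the file, e.g. large_file.txt
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]))) {
            extract(in).forEach(System.out::println);
        }
    }
}
```

If a record can span several lines, this per-line version will miss it, and you would need to split on a different boundary or match against the stream instead.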

user207421
  • I disagree. Sometimes people give you data that you must process. – Mikhail Oct 10 '11 at 01:35
  • @Misha I've never seen, or heard of, and cannot imagine, 100MB of data that has to be processed in memory and cannot be processed a piece at a time, and my question about what happens when it increases 10x, 100x, 1000x remains valid. You cannot design on the assumption that all data will fit in memory, either when producing the file or when processing it. – user207421 Oct 10 '11 at 02:03
  • thanks, but, I need reading all of text for finding their patter, because if that text was catting, then the patter will never found. – Idunk Oct 19 '11 at 09:10
  • @ldunk what? Patter? Catting? Please restate that in standard English. – user207421 Oct 19 '11 at 09:43
  • @EJP I think he's saying there is a patterN in the text that won't be found if the text is in pieces. – Michael Dec 30 '14 at 03:19
2

You can use Jakarta Regexp's StreamCharacterIterator. This way you can directly apply string matching on the file without reading it first into a String object.

Otherwise, if you really do want the whole file in memory, Commons IO's FileUtils.readFileToString can read it into a String.

Behrang
  • StreamCharacterIterator is simply a wrapper around a StringBuffer and an InputStream, and shares the same encoding-related defect as [StringBufferInputStream](http://download.oracle.com/javase/7/docs/api/java/io/StringBufferInputStream.html). Java's built-in [Scanner](http://download.oracle.com/javase/7/docs/api/java/util/Scanner.html) class meets this need much better. – Alan Moore Oct 09 '11 at 11:14
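
For what the Scanner suggestion might look like, here is a rough sketch using `Scanner.findWithinHorizon`. It matches against the stream itself, so a pattern that spans line breaks is still found without first building one big String. The `<name>value</name>` format, the class name `ScannerTags` and the `extract` helper are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

public class ScannerTags {
    // Hypothetical record format; DOTALL lets a value run across line breaks.
    private static final Pattern RECORD =
            Pattern.compile("<(\\w+)>(.*?)</\\1>", Pattern.DOTALL);

    /** Scans the input for records and collects "name=value" for each one. */
    static List<String> extract(Readable source) {
        List<String> found = new ArrayList<>();
        try (Scanner scanner = new Scanner(source)) {
            // A horizon of 0 means "search without bound".
            while (scanner.findWithinHorizon(RECORD, 0) != null) {
                MatchResult m = scanner.match();
                found.add(m.group(1) + "=" + m.group(2));
            }
        }
        return found;
    }
}
```

One caveat: with a horizon of 0 the Scanner may buffer all remaining input when no further match exists, so if you know an upper bound on record length, passing it as the horizon keeps memory use predictable.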