My JSON file is 126 MB. It contains only one line (there is no whitespace, so the whole file is treated as a single line). I want to split it into files of about 10 MB each (the exact size is arbitrary).

WHAT I HAVE TRIED?

  1. I tried FileReader, StreamReader, etc. When I call `reader.readLine()`, it throws a memory error.

  2. I tried the Jackson library:

    File reader = new File("D:\\registry.txt");
    ObjectMapper map = new ObjectMapper(); 
    JsonParser jp = new JsonFactory().createJsonParser(reader);
    JsonNode masterJSON = map.readTree(jp);
    System.out.println(masterJSON); 
    

It also fails with the same memory error. How can I do this?

THE ALTERNATIVE (WORST) SOLUTION I HAVE FOUND SO FAR

Change the file extension to .txt, then read the file character by character until the maximum split size is reached. Afterwards I have to change the file extension back to .json.
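For reference, the character-by-character idea can be done with a fixed-size buffer and without renaming the file, since the extension has no effect on how Java reads it. The following is only a minimal sketch: the class name `FixedSizeSplitter`, the method `split`, the part-file naming and the UTF-8 encoding are all assumptions for illustration. The limit is counted in characters, which equals bytes only for ASCII content, and a cut can land in the middle of a JSON value (or a surrogate pair), so the parts are not valid JSON on their own.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FixedSizeSplitter {

    /**
     * Copies `source` into files named <prefix>0.json, <prefix>1.json, ...
     * inside `outDir`, each holding at most `maxCharsPerPart` characters.
     * Only one 8 KB buffer is held in memory at any time.
     * Returns the number of part files written.
     */
    static int split(Path source, Path outDir, String prefix, long maxCharsPerPart)
            throws IOException {
        if (maxCharsPerPart <= 0) {
            throw new IllegalArgumentException("maxCharsPerPart must be positive");
        }
        char[] buffer = new char[8192];
        int parts = 0;
        long writtenInPart = 0;
        Writer out = null;
        // Assumption: the file is UTF-8 encoded.
        try (Reader in = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            try {
                int read;
                while ((read = in.read(buffer)) != -1) {
                    int offset = 0;
                    while (offset < read) {
                        if (out == null) {
                            // Start the next part file lazily, only when there is data for it.
                            Path part = outDir.resolve(prefix + parts + ".json");
                            out = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                            parts++;
                            writtenInPart = 0;
                        }
                        // Write as much of the buffer as still fits into the current part.
                        int n = (int) Math.min(read - offset, maxCharsPerPart - writtenInPart);
                        out.write(buffer, offset, n);
                        offset += n;
                        writtenInPart += n;
                        if (writtenInPart == maxCharsPerPart) {
                            out.close();
                            out = null;
                        }
                    }
                }
            } finally {
                if (out != null) {
                    out.close();
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: FixedSizeSplitter <source file> <output dir>");
            return;
        }
        // 10 * 1024 * 1024 characters per part, roughly 10 MB for ASCII JSON.
        int parts = split(Paths.get(args[0]), Paths.get(args[1]), "part", 10L * 1024 * 1024);
        System.out.println("Wrote " + parts + " part file(s)");
    }
}
```

Because the data is read in small blocks rather than with `readLine()`, the single 126 MB line is never materialised as one string, which is what exhausts the heap in the attempts above.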

Is there an easier way to read a large (more than 100 MB) single-line file?

Gibbs
  • Where is `reader` declared? It's odd that you're getting a `memory error` with a 126 MB file; I've loaded `1GB` files without issues. Read [Parsing huge file without reading into memory](http://www.coderanch.com/t/201866/Performance/java/Parsing-huge-file-reading-memory). Also, would you mind adding a pastebin of the error log(s)? – classicjonesynz Apr 23 '14 at 07:21
  • reader is a file object. I am also wondering why it shows insufficient heap space. – Gibbs Apr 23 '14 at 07:25
  • Use a streaming API like [the one](http://google-gson.googlecode.com/svn/trunk/gson/docs/javadocs/com/google/gson/stream/package-summary.html) in [Gson](https://code.google.com/p/google-gson/). – McDowell Apr 23 '14 at 07:25
  • @McDowell I was going to suggest Gson too. +1. – classicjonesynz Apr 23 '14 at 07:26
  • I tried the links. It says that `gson` is not defined at `gson.fromJson`, and there is also an error on `Message` (I did add the Gson library). – Gibbs Apr 23 '14 at 07:55
  • [Using StreamingJsonBuilder](http://groovy.codehaus.org/gapi/index.html?groovy/json/StreamingJsonBuilder.html) to handle JSON streams in Groovy (faster than Jackson, actually). – ludo_rj Apr 23 '14 at 08:09
  • I hope so. Thanks, I'll try Groovy as well. – Gibbs Apr 23 '14 at 08:12
  • You can try the Jackson Streaming API (https://www.google.com/search?q=Jackson+streaming+API). If you split the file in the middle of a JSON object, that object will be broken, won't it? Is that acceptable? If yes, see this question: http://stackoverflow.com/questions/19177994/java-read-file-and-split-into-multiple-files – Michał Ziober Apr 23 '14 at 08:43
  • @ludo_rj Whether it's faster or not is not guaranteed, despite recent hype based on certain benchmarks -- but regardless, disk read (I/O) will be the limiting factor, not parsing. – StaxMan Apr 23 '14 at 23:35
  • @StaxMan My last comment was focused on the serialization process. More info: [JSON serialization benchmarks](https://github.com/bura/json-benchmarks) comparing Gson, Jackson, Groovy... The next release, Groovy 2.3, should be handsome for that. – ludo_rj Apr 24 '14 at 07:19
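
On the point raised in the comments that cutting at an arbitrary offset leaves broken JSON: if the file happens to be one top-level JSON array (an assumption, the question does not say), the cut points can be moved to element boundaries so that every part is itself a valid array. The sketch below uses only the JDK and a hand-written scanner that tracks nesting depth and string state; it is not the Jackson or Gson streaming API mentioned above, and the names `JsonArraySplitter` and `split` are hypothetical. A part is closed after the first element that brings it to `maxCharsPerPart` characters or more, so parts can overshoot the limit by up to one element.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class JsonArraySplitter {

    /**
     * Streams a top-level JSON array from `source` and writes its elements into
     * <prefix>0.json, <prefix>1.json, ... inside `outDir`, each a valid JSON array.
     * Returns the number of part files written.
     */
    static int split(Path source, Path outDir, String prefix, long maxCharsPerPart)
            throws IOException {
        int parts = 0;
        int depth = 0;             // 0 = outside the top-level array, 1 = directly inside it
        boolean inString = false;  // currently inside a JSON string literal
        boolean escaped = false;   // previous character was a backslash inside a string
        boolean inElement = false; // currently inside a top-level array element
        long written = 0;          // characters written to the current part
        Writer out = null;
        // Assumption: the file is UTF-8 encoded.
        try (Reader in = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            try {
                int ch;
                while ((ch = in.read()) != -1) {
                    char c = (char) ch;
                    if (inString) {
                        // Brackets and commas inside strings are data, not structure.
                        out.write(c);
                        written++;
                        if (escaped) escaped = false;
                        else if (c == '\\') escaped = true;
                        else if (c == '"') inString = false;
                        continue;
                    }
                    if (depth == 0) {
                        if (c == '[') depth = 1;
                        else if (!Character.isWhitespace(c)) {
                            throw new IOException("Expected a top-level JSON array");
                        }
                        continue;
                    }
                    if (depth == 1 && (c == ',' || c == ']')) {
                        // A top-level element just ended: the only safe place to cut.
                        inElement = false;
                        if (out != null && written >= maxCharsPerPart) {
                            out.write(']');
                            out.close();
                            out = null;
                        }
                        if (c == ']') depth = 0;
                        continue;
                    }
                    if (depth == 1 && !inElement && Character.isWhitespace(c)) {
                        continue; // skip whitespace between elements
                    }
                    if (!inElement) {
                        if (out == null) {
                            Path part = outDir.resolve(prefix + parts + ".json");
                            out = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                            parts++;
                            out.write('[');
                            written = 1;
                        } else {
                            out.write(',');
                            written++;
                        }
                        inElement = true;
                    }
                    out.write(c);
                    written++;
                    if (c == '"') inString = true;
                    else if (c == '[' || c == '{') depth++;
                    else if (c == ']' || c == '}') depth--;
                }
                if (out != null) {
                    out.write(']'); // close the last, partially filled part
                }
            } finally {
                if (out != null) {
                    out.close();
                }
            }
        }
        return parts;
    }
}
```

For example, with a limit of 12 characters the input `[{"a":"x,]"},{"b":[1,2]},3, "s"]` would be written as three parts: `[{"a":"x,]"}]`, `[{"b":[1,2]}]` and `[3,"s"]`. A streaming parser from Jackson or Gson would do the tokenising more robustly; this only shows that element-aligned splitting needs constant memory.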