How to read a large JSON file?

    {
        "Count": 361888,
        "Items": [
            {
                "S3Url": {"S": "Grouper/1904/1/private/drafts/D1_2/siepon_D1_2/siepon_C11_D1_2_diff.pdf"},
                "JSONFile": {"S": "Grouper/1904/1/private/drafts/D1_2/siepon_D1_2/siepon_C11_D1_2_diff.pdf.json"},
                "ErrTs": {"N": "1488010286704"}
            },
            {
                "S3Url": {"S": "Mentor/47200043/Public/07/11-07-1984-05-000s-june-2007-mesh-ad-hoc- agenda.ppt.pdf"},
                "JSONFile": {"S": "Mentor/47200043/Public/07/11-07-1984-05-000s-june-2007- mesh-ad-hoc-agenda.ppt.pdf.json"},
                "ErrTs": {"N": "1490497271699"}
            }
        ],
        "ScannedCount": 23
    }

This is the format of the input JSON file. The file is too large, so I cannot use:

    JSONParser parser = new JSONParser();
    Object obj = parser.parse(new FileReader(JSON_FILE_PATH));

The error is:

    java.lang.OutOfMemoryError: Java heap space

Increasing the maximum heap size with the JVM option "-Xmx512M" does not work. I then tried this code:

    jsonParser.parse(new FileReader(JSON_FILE_PATH), new ContentHandler() {
        private String key;
        private Object value;

        // A bunch of "default" methods
        @Override public void startJSON() { }
        @Override public void endJSON() { }
        @Override public boolean startObject() { return true; }
        @Override public boolean endObject() { return true; }
        @Override public boolean startArray() { return true; }
        @Override public boolean endArray() { return true; }

        @Override
        public boolean startObjectEntry(final String key) {
            this.key = key;
            return true;
        }

        @Override
        public boolean endObjectEntry() {
            System.out.println(key + " => " + value);
            return true;
        }

        @Override
        public boolean primitive(final Object value) {
            this.value = value;
            return true;
        }
    });

Expected output (in Excel): key: S3Url, value: Grouper/1904/1/private/drafts/D1_2/siepon_D1_2/siepon_C11_D1_2_diff.pdf

Actual output (in Excel): key: S, value: Grouper/1904/1/private/drafts/D1_2/siepon_D1_2/siepon_C11_D1_2_diff.pdf, followed by the same line again: key: S, value: Grouper/1904/1/private/drafts/D1_2/siepon_D1_2/siepon_C11_D1_2_diff.pdf

So the key is wrong and every entry is repeated. Please help me read the large JSON file in the required format.
  • Is the file very large? – Fady Saad Apr 17 '17 at 06:41
  • Yes, the file size is approximately 400 MB. – Pranjal Ahluwalia Apr 17 '17 at 06:43
  • There is a similar [question](http://stackoverflow.com/questions/9390368/java-best-approach-to-parse-huge-extra-large-json-file); I think it will help you. – Fady Saad Apr 17 '17 at 06:50
  • Mate, I believe you know this is not a "write-code-for-me" service. I recommended a way to deal with your huge JSON parsing issue, and that was [your initial question](http://stackoverflow.com/revisions/add77e1a-da08-4c39-b4cb-8d11b2ed7138/view-source). Now you're modifying it into a different question, borrowing the code from my answer. You have to implement the `ContentHandler` yourself. I can give you one note only: if you're still getting `OutOfMemoryError`, then you're probably gathering the parsed data in memory instead of writing it elsewhere (this remains unclear -- there is no stacktrace). – Lyubomyr Shaydariv Apr 18 '17 at 10:57

2 Answers


This error can be caused by a memory leak.

How to solve java.lang.OutOfMemoryError: Java heap space

1) An easy way to address OutOfMemoryError in Java is to increase the maximum heap size with a JVM option such as "-Xmx512M". When the heap is simply too small for the workload, this resolves the error immediately. This is my preferred solution when I get OutOfMemoryError in Eclipse, Maven or Ant while building a project, because a large project can easily run out of memory. It is also better to keep the -Xmx to -Xms ratio at either 1:1 or 1:1.5 if you are setting the heap size in your Java application. Here is an example of increasing the maximum heap size of the JVM:

export JVM_ARGS="-Xms1024m -Xmx1024m"

2) The second way to resolve OutOfMemoryError in Java is harder. It applies when you don't have much memory and you still get java.lang.OutOfMemoryError even after increasing the maximum heap size. In this case, you probably want to profile your application and look for a memory leak. You can use Eclipse Memory Analyzer to examine a heap dump, or a profiler such as the NetBeans profiler or JProbe. This is a tough solution and requires some time to analyze and find memory leaks.
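Before profiling, it can help to confirm which heap limit the JVM is actually running with, since whether a variable such as JVM_ARGS is picked up depends on the launcher script in use. The following is a minimal sketch, not part of any tool mentioned here; the class name `HeapCheck` and the method `maxHeapMiB` are made up for illustration.

```java
public class HeapCheck {

    // Returns the maximum heap the JVM will attempt to use, in MiB.
    // Runtime.maxMemory() reflects the effective -Xmx setting (or the JVM default).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it as `java -Xmx1024m HeapCheck` and again without the option shows whether the setting took effect. If the reported value does not change, the option is not reaching the JVM.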

Tools to investigate and fix OutOfMemoryError in Java

1) visualgc

2) jmap

3) jhat

4) Eclipse memory analyzer

5) Books to learn Profiling


BhandariS

You're getting this error because your JVM cannot allocate enough memory to store the resulting `JSONObject` instance, which is a subclass of `HashMap` (this is clear from the stacktrace). Although you say you have a 400 MB JSON document, it may be small compared to other JSON documents, and increasing the memory size won't help you much. You can parse the given JSON document at almost zero cost in terms of JVM resources by using streaming, but you have to write more sophisticated code. `com.googlecode.json-simple:json-simple` supports streamed reading via `ContentHandler`s.

Example:

`document.json`:

    {
        "foo": 1,
        "bar": 2
    }

Code:

    // Requires java.io.Reader, org.json.simple.parser.JSONParser
    // and org.json.simple.parser.ContentHandler.
    // getPackageResourceReader is my own helper for bundled resources;
    // new FileReader(JSON_FILE_PATH) can be used in its place.
    try ( final Reader reader = getPackageResourceReader(Q43446452.class, "document.json") ) {
        final JSONParser jsonParser = new JSONParser();
        jsonParser.parse(reader, new ContentHandler() {
            private String key;
            private Object value;

            // A bunch of "default" methods
            @Override public void startJSON() { }
            @Override public void endJSON() { }
            @Override public boolean startObject() { return true; }
            @Override public boolean endObject() { return true; }
            @Override public boolean startArray() { return true; }
            @Override public boolean endArray() { return true; }

            // Remember the key of the entry that has just started
            @Override
            public boolean startObjectEntry(final String key) {
                this.key = key;
                return true;
            }

            // The entry is complete: print the last key and the last primitive seen
            @Override
            public boolean endObjectEntry() {
                System.out.println(key + " => " + value);
                return true;
            }

            // Remember the most recent primitive value
            @Override
            public boolean primitive(final Object value) {
                this.value = value;
                return true;
            }
        });
    }

Sure, it's an extremely primitive example, and the cost is paid by you in code complexity rather than by the JVM in memory, but you can parse even infinite JSON streams using such an approach.

Output:

foo => 1
bar => 2
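For nested documents such as the one in the question, a handler that stores only the most recent key reports the innermost key (`S`), and it prints once when the `S` entry ends and again when the enclosing `S3Url` entry ends. One way around this is to keep a stack of open keys and emit a row as soon as a primitive arrives. The following is a minimal sketch under assumptions: it treats `S` and `N` as type tags that wrap the real value (a guess based on the sample), the class name `NestedKeyTracker` is made up, and the class only mirrors the callback names of `ContentHandler` so that it compiles without json-simple. To plug it into `JSONParser`, it would have to declare `implements ContentHandler` and add the remaining no-op methods.

```java
import java.io.PrintStream;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

public class NestedKeyTracker {
    // Keys of the object entries that are currently open, innermost first.
    private final Deque<String> keys = new ArrayDeque<String>();
    private final PrintStream out;

    public NestedKeyTracker(PrintStream out) {
        this.out = out;
    }

    public boolean startObjectEntry(String key) {
        keys.push(key);
        return true;
    }

    public boolean endObjectEntry() {
        keys.pop();
        return true;
    }

    public boolean primitive(Object value) {
        if (keys.isEmpty()) {
            return true; // a bare top-level value has no key to report
        }
        String key = keys.peek();
        // Assumption: "S" and "N" only tag the value type, so report the parent key.
        if (keys.size() > 1 && (key.equals("S") || key.equals("N"))) {
            Iterator<String> it = keys.iterator();
            it.next();       // skip the type tag
            key = it.next(); // the enclosing key, e.g. "S3Url"
        }
        // Write the row immediately instead of collecting rows in memory.
        out.println(key + " => " + value);
        return true;
    }

    public static void main(String[] args) {
        // Replays the entry and value events for
        // {"Items":[{"S3Url":{"S":"a.pdf"},"ErrTs":{"N":"1"}}]}
        NestedKeyTracker h = new NestedKeyTracker(System.out);
        h.startObjectEntry("Items");
        h.startObjectEntry("S3Url");
        h.startObjectEntry("S");
        h.primitive("a.pdf");
        h.endObjectEntry();
        h.endObjectEntry();
        h.startObjectEntry("ErrTs");
        h.startObjectEntry("N");
        h.primitive("1");
        h.endObjectEntry();
        h.endObjectEntry();
        h.endObjectEntry();
    }
}
```

The `main` method prints `S3Url => a.pdf` and `ErrTs => 1`, each exactly once. Because every row is written as soon as its value is seen, memory use stays flat regardless of the number of items; the `PrintStream` could be replaced by whatever writes the Excel rows.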

Lyubomyr Shaydariv
  • Can you give a link or something? I am not able to test your code. – Pranjal Ahluwalia Apr 17 '17 at 09:38
  • @PranjalAhluwalia What problem are you facing? – Lyubomyr Shaydariv Apr 17 '17 at 10:02
  • @Lyubomyr Shaydariv http://stackoverflow.com/questions/43383932/i-have-a-json-text-file-in-3-level-array-format-how-to-get-array-values-like-s – Pranjal Ahluwalia Apr 17 '17 at 12:44
  • The problem is that I am reading from a file, so `jsonParser.parse(reader, new ContentHandler() {` throws an error, and I am also not able to locate the jars for `Reader reader = getPackageResourceReader(Q43446452.class, "document.json")`. I am using Java 1.7. – Pranjal Ahluwalia Apr 17 '17 at 12:46
  • @PranjalAhluwalia This is a custom reader I use to extract bundled resources. Replace the above `Reader` instance with your `new FileReader(JSON_FILE_PATH)`. – Lyubomyr Shaydariv Apr 17 '17 at 12:52
  • @Lyubomyr Shaydariv { "Count": 361888, "Items": [{"S3Url": {"S": "Grouper/1904/1/private/drafts/D1_2/siepon_D1_2/siepon_C11_D1_2_diff.pdf" },"JSONFile": {"S": "Grouper/1904/1/private/drafts/D1_2/siepon_D1_2/siepon_C11_D1_2_diff.pdf.json" },"ErrTs": {"N": "1488010286704"}}, {"S3Url": {"S": "Mentor/47200043/Public/07/11-07-1984-05-000s-june-2007-mesh-ad-hoc- agenda.ppt.pdf" },"JSONFile": {"S": "Mentor/47200043/Public/07/11-07-1984-05-000s-june-2007- mesh-ad-hoc-agenda.ppt.pdf.json"},"ErrTs": {"N": "1490497271699"}}],"ScannedCount": 23} – Pranjal Ahluwalia Apr 18 '17 at 10:02
  • Expected output: key: S3Url, value: Grouper/1904/1/private/drafts/D1_2/siepon_D1_2/siepon_C11_D1_2_diff.pdf in Excel – Pranjal Ahluwalia Apr 18 '17 at 10:03
  • Actual output: key: S, value: Grouper/1904/1/private/drafts/D1_2/siepon_D1_2/siepon_C11_D1_2_diff.pdf in Excel, then again key: S, value: Grouper/1904/1/private/drafts/D1_2/siepon_D1_2/siepon_C11_D1_2_diff.pdf in Excel, which is repeating. – Pranjal Ahluwalia Apr 18 '17 at 10:04