I am trying to load the JSON data below into Pig on HDP.

$ cat myfile1.json

[
    {
        color: "red",
        value: "#f00"
    },
    {
        color: "green",
        value: "#0f0"
    },
    {
        color: "blue",
        value: "#00f"
    },
    {
        color: "cyan",
        value: "#0ff"
    },
    {
        color: "magenta",
        value: "#f0f"
    },
    {
        color: "yellow",
        value: "#ff0"
    },
    {
        color: "black",
        value: "#000"
    }
]

I tried the Pig statement below and got the following error. I have not been able to find the cause. What is the correct schema format to pass to JsonLoader?

a = load '/root/myfile1.json' using JsonLoader('{(color:chararray,value:chararray)}');

dump a;

java.lang.Exception: org.codehaus.jackson.JsonParseException: Unexpected end-of-input within/between OBJECT entries
 at [Source: java.io.ByteArrayInputStream@27c85f78; line: 1, column: 35]
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: org.codehaus.jackson.JsonParseException: Unexpected end-of-input within/between OBJECT entries
 at [Source: java.io.ByteArrayInputStream@27c85f78; line: 1, column: 35]
        at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
        at org.codehaus.jackson.impl.Utf8StreamParser._skipWS(Utf8StreamParser.java:1817)
        at org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:315)
        at org.apache.pig.builtin.JsonLoader.getNext(JsonLoader.java:167)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
        at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
2016-01-19 03:59:27,053 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2016-01-19 03:59:27,053 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1639714432_0007 has failed! Stop running all dependent jobs
2016-01-19 03:59:27,053 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-01-19 03:59:27,054 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2016-01-19 03:59:27,054 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats reported below may be incomplete
2016-01-19 03:59:27,055 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
Bogdan Burym
Wanderer

1 Answer

I think your first line of code should look like this:

a = load '/root/myfile1.json' using JsonLoader('color:chararray,value:chararray');

See this link for an example.

UPDATE: Added a compacted version of the JSON data file:

[
 {color: "red",value: "#f00"},
 {color: "green",value: "#0f0"},
 {color: "blue",value: "#00f"},
 {color: "cyan",value: "#0ff"},
 {color: "magenta",value: "#f0f"},
 {color: "yellow",value: "#ff0"},
 {color: "black",value: "#000"}
]
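If the compacted file still fails, one thing worth trying is a file with exactly one complete JSON object per line, quoted keys, and no enclosing array or trailing commas. This is a sketch under the assumption that JsonLoader parses each input line as a standalone object, which would be consistent with the "Unexpected end-of-input within/between OBJECT entries" message. The `records` list and the `to_json_lines` helper are hypothetical names used only for illustration:

```python
import json

# Hypothetical records mirroring the data in the question.
records = [
    {"color": "red", "value": "#f00"},
    {"color": "green", "value": "#0f0"},
    {"color": "blue", "value": "#00f"},
    {"color": "cyan", "value": "#0ff"},
    {"color": "magenta", "value": "#f0f"},
    {"color": "yellow", "value": "#ff0"},
    {"color": "black", "value": "#000"},
]


def to_json_lines(rows):
    """Serialize each record as one compact JSON object per line.

    json.dumps quotes the keys, and no surrounding [ ] or trailing
    commas are emitted, so every line is a complete object by itself.
    """
    return "\n".join(json.dumps(row, separators=(",", ":")) for row in rows) + "\n"


if __name__ == "__main__":
    # Writes to the current directory; copy the file to wherever Pig reads from.
    with open("myfile1.json", "w") as fh:
        fh.write(to_json_lines(records))
```

The resulting file can then be loaded with the flat schema: a = load '/root/myfile1.json' using JsonLoader('color:chararray,value:chararray');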

Hope this helps.

vmachan
  • In your data file could you try removing the `[` and `]` brackets at the start and end of the file and then try again? – vmachan Jan 19 '16 at 12:56
  • I removed the square brackets at the start and end of the file, but the error still persists. – Wanderer Jan 19 '16 at 13:12
  • I think you need to compact your JSON data as it seems this is not "liked" by Pig. See this [SO post](http://stackoverflow.com/questions/31594652/jsonloader-throws-error-in-pig). Hope this works for you – vmachan Jan 19 '16 at 13:33
  • I still could not load it successfully. Can you try it once on your side? – Wanderer Jan 20 '16 at 06:17