I have a json file which has the format:

{
    "results": [
        {
            "quiz": "1112세기에 발달한 고려의 대표적인 자기는 분청사기이다 ",
            "answer": " X "
        },
        {
            "quiz": "16세기 말 이탈리아 음악극의 흐름을 따르고, 전부 또는 일부 대사가 노래로 ",
            "answer": " X "
        },
        {
            "quiz": "1769년 세계최초로 자동차를 만든 사람은 ",
            "answer": "퀴노"
        }
    ]
}

I want to read this file in Java. I have the dependency com.googlecode.json-simple (version 1.1.1), and the code I have written throws an exception :(

public List<CheatImported> importJsonFile(String path) throws IOException, FileNotFoundException, ParseException {

        JSONObject root = (JSONObject)jsonParser.parse(new FileReader(path));

        JSONArray results = (JSONArray)root.get("results");
        @SuppressWarnings("rawtypes")
        Iterator iter = results.iterator();

        List<CheatImported> resultList = new ArrayList<CheatImported>();

        while(iter.hasNext()){
            JSONObject item = (JSONObject)iter.next();
            String question = (String)item.get("quiz");
            String answer = (String)item.get("answer");

            CheatImported imported = new CheatImported();
            imported.setQuestion(question);
            imported.setAnswer(answer);

            resultList.add(imported);
        }

        return resultList;
    }

The element type of the list is a class that has just two String properties:

@Getter
@Setter
@NoArgsConstructor
public class CheatImported {

    private String question;
    private String answer;
}

And here is my junit code:

@Test
    public void cheatImported() throws Exception{
        String path = "D:\\workspace_orderByDate\\20180105\\moonBladeQuiz\\src\\main\\resources\\static\\data.json";
        List<CheatImported> list = importService.importJsonFile(path);
        assertTrue(list.size() > 0);
    }

Running the test throws an exception (full trace):

Unexpected character () at position 0.
    at org.json.simple.parser.Yylex.yylex(Yylex.java:610)
    at org.json.simple.parser.JSONParser.nextToken(JSONParser.java:269)
    at org.json.simple.parser.JSONParser.parse(JSONParser.java:118)
    at org.json.simple.parser.JSONParser.parse(JSONParser.java:92)
    at com.ddedderu.moonBladeQuiz.data.service.ImportCheatServiceImpl.importJsonFile(ImportCheatServiceImpl.java:28)
    at com.ddedderu.moonBladeQuiz.MoonBladeQuizApplicationTests.cheatImported(MoonBladeQuizApplicationTests.java:231)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:75)
    at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:86)
    at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:84)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:252)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:94)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
    at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:70)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:191)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:678)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)

What went wrong?

  • Have you checked the encoding of the JSON file? What’s the code encoding you’re using? – Tomaz Fernandes Oct 21 '18 at 19:18
  • @TomazFernandes, Thank you for your response. The file's encoding type is UTF-8. –  Oct 21 '18 at 19:20
  • You should try opening the file in an editor such as Visual Studio Code or Sublime Text and set the encoding again. Pretty sure it’s an encoding problem. – Tomaz Fernandes Oct 21 '18 at 19:25
  • Check if the file has a Byte Order Mark. – MC Emperor Oct 21 '18 at 19:35
  • I have just changed the encoding again in STS, but the test still fails. So I downloaded the JSON file again and changed its character set to UTF-8 again; it still did not work. I guess your diagnosis is right, so any idea how to change the character set correctly? –  Oct 21 '18 at 19:36
  • See [Remove a BOM character in a file](https://stackoverflow.com/q/32986445/5221149) – Andreas Oct 21 '18 at 19:44
  • @MCEmperor, I am trying to reset the Byte Order Mark using EditPlus. I opened the JSON file, went to the preferences window, changed the BOM setting to *Always insert new BOM*, and saved the file. Then I ran the JUnit test, and it still didn't work. :( –  Oct 21 '18 at 20:08
  • @PLAYMAKER It's the other way around: you must **not** insert the byte order mark, instead **remove it**. Byte order is not applicable with UTF-8. A BOM character is not valid JSON. – MC Emperor Oct 21 '18 at 20:15
  • @Andreas, I have just installed Notepad++. I opened the JSON file and clicked Convert to UTF-8 in the menu; nothing happened, so I just saved the file. After copying the file to my project and running the code, it still did not work. Everyone says it's a character set problem, and I think so too, so why is it not fixed? :( –  Oct 21 '18 at 20:17
  • You didn't tell `FileReader` to read the file as UTF-8. – Andreas Oct 21 '18 at 20:18
  • You could open it with a hex editor and remove the BOM, alternatively, you could open Notepad++ and set the character set to *"UTF-8 (without BOM)"*. – MC Emperor Oct 21 '18 at 20:19
  • @MCEmperor, Thank you! I followed your guide, and it worked! :) I set EditPlus to **Remove BOM Always** and ran the test code; it now reads the data correctly without any exceptions. Thank you! –  Oct 21 '18 at 20:28
  • I will post this as an answer to help further readers. You can then mark it as accepted. – MC Emperor Oct 21 '18 at 20:31
  • @Andreas Since I am handing the file to JSONParser rather than working with FileReader directly, I did not see a way to select a character set. Or is there some other way to do it? –  Oct 21 '18 at 20:32
  • @MCEmperor Ok Thank you! –  Oct 21 '18 at 20:32

1 Answer

When a character or token is reported as unexpected at position 0 while the JSON looks valid, it is almost always a problem with the Byte Order Mark.

From Wikipedia:

The byte order mark (BOM) is a Unicode character, U+FEFF BYTE ORDER MARK (BOM), whose appearance as a magic number at the start of a text stream can signal several things to a program consuming the text.

One of the functions of the mark is to signal which byte of multibyte characters comes first. This is called the endianness of the stream. With UTF-8, the order is set in stone, so the BOM does not serve any purpose within the context of UTF-8.

The JSON specification, however, does not allow anything other than whitespace or a JSON value at the start of a document. There is no exception for the byte order mark, hence a leading Byte Order Mark is not valid JSON.
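To see what the parser actually receives, here is a minimal sketch. The class and method names are made up for illustration. It checks a byte array for the UTF-8 BOM bytes and shows that Java's UTF-8 decoder hands the BOM through as the character U+FEFF rather than dropping it, which is why the parser complains about position 0.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of json-simple.
final class BomCheck {

    // Returns true if the data begins with the UTF-8 encoded BOM (EF BB BF).
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '{', '}'};

        // Decoding as UTF-8 keeps the BOM as the first character of the string.
        String decoded = new String(withBom, StandardCharsets.UTF_8);

        System.out.println(startsWithUtf8Bom(withBom));      // true
        System.out.println(decoded.charAt(0) == '\uFEFF');   // true
    }
}
```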

You need to remove the byte order mark from the file in order to get this to work.

  • In Notepad++, open the file and select Encoding » UTF-8 (without BOM).
  • Alternatively, you could open the file with a hex editor and remove the first three bytes, which are 0xEF 0xBB 0xBF.
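If editing the file is not an option, the BOM can also be skipped in code. The following is only a sketch under assumptions: `BomSkippingReader` and `open` are hypothetical names, not part of json-simple, and the file is assumed to be UTF-8. It also names the charset explicitly instead of relying on the platform default that `FileReader` uses.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PushbackReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: opens a UTF-8 file and drops a leading BOM if present.
final class BomSkippingReader {

    static Reader open(Path path) throws IOException {
        PushbackReader reader = new PushbackReader(
                new InputStreamReader(Files.newInputStream(path), StandardCharsets.UTF_8), 1);

        // Peek at the first character; push it back unless it is the BOM (U+FEFF).
        int first = reader.read();
        if (first != -1 && first != 0xFEFF) {
            reader.unread(first);
        }
        return reader;
    }
}
```

Since `JSONParser.parse` accepts a `java.io.Reader`, the result of `BomSkippingReader.open(Paths.get(path))` could be passed to it in place of `new FileReader(path)`.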
MC Emperor