I'm working with another API that calls the Google Document AI API. I'm trying to read the JSON string from a file into a Document object. How should this be done?

I tried the following, but it does not work:

import com.google.cloud.documentai.v1.Document;
import java.io.FileInputStream;

Document document = Document.parseFrom(new FileInputStream("src/main/resources/responseFromAPICall.json"));
System.out.println(document.getText());

I'm getting this error:

Exception in thread "main" com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
    at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:129)
    at com.google.protobuf.CodedInputStream$StreamDecoder.checkLastTagWas(CodedInputStream.java:2124)
    at com.google.protobuf.CodedInputStream$StreamDecoder.readGroup(CodedInputStream.java:2358)

1 Answer

Today I came across this issue as well. This answer gave me the starting point for a solution.
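
As for why the original attempt fails: Document.parseFrom expects the binary protobuf wire format, not JSON text. In that format a tag byte packs the field number and the wire type as (fieldNumber << 3) | wireType. The sketch below is only an illustration of that arithmetic. It is standalone, uses no Document AI or protobuf classes, and the class name WireTagDemo is made up for this example. It decodes the first byte of a JSON file, '{', the way a binary parser would, which appears consistent with the end-group error in the stack trace.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of any Google library.
public class WireTagDemo {
    // The upper bits of a single-byte tag hold the field number.
    public static int fieldNumber(int tagByte) {
        return tagByte >>> 3;
    }

    // The lowest three bits of a tag hold the wire type.
    public static int wireType(int tagByte) {
        return tagByte & 0x7;
    }

    // Names of the wire types defined by the protobuf encoding.
    public static String wireTypeName(int wireType) {
        switch (wireType) {
            case 0: return "VARINT";
            case 1: return "FIXED64";
            case 2: return "LENGTH_DELIMITED";
            case 3: return "START_GROUP";
            case 4: return "END_GROUP";
            case 5: return "FIXED32";
            default: return "INVALID";
        }
    }

    public static void main(String[] args) {
        // Placeholder JSON; only its first byte matters here.
        byte first = "{\"text\": \"...\"}".getBytes(StandardCharsets.UTF_8)[0];
        int tag = first & 0xFF;
        System.out.println("byte 0x" + Integer.toHexString(tag)
                + " -> field " + fieldNumber(tag)
                + ", wire type " + wireTypeName(wireType(tag)));
        // prints: byte 0x7b -> field 15, wire type START_GROUP
    }
}
```

So the opening brace is read as the start of a group, and the parser then fails when the following bytes do not contain the matching end-group tag. JSON has to go through JsonFormat instead, as shown in the snippets that follow.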

If your JSON file was saved from a call to Document AI and looks like this:

{
  "document": {
    ...
    "text": "...",
    ...
  },
  "humanReviewStatus": {...}
}

you may use the following code snippet:

import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import com.google.cloud.documentai.v1.Document;
import com.google.cloud.documentai.v1.ProcessResponse;
import com.google.protobuf.util.JsonFormat;

Path filePath = Paths.get("src/main/resources/responseFromAPICall.json");
ProcessResponse.Builder responseBuilder = ProcessResponse.newBuilder();
// Parse the JSON text into the builder; try-with-resources closes the reader.
try (BufferedReader reader = Files.newBufferedReader(filePath)) {
    JsonFormat.parser().merge(reader, responseBuilder);
}
// The Document is nested under the "document" key of the response.
Document document = responseBuilder.getDocument();
System.out.println(document.getText());

If your JSON file contains only the "document" object:

{
  ...
  "text": "...",
  ...
}

This code will do the trick:

import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import com.google.cloud.documentai.v1.Document;
import com.google.protobuf.util.JsonFormat;

Path filePath = Paths.get("src/main/resources/responseFromAPICall.json");
Document.Builder docBuilder = Document.newBuilder();
// Parse the JSON text directly into a Document builder; the reader is closed automatically.
try (BufferedReader reader = Files.newBufferedReader(filePath)) {
    JsonFormat.parser().merge(reader, docBuilder);
}
System.out.println(docBuilder.getText());