I am trying to extract named entities from German text using spaCy's NER. I have exposed the service as a REST POST endpoint that takes the source text as input and returns a dictionary (map) of lists of named entities (persons, locations, organizations). The service is exposed using Flask-RESTPlus and hosted on a Linux server.
For a sample text, I get the following response when I send a POST request to the REST API through Swagger UI:
{
  "ner_locations": [
    "Deutschland",
    "Niederlanden"
  ],
  "ner_organizations": [
    "Miele & Cie. KG",
    "Bayer CropScience AG"
  ],
  "ner_persons": [
    "Sebastian Krause",
    "Alex Schröder"
  ]
}
When I use Spring's RestTemplate to send a POST request to the API hosted on the Linux server from a Spring Boot application (running in Eclipse on Windows), the JSON is parsed correctly. I have added the following line to use UTF-8 encoding:
restTemplate.getMessageConverters().add(0, new StringHttpMessageConverter(Charset.forName("UTF-8")));
But when I deploy this Spring Boot application on a Linux machine and send a POST request to the API for NER tagging, the ner_persons are not parsed correctly. While remotely debugging, I get the following response:
{
  "ner_locations": [
    "Deutschland",
    "Niederlanden"
  ],
  "ner_organizations": [
    "Miele & Cie. KG",
    "Bayer CropScience AG"
  ],
  "ner_persons": [
    "Sebastian ",
    "Krause",
    "Alex ",
    "Schröder"
  ]
}
I do not understand why this strange behavior occurs for persons but not for organizations.
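Since the only thing I changed between the two runs is the machine the Spring Boot client runs on, one thing I can check is whether the source text reaches the API as the same bytes in both cases. Below is a minimal sketch of such a check using only the JDK. I have not confirmed that the charset is the cause; the class name CharsetCheck and its helper methods are hypothetical and not part of my application. It prints the platform default charset and shows how a name with an umlaut looks when its bytes are encoded and decoded with matching versus mismatched charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper, not part of the application above.
class CharsetCheck {

    // Encode the text with one charset and decode the bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    // Hex dump of the encoded bytes, to compare payloads across machines.
    static String toHex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            // Mask to 0..255 so bytes above 0x7F are not printed as negatives.
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // This may differ between the Windows (Eclipse) and Linux deployments.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // \u00F6 is the letter o with umlaut, written as an escape so the
        // source file encoding does not matter.
        String name = "Alex Schr\u00F6der";

        System.out.println("UTF-8 bytes:      " + toHex(name, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 bytes: " + toHex(name, StandardCharsets.ISO_8859_1));

        // Matching charsets preserve the text; mismatched ones garble the umlaut.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.UTF_8));
        System.out.println(roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
    }
}
```

Running this on both machines and comparing the first line of output would show whether the default charset differs. If it does, the request body may be encoded differently on Linux even though the response converter is set to UTF-8, which could change what the NER model receives.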