9

I've searched thoroughly on Stack Overflow but couldn't find an answer to this problem. I'm using the Google Translate API (googletrans 2.2.0) with Python 3.6.2 to translate a set of non-English documents into English, and I am letting Google Translate do the language detection. Here is my code:

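from time import sleep
from googletrans import Translator

translator = Translator()

## newcorpus is a corpus I have created consisting of non-English documents
fileids = newcorpus.fileids()  # assuming an NLTK-style reader, where fileids is a method
for f in fileids:
    p = newcorpus.raw(f)
    p = str(p[:15000])  # cap each request at 15k characters
    translated_text = translator.translate(p)  # source language is auto-detected
    print(translated_text)
    sleep(10)  # throttle: wait 10 seconds between calls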

I am throttling my call to the API by waiting 10 seconds every time. I am also only feeding the API 15k characters at a time to remain within the character limit.

Every time I run this code I get the following error message:

json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Can anybody help?

Jongware
Roald Schuring
  • Two errors here: 1. If `p` is a dict, `str(p)` will give an invalid JSON string. 2. If `p` is a raw JSON string, `p[:15000]` will truncate it and make it invalid JSON. – Arount Dec 29 '17 at 10:58
  • Thanks Arount. Strangely I get the exact same JSON Decoder error when I remove the line "p = str(p[:15000])"... any other ideas as to what might be going on? – Roald Schuring Dec 29 '17 at 11:04

2 Answers

4

I think I may have found an answer to my own question. If I reduce the number of characters I feed to the API at once to 5k, everything seems to work fine. Strange since the Googletrans documentation says that the limit is 15k... Ah well. I will have to batch the request.
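A minimal sketch of that batching, assuming the 5k-character limit observed above holds. The helper names `split_into_chunks` and `translate_in_chunks` are hypothetical and not part of googletrans; the translation call is passed in as a plain function so the splitting logic can be tried without network access.

```python
def split_into_chunks(text, limit=5000):
    """Split text into pieces of at most `limit` characters,
    breaking at whitespace where possible so words stay whole."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + limit, len(text))
        if end < len(text):
            # Find the last space or newline inside the current window.
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1  # keep the whitespace with the left-hand chunk
            # Otherwise there is no usable whitespace: fall back to a hard cut.
        chunks.append(text[start:end])
        start = end
    return chunks


def translate_in_chunks(text, translate_fn, limit=5000):
    """Translate each chunk with `translate_fn` (hypothetical callable that
    takes a string and returns a string) and join the results with spaces."""
    pieces = (translate_fn(chunk).strip() for chunk in split_into_chunks(text, limit))
    return " ".join(pieces)
```

With googletrans this could be called as `translate_in_chunks(p, lambda c: translator.translate(c).text)`, adding a `sleep` inside the lambda if throttling is still needed. Splitting at whitespace can still cut a sentence in two, so translation quality near chunk boundaries may suffer.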

Roald Schuring
3

You have to stop using googletrans until they fix it, and use the translate package instead:

https://pypi.org/project/translate/

mounirboulwafa