I'm looking for a way to speed up the following code, which calls Google's Natural Language API:
import json
import timeit

from google.cloud import language
from google.cloud.language import enums, types

tweets = json.load(input)
client = language.LanguageServiceClient()

sentiment_tweets = []
iterations = 1000

start = timeit.default_timer()
for i, text in enumerate(d['text'] for d in tweets):
    document = types.Document(
        content=text,
        type=enums.Document.Type.PLAIN_TEXT)
    # one synchronous API call per tweet
    sentiment = client.analyze_sentiment(document=document).document_sentiment
    results = {'text': text, 'sentiment': sentiment.score, 'magnitude': sentiment.magnitude}
    sentiment_tweets.append(results)
    if (i % iterations) == 0:
        print(i, " tweets processed")
sentiment_tweets_json = [json.dumps(sentiments) for sentiments in sentiment_tweets]
stop = timeit.default_timer()
The problem is that the tweets list holds around 100k entries, and making the calls one by one does not finish on a feasible timescale. I'm considering asyncio for concurrent calls, but as a Python beginner who is unfamiliar with the package, I'm not sure whether a function can be run as several coroutine instances of itself such that each instance works through the list as expected and progress stays sequential. There is also the question of keeping the total number of calls the app makes within the API's quota limits. A sketch of what I'm imagining is below; I just wanted to know whether I'm heading in the right direction.
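For concreteness, this is roughly the shape I had in mind (untested; wrapping the blocking client call with run_in_executor and the semaphore limit of 50 are my own assumptions, not anything I've verified against the library or the quota documentation):

import asyncio
import json

from google.cloud import language
from google.cloud.language import enums, types

async def analyze(text, client, semaphore, loop):
    # Cap the number of in-flight requests so the app stays under the API quota
    # (the limit of 50 used here is a guess, not a documented value).
    async with semaphore:
        document = types.Document(
            content=text,
            type=enums.Document.Type.PLAIN_TEXT)
        # analyze_sentiment is a blocking call, so run it in the default thread
        # pool rather than awaiting it directly.
        sentiment = await loop.run_in_executor(
            None,
            lambda: client.analyze_sentiment(document=document).document_sentiment)
        return {'text': text, 'sentiment': sentiment.score, 'magnitude': sentiment.magnitude}

async def analyze_all(tweets):
    client = language.LanguageServiceClient()
    loop = asyncio.get_event_loop()
    semaphore = asyncio.Semaphore(50)
    # One task per tweet; gather preserves the input order in its results.
    tasks = [analyze(d['text'], client, semaphore, loop) for d in tweets]
    return await asyncio.gather(*tasks)

tweets = json.load(input)
sentiment_tweets = asyncio.get_event_loop().run_until_complete(analyze_all(tweets))

My main uncertainty is whether pushing the blocking calls into threads like this is the sensible approach, or whether the client library offers a better way to batch or parallelize these requests.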