I have a CSV file in which I have stored tweets from Twitter. Some of them are not in English, so I am using the AWS Translate service for those.
I load the CSV into a DataFrame and try to create a new column for the translated tweet text, but for some tweets the call raises an error because of low confidence in language detection, and the code stops there.
I want to skip the texts that generate this error and continue with the rest of the code.
The following error shows up:
DetectedLanguageLowConfidenceException: An error occurred (DetectedLanguageLowConfidenceException) when calling the TranslateText operation: Translate request rejected due to low confidence of auto detected source language 'fr'. Specify a valid source language code to force translation.
Here is the code I am using. jap.csv is the CSV where my tweets are stored; from it I create a DataFrame named translated. 'text' is the column that contains the tweet text, and Translated_text is the new column where I store the translated text.
import boto3
import aws_credentials
import pandas as pd

translate = boto3.client('translate',
                         aws_access_key_id=aws_credentials.key_id,
                         aws_secret_access_key=aws_credentials.secret_key,
                         region_name='us-west-2')

translated = pd.read_csv('jap.csv')
translated['Translated_text'] = translated['text']
translated['Orginal_text_lang'] = 'en'

for i, row in translated.iterrows():
    result = translate.translate_text(Text=row['text'],
                                      SourceLanguageCode='auto',
                                      TargetLanguageCode='en')
    T_text = result.get('TranslatedText')
    So_lg = result.get('SourceLanguageCode')
    translated.at[i, 'Translated_text'] = T_text
    translated.at[i, 'Orginal_text_lang'] = So_lg

translated.to_csv('translated.csv')
I want to skip every text that produces this error, so that the code runs to the end and writes the output CSV with the translated text.
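For reference, the behaviour I am after could probably be sketched like this: wrap each call in try/except and catch only the low-confidence exception, which boto3 exposes on the client as client.exceptions.DetectedLanguageLowConfidenceException. The helper names translate_or_skip and translate_all are hypothetical, and keeping the original text for skipped rows is just my assumption about what "skip" should mean here.

```python
def translate_or_skip(client, text, target_lang="en"):
    """Return (translated_text, source_lang), or (None, None) when
    language detection confidence is too low (hypothetical helper)."""
    try:
        result = client.translate_text(
            Text=text,
            SourceLanguageCode="auto",
            TargetLanguageCode=target_lang,
        )
    except client.exceptions.DetectedLanguageLowConfidenceException:
        # Only this error is swallowed; anything else still propagates.
        return None, None
    return result["TranslatedText"], result["SourceLanguageCode"]


def translate_all(client, texts, target_lang="en"):
    """Translate every text; skipped rows keep their original text
    and get None as the detected language (an assumption)."""
    translated_texts, languages = [], []
    for text in texts:
        new_text, lang = translate_or_skip(client, text, target_lang)
        translated_texts.append(text if new_text is None else new_text)
        languages.append(lang)
    return translated_texts, languages


# Intended use with the objects from my code above (not run here):
# translated['Translated_text'], translated['Orginal_text_lang'] = \
#     translate_all(translate, translated['text'])
# translated.to_csv('translated.csv')
```

Catching only this one exception type means other failures (throttling, bad credentials, empty text) would still stop the script instead of being silently ignored.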