I have written a program that returns the detected language when my input is hardcoded. I want the same result while iterating over the cells of my CSV file, printing the corresponding language in the next column.

I wrote code that detects the language when the input is hardcoded. I now have an Excel sheet with some IDs and text in different languages. I want my program to read the sheet cell by cell and print the result in the neighbouring column.

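from textblob import TextBlob
import pycountry

# Detect the language of a hardcoded string (this call needs network access).
b = TextBlob("Si esta yayo si esta yayo alla voy ")
iso_code = b.detect_language()  # e.g. "es"
# Look up the full language name for the two-letter ISO 639-1 code.
language = pycountry.languages.get(alpha_2=iso_code)
print(language.name)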

This is the Excel sheet I want the program to iterate over:

id  lyric language 
1   Hello how are you 
2   Wie geht es dir
3   cómo estás
4   நீங்கள் எப்படி இருக்கிறீர்கள்
5   Comment vas-tu

How can I modify my code so that I get the expected results?

Expected:

id  lyric language      Detected Language
1   Hello how are you         English
2   Wie geht es dir           German
3   cómo estás                Spanish
4   நீங்கள் எப்படி இருக்கிறீர்கள்     Tamil
5   Comment vas-tu            French
terry

1 Answer

You didn't show how you want the Excel file to be read. Depending on the library, there are different ways to read it. But let's say you use pandas:

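import pandas as pd
from textblob import TextBlob
import pycountry

def country(textstring):
    """Return the name of the language detected in textstring."""
    b = TextBlob(textstring)
    iso_code = b.detect_language()  # two-letter code, e.g. "es"
    language = pycountry.languages.get(alpha_2=iso_code)
    # Fall back to the raw code if pycountry returns no entry for it.
    return language.name if language is not None else iso_code

# Load the sheet; the column name must match the header row exactly.
df = pd.read_excel("myexcel.xlsx")
# Run the detection on every cell of the text column.
df["Detected Language"] = df["lyric language"].apply(country)
print(df.to_string())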

This approach uses pandas' Series.apply() method to detect the language of every cell in the "lyric language" column and assigns the result to a new column.
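Since the question also mentions a CSV file, here is a minimal sketch of the same per-row idea using only the standard library's csv module. The helper name `add_language_column`, its parameters, and the file names in the usage note are assumptions made up for illustration; `detect` is any function that maps a text to a language name, such as the `country` function above.

```python
import csv
from typing import Callable


def add_language_column(
    src_path: str,
    dst_path: str,
    detect: Callable[[str], str],
    text_column: str = "lyric language",
    new_column: str = "Detected Language",
) -> None:
    """Copy src_path to dst_path, appending detect(text) as a new column."""
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        # Keep the original columns and append the new one at the end.
        fieldnames = list(reader.fieldnames or []) + [new_column]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            # One detection call per row, like Series.apply() does per cell.
            row[new_column] = detect(row[text_column])
            writer.writerow(row)
```

A call might look like `add_language_column("lyrics.csv", "lyrics_detected.csv", country)`. Writing to a second file rather than overwriting the input keeps the original data intact if detection fails halfway through.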

adrtam
  • Thanks! This is great. Is it possible to write the new column and the data to the same excel sheet? – terry Feb 13 '19 at 20:59
  • Again, *if you're using pandas*, simply call `df.to_excel("myexcel.xlsx")`, but note that this overwrites your old sheet. Otherwise you need a different way to open the existing file and manipulate it instead of overwriting it. – adrtam Feb 13 '19 at 21:03
  • Got it! Thanks for your help – terry Feb 13 '19 at 21:07
  • If I use this script for thousands of cells in Excel, it gives me an error: `http_error_default raise HTTPError(req.full_url, code, msg, hdrs, fp) urllib.error.HTTPError: HTTP Error 503: Service Unavailable`. Is there any alternative to textblob? I read that the Google Translate service used by textblob is no longer free. Also, since my IP is now blocked, I can't run this program at all. Is there any solution for this? – terry Feb 13 '19 at 22:49
  • Not familiar with that. But see if this answer helps: https://stackoverflow.com/questions/3182268/nltk-and-language-detection – adrtam Feb 13 '19 at 23:47
  • I'm also seeing HTTPError: HTTP Error 400: Bad Request with a file over 1k rows. Has anyone else encountered this or found a solution for it? – 10VA Feb 02 '23 at 16:29
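
On the HTTP errors reported in the comments: if they come from sending too many requests too quickly, one thing to try is to cache repeated texts and retry with increasing pauses. The following is only a sketch under that assumption; `make_throttled` and its parameters are hypothetical names, and it will not help if the IP is blocked outright or the service no longer answers at all, in which case an offline detector is the only real fix.

```python
import time
import urllib.error
from typing import Callable, Dict


def make_throttled(
    detect: Callable[[str], str],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Callable[[str], str]:
    """Wrap detect with a cache and exponential backoff on HTTPError."""
    cache: Dict[str, str] = {}

    def wrapped(text: str) -> str:
        # Identical cells are looked up once, which reduces request volume.
        if text in cache:
            return cache[text]
        for attempt in range(retries + 1):
            try:
                result = detect(text)
                break
            except urllib.error.HTTPError:
                if attempt == retries:
                    raise  # give up after the last allowed retry
                # Wait base_delay, then 2x, 4x, ... before trying again.
                sleep(base_delay * 2 ** attempt)
        cache[text] = result
        return result

    return wrapped
```

With the answer's code this would be used as `df["lyric language"].apply(make_throttled(country))`. The `sleep` parameter exists only so the waiting can be replaced in tests.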