Building on 13f23f3f's answer.
I think googletrans (free) and the language dictionary are a good start.
To start off, the dictionary langCodes can be expanded to support multiple languages from one country. You can't define both "Switzerland":"French" and "Switzerland":"German", because dictionary keys are unique and the second entry would overwrite the first. Instead, you can map each country to a list of all its possible languages. For example:
langCodes = {
    "United States":["en"],
    "France":["fr"],
    "Romania":["ro"],
    "Switzerland":["de", "fr"]
}
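To see why the duplicate-key version does not work, here is a small sketch (the names `broken` and `fixed` are only illustrative and are not part of the code below). Python silently keeps the last value for a repeated key, whereas a list per country keeps every language:

```python
# A dict literal with a repeated key keeps only the last value.
broken = {"Switzerland": "French", "Switzerland": "German"}
print(broken)

# Storing a list per country keeps every language.
fixed = {"Switzerland": ["de", "fr"], "France": ["fr"]}
for lang in fixed["Switzerland"]:
    print(lang)
```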
With langCodes defined as a dictionary that maps each country to a list of its possible languages, here is a full example using pandas and googletrans:
import pandas as pd
from googletrans import Translator
translator = Translator() #init translator
# proxies: uncomment to use proxies (for a large number of translations)
# if these do not work you might need to find other HTTP proxies online
#proxiesArray = [{'http':"134.122.19.151:3128"},{'http':"68.183.115.230:8080"},{'http':'104.129.196.153:10605'},{'http':'35.230.21.108:80'}]
#word list
l=['spring','summer','fall','winter']
#langCodes: more can be added by looking at the googletrans documentation
langCodes = {
    "United States":["en"],
    "France":["fr"],
    "China":["zh-cn"],
    "Japan":["ja"],
    "Turkey":["tr"],
    "Romania":["ro"],
    "Switzerland":["de", "fr"]
}
#df1: names and countries using pandas
df1 = pd.DataFrame([["Tom","United States"],
                    ["Sam","France"],
                    ["Tim","China"],
                    ["Andrew","Japan"],
                    ["Bess","Turkey"],
                    ["Sara","Romania"],
                    ["Jeff","Switzerland"]],
                   columns=["Name","Country"])
#collect the output rows in a list (DataFrame.append was removed in pandas 2.0)
rows = []
#iterate through the rows of df1
for idx, row in df1.iterrows():
    #iterate through the possible languages
    for lang in langCodes[row['Country']]:
        #iterate through the possible words
        for word in l:
            #translate the word using googletrans
            getTrans = translator.translate(word, dest=lang).text
            #to use proxies, comment out the line above and uncomment the two lines below
            #proxyIdex = idx % len(proxiesArray)
            #getTrans = translator.translate(word, proxy = proxiesArray[proxyIdex],dest=lang).text
            #append output to the list of rows
            rows.append({"New Column":row['Name']+" "+getTrans})
#df2: build the output DataFrame from the collected rows
df2 = pd.DataFrame(rows, columns=["New Column"])
print(df2)
Sample output:
New Column
0 Tom spring
1 Tom summer
2 Tom fall
3 Tom winter
4 Sam printemps
5 Sam été
6 Sam tomber
7 Sam hiver
8 Tim 弹簧
9 Tim 夏季
10 Tim 秋季
11 Tim 冬季
12 Andrew 春
13 Andrew 夏
14 Andrew 秋
15 Andrew 冬
16 Bess bahar
17 Bess yaz
18 Bess sonbahar
19 Bess kış
20 Sara primăvară
21 Sara vară
22 Sara toamna
23 Sara iarnă
24 Jeff Frühling
25 Jeff Sommer-
26 Jeff fall
27 Jeff winter
28 Jeff printemps
29 Jeff été
30 Jeff tomber
31 Jeff hiver
As you can see, "Jeff" has both German and French responses. Additionally, if the input list is very large, you can consider installing hyper, as it can speed up translation according to the googletrans documentation.
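If you want to check the loop logic without calling the translation service, one option is to swap in a stand-in translator. The sketch below rests on assumptions: `fake_translate` and `build_rows` are hypothetical helpers (not part of googletrans), the fake function only tags the word with the language code, and plain tuples stand in for the DataFrame rows.

```python
def fake_translate(word, dest):
    # Hypothetical stand-in for translator.translate(word, dest=dest).text
    return f"{word}[{dest}]"

def build_rows(people, lang_codes, words, translate):
    """Return one 'Name translation' string per (person, language, word)."""
    rows = []
    for name, country in people:
        for lang in lang_codes[country]:
            for word in words:
                rows.append(name + " " + translate(word, lang))
    return rows

lang_codes = {"France": ["fr"], "Switzerland": ["de", "fr"]}
people = [("Sam", "France"), ("Jeff", "Switzerland")]
result = build_rows(people, lang_codes, ["spring", "summer"], fake_translate)
for line in result:
    print(line)
```

Each person produces (number of languages) x (number of words) rows, so a country with two languages doubles that person's share of the output.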
Update
I added proxies to the answer for large translations. To use them, uncomment the relevant lines in the code. Beware that using proxies slows down the translations, but when I tested it with 6 words in l and 1500 entries in df1, it completed without error. Adding more proxies to proxiesArray should increase the translation capacity.
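The proxy rotation in the commented-out lines is a simple round-robin: the row index modulo the number of proxies picks which proxy to use. Here is a minimal sketch of just that selection step, with no network calls. The `pick_proxy` helper and the `proxy-*.example` addresses are made up for illustration.

```python
def pick_proxy(idx, proxies):
    # Round-robin: row 0 -> proxy 0, row 1 -> proxy 1, ..., wrapping around
    return proxies[idx % len(proxies)]

# Placeholder addresses, not real proxies
proxies = [
    {"http": "proxy-a.example:3128"},
    {"http": "proxy-b.example:8080"},
    {"http": "proxy-c.example:80"},
]
picks = [pick_proxy(i, proxies)["http"] for i in range(5)]
print(picks)
```

Because the rows are spread evenly, 1500 rows over 4 proxies would give each proxy the requests for about 375 rows.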