
I'm trying to translate the contents of a dataframe. With the following code, I manage to translate one of the many rows that should be translated:

from deep_translator import GoogleTranslator

data_en = data.copy(deep=True)

data_en.gender = data_en.gender.apply(lambda x: GoogleTranslator(source='auto', target='en').translate(x))

data_en.head()

Although I could just repeat the code adding the other rows manually, I think a loop would save a lot of time in this process.

I have tried leaving the column name out of the code, hoping it would translate the whole dataframe, but that doesn't work.

How can I use a loop to translate the entire DataFrame?

Karl Knechtel
  • Maybe create `GoogleTranslator(source='auto', target='en')` only once and then reuse the same instance in `apply`. Also, a plain loop may be much slower. – furas Mar 08 '22 at 00:02
  • Do you really have to make a `copy()`? It may be faster to keep the current `data` and assign the results to a new column: `data["EN"] = data.gender.apply(...)` – furas Mar 08 '22 at 00:05
  • Do you want to translate "many rows" or "many columns"? – furas Mar 08 '22 at 00:10
  • If you want to convert all columns, then `data_en = data_en.apply(...)` works. If you want to convert only some columns, use `for col in ["gender", "other"]: data_en[col] = data_en[col].apply(...)`, or `data_en[ ["gender", "other"] ] = data_en[ ["gender", "other"] ].apply(...)` – furas Mar 08 '22 at 00:14
  • Welcome back to Stack Overflow. As a refresher, please read [ask] and note that this is *not a discussion forum*. – Karl Knechtel Mar 08 '22 at 00:30

1 Answer


First, I would create the GoogleTranslator() instance only once.

You also don't need lambda x: when translate takes a single value; you can pass the method itself to apply.

gt = GoogleTranslator(source='auto', target='en')

data_en.gender = data_en.gender.apply(gt.translate)
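A column such as gender usually holds only a few distinct values, so it may be worth translating each distinct value once and mapping the results back. The sketch below is only an illustration under assumptions: `FakeTranslator` and `translate_unique` are hypothetical names that are not part of deep_translator, and the fake class stands in for `GoogleTranslator` so the example runs offline.

```python
import pandas as pd


class FakeTranslator:
    """Offline stand-in for GoogleTranslator (hypothetical, for illustration only)."""

    WORDS = {"kobieta": "woman", "mężczyzna": "man"}

    def __init__(self):
        self.calls = 0  # counts how many times translate() was invoked

    def translate(self, text):
        self.calls += 1
        return self.WORDS.get(text, text)


def translate_unique(series, translate):
    """Translate each distinct non-null value once, then map the results back."""
    mapping = {value: translate(value) for value in series.dropna().unique()}
    return series.map(mapping)


gt = FakeTranslator()
gender = pd.Series(["kobieta", "mężczyzna", "kobieta", "kobieta"])
gender_en = translate_unique(gender, gt.translate)

print(gender_en.tolist())
print(gt.calls)  # number of translate() calls actually made
```

With a real translator you would pass `gt.translate` from the `GoogleTranslator` instance instead; whether this saves time depends on how many repeated values the column has.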

If you want to translate several columns, you can use a for-loop:

for col in ["gender", "other"]:
    data_en[ col ] = data_en[ col ].apply(gt.translate)

Or you can use applymap (pandas 2.1 deprecates applymap in favor of DataFrame.map, which does the same job):

data_en[ ["gender", "other"] ] = data_en[ ["gender", "other"] ].applymap(gt.translate)

which also works for all cells:

data_en = data_en.applymap(gt.translate)
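Applying the translator to every cell passes numeric cells to it as well, and the translator may not accept those. If the dataframe mixes text and numbers, one option is to pick the text columns automatically instead of listing them by hand. This is a minimal sketch, assuming the text columns can be detected with `pandas.api.types.is_string_dtype`; `fake_translate` is a hypothetical offline stand-in for `gt.translate`.

```python
import pandas as pd
from pandas.api.types import is_string_dtype


def fake_translate(text):
    """Hypothetical offline stand-in for gt.translate."""
    words = {"mężczyzna": "man", "kobieta": "woman", "pies": "dog", "kot": "cat"}
    return words.get(text, text)


df = pd.DataFrame({
    "Gender": ["mężczyzna", "kobieta"],
    "Other": ["pies", "kot"],
    "Number": [1, 2],
})

df_en = df.copy()

# Collect only the columns that hold text; numeric columns are left alone.
text_cols = [col for col in df_en.columns if is_string_dtype(df_en[col])]

for col in text_cols:
    df_en[col] = df_en[col].apply(fake_translate)

print(text_cols)
print(df_en)
```

Replacing `fake_translate` with `gt.translate` should give the real translations, and the loop then covers however many text columns the dataframe has.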

If you need to do something more complex with each row, you can use apply with axis=1.

(This example could be done more simply, but I have no better one.)

def convert(row):
    row['gender'] = gt.translate(row['gender'])
    # first char from gender
    row['other'] = row['gender'][0]
    return row

data_en = data_en.apply(convert, axis=1)

Minimal working example

import pandas as pd
from deep_translator import GoogleTranslator

df = pd.DataFrame({
  'Gender': ['mężczyzna', 'kobieta'],
  'Other' : ['pies', 'kot'],
#  'Number': [1, 2],
})

print(df)

gt = GoogleTranslator(source='pl', target='en')

# -----------------

df_en = df.copy()

for col in ['Gender', 'Other']:
    df_en[col] = df_en[col].apply(gt.translate)

print(df_en)

# -----------------

df_en = df.copy()

df_en[ ['Gender', 'Other'] ] = df_en[ ['Gender', 'Other'] ].applymap(gt.translate)

print(df_en)

# -----------------

df_en = df.copy()

df_en = df_en.applymap(gt.translate)

print(df_en)

# -----------------

df_en = df.copy()

def convert(row):
    row['Gender'] = gt.translate(row['Gender'])
    # first character of the translated gender
    row['Other'] = row['Gender'][0]
    return row

df_en = df_en.apply(convert, axis=1)

print(df_en)
furas
  • Thank you, it helped a lot! But since I have many columns, how could I efficiently translate all of them without having to write every single one of them into the dataframe? – marcusaureliusantonius Mar 08 '22 at 11:58
  • Where do you have these columns? It may be simpler to put everything in a dataframe and run a single `applymap` than to write your own code for all of this, especially since some parts of pandas are written in C/C++. But you can always try a `for`-loop with `open()` and `read()` to read from a file. I don't know which version will be more efficient. You would have to write the code and run it to check the speed. – furas Mar 08 '22 at 14:04
  • Oh, thank you! I have managed to do it! – marcusaureliusantonius Mar 11 '22 at 09:14