
Thank you for looking at my question, and happy new year!

My question: I have a data frame column containing a list of names like this (some of them repeat):

Name (German)
Josef
Georg
Mathilde
Josef
Ludwig
Lorenz
Georg
... 

I want to convert these names to their English counterparts. The corresponding German and English names are stored in another reference file/data frame like this:

Name (German)    Name (English)
Mathilde         Mathilda
Georg            George
Lorenz           Lawrence
Josef            Joseph
Ludwig           Lewis
...  

In the end, my data frame should have a new column and look like this:

Name (German)    Name (English)
Josef            Joseph
Georg            George
Mathilde         Mathilda
Josef            Joseph
Ludwig           Lewis
Lorenz           Lawrence
Georg            George
... 

If anyone knows how to accomplish this, I would be very grateful if you could help me figure out how to do it. In any case, thank you for any help!

With best regards!

Phil

1 Answer

First, build a lookup table (dictionary), here with read.table():

dict <- read.table(header=TRUE, text='
German   English
Mathilde         Mathilda
Georg            George
Lorenz           Lawrence
Josef            Joseph
Ludwig           Lewis
')

Then use match() to find each name's row in the dictionary:

transform(dat, English=dict[match(dat$Name, dict$German), ]$English)
#       Name  English
# 1    Josef   Joseph
# 2    Georg   George
# 3 Mathilde Mathilda
# 4    Josef   Joseph
# 5   Ludwig    Lewis
# 6   Lorenz Lawrence
# 7    Georg   George
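
To make the lookup step more explicit, here is a minimal sketch of what match() returns and what happens when a name has no entry in the dictionary. The object names `lookup`, `names_df`, and `English_filled`, as well as the unmatched name "Anna", are made up for illustration and are not part of the question's data.

```r
# Hypothetical data: "Anna" has no entry in the lookup table.
lookup <- data.frame(
  German  = c("Mathilde", "Georg", "Lorenz", "Josef", "Ludwig"),
  English = c("Mathilda", "George", "Lawrence", "Joseph", "Lewis"),
  stringsAsFactors = FALSE
)
names_df <- data.frame(Name = c("Josef", "Anna", "Georg"),
                       stringsAsFactors = FALSE)

# match() returns, for each name, its position in lookup$German (NA if absent).
idx <- match(names_df$Name, lookup$German)

# Index the English column directly; unmatched names become NA.
names_df$English <- lookup$English[idx]

# Optional fallback: keep the original name where no translation exists.
names_df$English_filled <- ifelse(is.na(idx), names_df$Name, names_df$English)
```

Indexing `lookup$English[idx]` is equivalent to the `dict[match(...), ]$English` form above. Whether to keep NA or fall back to the original name depends on how missing translations should be treated.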

Data:

dat <- structure(list(Name = c("Josef", "Georg", "Mathilde", "Josef", 
"Ludwig", "Lorenz", "Georg")), class = "data.frame", row.names = c(NA, 
-7L))
jay.sf
  • @DionGroothof Leave that one to a ggplot user, I'm using low level plotting. But you could [accept this answer](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work/5235#5235). – jay.sf Jan 24 '22 at 16:05
  • Thank you for this and your fast help! Would this system also work for directly loaded dataframes? I have stored my list of names in two xlsx files, and loaded them as dataframes directly into R, so I did not create my files like you did e.g. with the structure(list()) etc commands. One xlsx file just contains my names (first column one language, second column translation; overall several thousand names) and the file I want to convert has tens of thousands of names in a single column. So typing out every single name like in your example would make my script somewhat inflated. – LawrenceThundersthorp Jan 24 '22 at 16:24
  • @LawrenceThundersthorp Yes should work, just replace `dat` with the name of your original data and `dict` with the name of your loaded dictionary data. The `structure` thing is the output from `dput()` which is the way we provide data in the R tag, read our [tutorial](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – jay.sf Jan 24 '22 at 16:27
  • Again, thank you very much! The script seems to be running, but my dataframe is still somehow not changed. – LawrenceThundersthorp Jan 24 '22 at 17:00
  • @LawrenceThundersthorp did you assign the output to your data frame? Means: `dat <- transform(.)` – jay.sf Jan 24 '22 at 17:01
  • Thanks! My problem was not with your script, but with a part of my data. I have adjusted it now and everything works. Thank you for your help! – LawrenceThundersthorp Jan 24 '22 at 17:30
  • @LawrenceThundersthorp Great, well done, happy to help! – jay.sf Jan 24 '22 at 17:32
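
For the case discussed in the comments (data loaded from xlsx files rather than typed in), the following is a rough sketch only. It assumes the loaded data frames keep the headers "Name (German)" and "Name (English)" from the question. The file names in the commented lines are hypothetical, and the small stand-in data frames are there only so the example runs without the files.

```r
# In practice the two data frames would come from the xlsx files, e.g. with the
# readxl package (file names here are hypothetical):
#   dict <- readxl::read_excel("names_dictionary.xlsx")
#   dat  <- readxl::read_excel("names_to_convert.xlsx")
# Small stand-ins with the same (assumed) column headers:
dict <- data.frame(
  "Name (German)"  = c("Mathilde", "Georg", "Josef"),
  "Name (English)" = c("Mathilda", "George", "Joseph"),
  check.names = FALSE, stringsAsFactors = FALSE
)
dat <- data.frame(
  "Name (German)" = c("Josef", "Georg", "Mathilde", "Josef"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Headers containing spaces or parentheses are easiest to address with [[ ]].
idx <- match(dat[["Name (German)"]], dict[["Name (German)"]])

# Assign the result back to the data frame; otherwise `dat` stays unchanged.
dat[["Name (English)"]] <- dict[["Name (English)"]][idx]
```

The last line covers the point raised above: the new column only appears in `dat` once the result is assigned back to it.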