
Does anyone know how to change the language codes that textcat returns as output? My real-world output looks something like this:

(screenshot of the output)

Reproducible example:

library(textcat)
library(tidyverse)

df <- data.frame(
  text = c("Das ist deutsch","This is english", "C'est francais")
)

df <- df |>
  mutate(
    lang_textcat = textcat(text)
  )

## desired ISO 639-1 codes
# german == de
# english == en
# french == fr

df <- df |>
  mutate(
    lang_iso = c("de","en","fr")
  )

What I get from textcat is shown in the column lang_textcat, but what I want is the output shown in the column lang_iso. Is there an option to switch the output to ISO 639-1 codes? I could recode it manually, but it would be great if there were a built-in option.
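For reference, the manual recoding I would like to avoid could look roughly like the sketch below. It assumes that textcat labels are lowercase English language names such as "german"; the names `iso_lookup` and `to_iso639_1` are made up for illustration and are not part of textcat, and the table covers only the three languages from my example.

```r
# Hypothetical lookup table: textcat-style language names -> ISO 639-1 codes.
# Only the three languages from the example are covered; extend as needed.
iso_lookup <- c(german = "de", english = "en", french = "fr")

# Hypothetical helper: maps labels to codes, returning NA for any label
# that is missing from the lookup table.
to_iso639_1 <- function(lang) unname(iso_lookup[lang])

to_iso639_1(c("german", "english", "french"))
# "de" "en" "fr"
```

In the pipeline above this would be used as `mutate(lang_iso = to_iso639_1(lang_textcat))`, but the table has to be maintained by hand for every language textcat can return.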

textcat package: https://cran.r-project.org/web/packages/textcat/textcat.pdf

Thanks!

  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Aug 15 '23 at 16:03
