Questions tagged [cld2]

6 questions
3 votes, 1 answer

Split string into segments according to the alphabet

I want to split a given string into segments according to the alphabets it contains. For example, given the following string: Los eventos automovilísticos comenzaron poco después de la construcción exitosa de los primeros automóviles a…
Sirojiddin Komolov
3 votes, 0 answers

Converting Language Detection Score of CLD2 to CLD3 Accuracy

For the input sentence to classify, my cld2 language detection model (langID) returns the following values: { reliable: true, textBytes: 181, languages: [ { name: 'ITALIAN', code: 'it', percent: 61, score: 774 }, { name: 'ENGLISH', code:…
loretoparisi
2 votes, 0 answers

Language detection using pycld2

I am trying to use the pycld2 package to detect multiple languages in text. This is the example I am testing out: import pycld2 as cld2 text = '''The universal connection with an additional advantage: Push-in connection. Terminate solid and…
2 votes, 0 answers

Is there a way in Polyglot to permanently "fix" the language code of a Hebrew text from "iw" to "he"?

I want to run a simple sentiment analysis on a Hebrew text using Polyglot in Python 3.6. The problem is that Polyglot recognizes the text's language code as "iw" rather than "he", and is therefore unable to process it. As shown at: use polyglot…
yotam
0 votes, 0 answers

Adding new language mappings in CLD2

We're looking to add languages and pseudo-languages to CLD2, mostly (but not only) to support Romanized forms like Hinglish (Hindi in Latin script) or translit (Cyrillic strings using Latin script). (Yes, we know CLD3 supports these; it's not…
Vadim Berman
0 votes, 0 answers

How to get the significant differences with "cld" function?

library(dplyr) library(tidyr) library(ggpmisc) library(jmv) library(Rmisc) library(emmeans) library(multcomp) library(Hmisc) library(lattice) library(multcompView) library(agricolae) …