Questions tagged [cld2]
6 questions
3 votes, 1 answer
Split string into segments according to the alphabet
I want to split a given string into segments according to the alphabets it contains. For example, given the following string:
Los eventos automovilísticos comenzaron poco después de la construcción exitosa de los primeros automóviles a…
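
One possible approach to the splitting described above, as a rough sketch only: label each letter by the first word of its Unicode character name (for example LATIN or CYRILLIC) and start a new segment whenever that label changes. The helper names `script_of` and `split_by_script` are hypothetical, and the name-prefix trick is a heuristic, not a full implementation of the Unicode Script property.

```python
import unicodedata


def script_of(ch):
    """Heuristic script label: first word of the Unicode character name.

    Returns None for characters that are not letters (spaces, digits,
    punctuation), which are treated as script-neutral.
    """
    if not ch.isalpha():
        return None
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else None


def split_by_script(text):
    """Split text into (script, segment) pairs at each script change.

    Script-neutral characters stay with the segment they follow;
    surrounding whitespace is stripped from each segment.
    """
    segments = []
    current_script = None
    buf = []
    for ch in text:
        script = script_of(ch)
        # A letter from a different script closes the current segment.
        if script is not None and current_script is not None and script != current_script:
            segments.append((current_script, "".join(buf).strip()))
            buf = []
        if script is not None:
            current_script = script
        buf.append(ch)
    tail = "".join(buf).strip()
    if tail:
        segments.append((current_script, tail))
    return segments
```

Calling `split_by_script("Los eventos Привет мир hello")` would yield three segments labelled LATIN, CYRILLIC and LATIN. Text without any letters comes back as a single segment whose label is None.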

Sirojiddin Komolov
3 votes, 0 answers
Converting Language Detection Score of CLD2 to CLD3 Accuracy
For an input sentence to classify, my cld2 language detection model (langID) returns the following values:
{ reliable: true,
textBytes: 181,
languages:
[ { name: 'ITALIAN', code: 'it', percent: 61, score: 774 },
{ name: 'ENGLISH', code:…

loretoparisi
2 votes, 0 answers
Language detection using pycld2
I am trying to use the pycld2 package to detect multiple languages in text. This is the example I am testing out:
import pycld2 as cld2
text = '''The universal connection with an additional advantage: Push-in connection. Terminate solid and…

natt010
2 votes, 0 answers
Is there a way in Polyglot to permanently "fix" the language code of a Hebrew text from "iw" to "he"?
I want to run a simple sentiment analysis on a Hebrew text using Polyglot in Python 3.6.
The problem is that Polyglot detects the text's language code as "iw" rather than "he", and is therefore unable to process it.
As shown at:
use polyglot…
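
For context on the "iw" versus "he" mismatch: "iw" is the deprecated ISO 639-1 code for Hebrew, and "he" is its current replacement. The sketch below only normalizes a code string; it does not touch Polyglot, and the names `LEGACY_CODES` and `normalize_lang_code` are hypothetical. How the normalized code would be passed back to Polyglot is not covered here.

```python
# Deprecated ISO 639-1 codes mapped to their current replacements.
LEGACY_CODES = {
    "iw": "he",  # Hebrew
    "in": "id",  # Indonesian
    "ji": "yi",  # Yiddish
}


def normalize_lang_code(code):
    """Return the current ISO 639-1 code for a possibly deprecated one.

    Codes that are not in the legacy table are returned unchanged
    (lowercased and stripped of surrounding whitespace).
    """
    cleaned = code.strip().lower()
    return LEGACY_CODES.get(cleaned, cleaned)
```

For example, `normalize_lang_code("iw")` returns `"he"`, while `normalize_lang_code("en")` is left as `"en"`.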

yotam
0 votes, 0 answers
Adding new language mappings in CLD2
We're looking to add languages and pseudo-languages to CLD2, mostly (but not only) to support Romanized forms such as Hinglish (Hindi in Latin script) or translit (Cyrillic-script text written in Latin script). (Yes, we know CLD3 supports these; it's not…

Vadim Berman
0 votes, 0 answers
How to get the significant differences with the "cld" function?
library(dplyr)
library(tidyr)
library(ggpmisc)
library(jmv)
library(Rmisc)
library(emmeans)
library(multcomp)
library(Hmisc)
library(lattice)
library(multcompView)
library(agricolae)
…