Questions tagged [udpipe]

UDPipe is a free C++ library, also distributed as a binary executable, for Natural Language Processing (NLP). It performs tokenization, part-of-speech tagging, lemmatization, and dependency parsing of raw text.

Binaries are available for Windows, Linux, and OS X, along with a web service and a REST API.

For details see http://ufal.mff.cuni.cz/udpipe and https://github.com/ufal/udpipe .
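
As a rough illustration of the annotation layers named above (tokens, part-of-speech tags, lemmas, dependency relations): UDPipe writes its results in the CoNLL-U format, one token per line with ten tab-separated columns. The sketch below is Python using only the standard library. The sample sentence is hand-written for illustration and is not real UDPipe output, and `parse_conllu` is a hypothetical helper, not part of any UDPipe binding.

```python
# Column names of the CoNLL-U format, in order.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")

# Hand-written sample for "The cats sleep." (an assumption for illustration,
# NOT actual UDPipe output; morphological features are left as "_").
SAMPLE = """\
# text = The cats sleep.
1\tThe\tthe\tDET\tDT\t_\t2\tdet\t_\t_
2\tcats\tcat\tNOUN\tNNS\t_\t3\tnsubj\t_\t_
3\tsleep\tsleep\tVERB\tVBP\t_\t0\troot\t_\tSpaceAfter=No
4\t.\t.\tPUNCT\t.\t_\t3\tpunct\t_\t_
"""


def parse_conllu(text):
    """Return one dict per token line, skipping comments and blank lines."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # sentence-level comment or sentence separator
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} columns, got {len(cols)}")
        rows.append(dict(zip(FIELDS, cols)))
    return rows


rows = parse_conllu(SAMPLE)
for row in rows:
    # token, lemma, universal POS tag, head token id, dependency relation
    print(row["form"], row["lemma"], row["upos"], row["head"], row["deprel"])
```

All columns are kept as strings on purpose: in CoNLL-U the `id` column can hold a range such as `1-2` for multiword tokens, so converting it to an integer is not always possible.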

37 questions

3 answers

Make udpipe_annotate() faster

I am currently working on a text mining document in which I want to extract relevant keywords from my text (note that I have many, many text documents). I am using the udpipe package. A great vignette is online on…

r keyword tm udpipe

asked Nov 27 '18 at 13:56

R overflow

1 answer

How to make "words clustering" in R with udpipe package?

I am using the udpipe package in R to do some text mining. I have followed this tutorial: https://cran.r-project.org/web/packages/udpipe/vignettes/udpipe-usecase-postagging-lemmatisation.html#nouns__adjectives_used_in_same_sentence but now, I am a…

r cluster-analysis text-mining udpipe

asked Mar 24 '18 at 12:51

MysteryGuy

1 answer

udpipe_annotate() in r labels the same word differently if followed by punctuation

I'm doing a standard topic modelling task on nouns in newspaper articles using udpipe to annotate the article content. Using the function udpipe_annotate() I noticed that words together with the following punctuation mark sometimes were labelled as…

r nlp annotations punctuation udpipe

asked Jul 25 '22 at 08:17

Hal

0 answers

NLP in R: working with tokenization in a CONLLU-style dataframe

I am working in a Portuguese Digital Humanities project using R. I created a CONLLU-style dataframe with the corpus data, using the UDPipe library: textAnnotated <- udpipe::udpipe_annotate(m_port, x = textCorpus) %>% as.data.frame() The beginning…

r nlp tokenize udpipe conll

asked Jun 02 '22 at 16:53

Bruno Maroneze

1 answer

udpipe (keywords_rake): how to link keywords to the document they were extracted from

I am using the function keywords_rake from the udpipe package (for R) to extract keywords from a bunch of documents. udmodel_en <- udpipe_load_model(file = dl$file_model) x <- udpipe_annotate(udmodel_en, x = data$text) x <-…

r nlp udpipe

asked Jan 27 '20 at 16:04

Carbo

0 answers

How to run the R RAKE function in udpipe across individual groups

Given the following sample data frame: Question <- c("Q1", "Q1", "Q1","Q1","Q2", "Q2", "Q2","Q2") Answer <- c("I like to be creative when I cook with crock pots.","I like to be creative when I cook with crock pots.", "I like to be…

r nlp udpipe

asked Apr 15 '21 at 19:36

Mark P.

1 answer

R extract most common word(s) / ngrams in a column by group

I wish to extract main keywords from the column 'title', for each group (1st column). Desired result in column 'desired title': Reproducible data: myData <- structure(list(group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3,…

tm topic-modeling n-gram udpipe textrank

asked Sep 11 '20 at 03:35

Yeshyyy

2 answers

spacy-udpipe with pytextrank to extract keywords from non-English text

I've been using pytextrank (https://github.com/DerwenAI/pytextrank/) with spacy and English models for keyword extraction - it works great! Now I need to process non-English texts and I found udpipe (https://github.com/TakeLab/spacy-udpipe) but it…

python nlp spacy udpipe pytextrank

asked Jan 20 '20 at 13:33

Jan Mazanec

0 answers

Topic Modelling by Group using LDA in R

I am stuck on one problem. I am trying to categorize sentences into topics using LDA. I have done it; however, the problem is that LDA works on the whole dataset and gives me topic terminologies across the dataset. I want to get the topic terminologies…

r lda topic-modeling udpipe

asked Nov 20 '19 at 08:27

Rana Usman

1 answer

How to get future tense for a verb with udpipe

I have a large number of medical reports. I am trying to identify sentences indicating that a future action will be taken, e.g. 'I will prescribe a medication'. I am using the english-ewt model from udpipe and I have also tried english-gum, but neither gives me a…

r udpipe

asked Mar 08 '19 at 16:32

Sebastian Zeki

1 answer

R - Parsing keywords from udpipe RAKE per article back to dataframe

I'm attempting to use udpipe's RAKE to generate a list of 25 RAKE tokens per document in a dataframe and write those tokens (plus a simple str_count) back to the dataframe. I constructed a for loop to handle this, but instead I'm writing the same result…

r nlp udpipe

asked Feb 10 '19 at 01:00

Christopher Penn

0 answers

Text Mining responses with very varying answer lengths

I have a dataset of responses in which people were asked to answer a set of questions. There's only one column of text data to process. My challenge is that only very few respondents have actually written long texts that I found easy to process and…

text nlp analytics sentiment-analysis udpipe

asked Jan 04 '19 at 13:55

Dinesh

1 answer

inherits(x, "character") is not TRUE in R programming Shiny App

I am creating a Shiny app whose purpose is to take a text file as input and, using the udpipe library, create a word cloud, annotate, etc. I am getting "inherits(x, "character") is not TRUE" when running the app. The problem comes from the "Annotate" tab as I am…

r shiny udpipe

asked May 28 '18 at 01:52

Balaji Venky

1 answer

Is it possible to modify spaCy by udpipe within the Rasa-NLU?

I have spent several days testing Rasa-NLU, which internally uses spaCy. I was very disappointed with the results for the Portuguese language. Trying to figure out how to improve the training data, I found an excellent script comparing spaCy with udpipe that can be…

rasa-nlu udpipe

asked Apr 10 '18 at 22:56

luisdemarchi

2 answers

Find words in a corpus based on lemma

I am doing text mining with R and I have an "issue" I would like to solve... In order to find the reports in a corpus that contain a given word or expression most often, I use the kwic function from the quanteda package like this: result <- kwic…

r text-mining quanteda udpipe

asked Apr 07 '18 at 12:26

MysteryGuy
