Questions tagged [nltk-trainer]

55 questions
21
votes
2 answers

What is the preferred ratio between the vocabulary size and embedding dimension?

When using, for example, gensim's word2vec or a similar method to train your embedding vectors, I was wondering: is there a good or preferred ratio between the embedding dimension and the vocabulary size? Also, how does that change with more…
Gabriel Bercea
  • 1,191
  • 1
  • 10
  • 21
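There is no single correct answer, but a commonly cited rule of thumb (not a hard rule, and worth validating against your own task) is to start with an embedding dimension near the fourth root of the vocabulary size. A minimal sketch of that heuristic; the function name is illustrative, not from any library:

```python
# Rule-of-thumb starting point, not a hard rule: embedding dimension
# roughly equal to the fourth root of the vocabulary size.
def suggested_embedding_dim(vocab_size: int) -> int:
    """Return a rough starting embedding dimension for a vocabulary."""
    return max(1, round(vocab_size ** 0.25))

print(suggested_embedding_dim(10_000))     # 10k types -> small dimension
print(suggested_embedding_dim(1_000_000))  # 1M types -> larger dimension
```

From there, treat the dimension as a hyperparameter and tune it on downstream performance.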
7
votes
1 answer

How to handle words that contain a space between characters?

I am using nltk.word_tokenize for the Dari language. The problem is that some single words contain a space. For example, the word "زنده گی" means life, and there are many other words like it. For all words that end with the character "ه" we have to give a…
The Afghan
  • 99
  • 1
  • 7
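One common approach is to post-process the token stream and re-merge token pairs that are known to form one word; NLTK's `nltk.tokenize.MWETokenizer` does essentially this for multi-word expressions. A dependency-free sketch, where `MULTI_TOKEN_WORDS` is a hypothetical lexicon you would fill with the words described above:

```python
# Hypothetical lexicon of Dari words that nltk.word_tokenize splits
# on the internal space; extend with the other words mentioned.
MULTI_TOKEN_WORDS = {("زنده", "گی")}

def merge_split_words(tokens):
    """Re-merge adjacent tokens that belong to one word."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in MULTI_TOKEN_WORDS:
            merged.append(tokens[i] + " " + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

print(merge_split_words(["زنده", "گی", "خوب"]))
```

With `MWETokenizer` the equivalent would be to pass the same pairs at construction time and choose a separator.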
6
votes
0 answers

Error when installing nltk packages on heroku

I am trying to install nltk packages on Heroku using an nltk.txt file. In my nltk.txt file only punkt is written; in the requirements.txt file, nltk is listed. But when I push, it shows errors. Please help me fix my problem. remote: -----> Python app…
Tulshi Das
  • 480
  • 3
  • 18
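For reference, Heroku's Python buildpack (assuming its NLTK support, which downloads resources listed in an nltk.txt at the repository root after installing requirements.txt) expects roughly this layout; stray whitespace, a BOM, or placing the files outside the project root are frequent causes of failures:

```
# requirements.txt
nltk

# nltk.txt  (one NLTK resource id per line, no extra whitespace)
punkt
```

If the push still fails, the lines following `remote: -----> Python app` in the build log usually name the exact resource that could not be resolved.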
6
votes
1 answer

NLTK - Download all nltk data except corpora from command line without Downloader UI

We can download all nltk data using import nltk; nltk.download('all'), or specific data using nltk.download('punkt') and nltk.download('maxent_treebank_pos_tagger'). But I want to download all data except the 'corpora' files, for example - all…
RAVI
  • 3,143
  • 4
  • 25
  • 38
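One possible approach (a sketch, assuming my recollection of the `nltk.downloader` API is right: `Downloader().packages()` yields package objects with `id` and `subdir` attributes, where corpora live under the `corpora` subdir) is to filter the package list yourself. The download step is wrapped in a function and not called here, so the sketch has no side effects:

```python
from collections import namedtuple

# Stand-in for the metadata objects nltk.downloader.Downloader().packages()
# returns; each real package has an `id` and a `subdir` such as
# 'corpora', 'tokenizers', or 'taggers'.
Pkg = namedtuple("Pkg", ["id", "subdir"])

def non_corpora_ids(packages):
    """Keep only package ids whose subdir is not 'corpora'."""
    return [p.id for p in packages if p.subdir != "corpora"]

def download_all_except_corpora():
    # Requires nltk and network access; deliberately not invoked here.
    from nltk.downloader import Downloader
    d = Downloader()
    for pkg_id in non_corpora_ids(d.packages()):
        d.download(pkg_id)

print(non_corpora_ids([Pkg("punkt", "tokenizers"), Pkg("brown", "corpora")]))
```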
5
votes
1 answer

Laplace smoothing function in nltk

I'm building a text generation model using nltk.lm.MLE, and I notice nltk also has nltk.lm.Laplace, which I can use to smooth the data and avoid division by zero; the documentation is https://www.nltk.org/api/nltk.lm.html. However, there's no clear…
MeiNan Zhu
  • 1,021
  • 1
  • 9
  • 18
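What `nltk.lm.Laplace` computes is plain add-one smoothing: P(w|h) = (c(h,w) + 1) / (c(h) + V), where V is the vocabulary size. A dependency-free sketch of the formula on a toy bigram model (the function name is illustrative):

```python
from collections import Counter

def laplace_bigram_prob(bigram_counts, unigram_counts, vocab_size, context, word):
    """Add-one estimate: P(word|context) = (c(context,word)+1) / (c(context)+V)."""
    return (bigram_counts[(context, word)] + 1) / (unigram_counts[context] + vocab_size)

tokens = ["a", "b", "a", "b", "c"]
bigrams = Counter(zip(tokens, tokens[1:]))   # (a,b):2, (b,a):1, (b,c):1
unigrams = Counter(tokens)
V = len(unigrams)

p = laplace_bigram_prob(bigrams, unigrams, V, "a", "b")
print(p)  # (2+1)/(2+3) = 0.6
```

In nltk.lm itself the usual recipe is `Laplace(order)` in place of `MLE(order)`, fit on the same `padded_everygram_pipeline` output; only the smoothing of `score()` changes.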
5
votes
2 answers

How to train NLTK PunktSentenceTokenizer batchwise?

I am trying to split financial documents into sentences. I have ~50,000 documents containing plain English text; the total file size is ~2.6 GB. I am using NLTK's PunktSentenceTokenizer with the standard English pickle file, and I additionally tweaked it…
JumpinMD
  • 53
  • 6
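A sketch of incremental Punkt training, assuming NLTK is installed: `PunktTrainer.train` accepts `finalize=False`, so documents can be streamed one at a time instead of concatenated into one giant string, with a single `finalize_training()` at the end:

```python
from nltk.tokenize.punkt import PunktSentenceTokenizer, PunktTrainer

# Toy stand-ins for the ~50,000 financial documents.
documents = [
    "Revenue grew strongly. The outlook remains stable.",
    "Costs fell sharply. Margins improved as a result.",
]

trainer = PunktTrainer()
for doc in documents:                 # stream documents batchwise
    trainer.train(doc, finalize=False)
trainer.finalize_training()

tokenizer = PunktSentenceTokenizer(trainer.get_params())
sentences = tokenizer.tokenize("This is one sentence. This is another.")
print(sentences)
```

This keeps memory bounded by the largest single document rather than the whole 2.6 GB corpus.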
5
votes
1 answer

Python NLTK visualization

I am currently doing natural language processing using Python NLTK. I want to generate some beautiful graphics of the representation of the input. What package can I use to get something like this?
wrek
  • 1,061
  • 5
  • 14
  • 26
3
votes
1 answer

No module named 'nltk.lm' in Google colaboratory

I'm trying to import the NLTK language modeling module (nltk.lm) in a Google Colaboratory notebook without success. I've tried installing everything from nltk, still without success. What mistake or omission could I be making? Thanks in…
Ramiro Hum-Sah
  • 132
  • 1
  • 6
3
votes
1 answer

Is it possible to modify and run only part of a Python program without having to run all of it again and again?

I have written Python code to train a Brill tagger from the NLTK library on some 8,000 English sentences and tag some 2,000 sentences. The Brill tagger takes many, many hours to train, and when it finally finished training, the last statement of the…
singhuist
  • 302
  • 1
  • 6
  • 17
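The usual answer to this is to cache the expensive step: serialize the trained tagger with `pickle` once, then load it on later runs instead of retraining. A minimal sketch, where `CACHE` and `expensive_training` are hypothetical stand-ins for the tagger's pickle file and the hours-long training call:

```python
import os
import pickle

CACHE = "tagger.pickle"  # hypothetical cache file name

def expensive_training():
    # Stands in for e.g. training a Brill tagger for hours.
    return {"model": "trained"}

if os.path.exists(CACHE):
    with open(CACHE, "rb") as f:
        model = pickle.load(f)       # fast path on re-runs
else:
    model = expensive_training()     # slow path, runs once
    with open(CACHE, "wb") as f:
        pickle.dump(model, f)

print(model)
```

Alternatively, an interactive session (IPython/Jupyter) lets you keep the trained object in memory while editing and re-running only the tagging code.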
2
votes
0 answers

NLTK: How to define the "labeled_featuresets" when creating a ClassifierBasedTagger with nltk?

I am playing around with nltk right now. I am trying to create various classifiers with nltk to do named entity recognition and compare their results. Creating n-gram taggers was easy; however, I have run into some issues creating a…
2
votes
2 answers

nltk.org example of sentence segmentation with Naive Bayes classifier: how does .sents separate sentences, and how does the ML algorithm improve on it?

There is an example in the nltk.org book (chapter 6) where they use a Naive Bayes algorithm to classify a punctuation symbol as finishing a sentence or not finishing one… This is what they do: first they take a corpus and use the .sents method to…
Martin
  • 414
  • 7
  • 21
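For context: corpus readers expose `.sents()`, which returns the gold-standard sentence boundaries already annotated in the corpus, so the classifier learns from those labels rather than discovering boundaries itself. The feature extractor in that chapter is, roughly, the following pure-Python function (reproduced from memory of the book, so treat the exact feature set as approximate):

```python
def punct_features(tokens, i):
    """Features describing the context of the punctuation token at index i."""
    return {
        "next-word-capitalized": tokens[i + 1][0].isupper(),
        "prev-word": tokens[i - 1].lower(),
        "punct": tokens[i],
        "prev-word-is-one-char": len(tokens[i - 1]) == 1,
    }

tokens = ["Mr", ".", "Smith", "went", "home", "."]
features = punct_features(tokens, 1)
print(features)
```

The classifier then improves on naive splitting by learning, e.g., that a period after a one-character word is often an abbreviation, not a sentence end.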
2
votes
2 answers

Finding matching words with ngrams

Dataset:
df['bigram'] = df['Clean_Data'].apply(lambda row: list(ngrams(word_tokenize(row), 2)))
df[:,0:1]
Id        bigram
1952043   [(Swimming, Pool), (Pool, in), (in, the), (the, roof), (roof, top),
1918916   …
Rajitha Naik
  • 103
  • 2
  • 11
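For what the `ngrams(..., 2)` call in that snippet produces, a dependency-free equivalent of `nltk.util.ngrams(tokens, 2)` is just a pairwise zip, which also makes it easy to search the resulting bigrams:

```python
def bigrams(tokens):
    """Pairwise bigrams, equivalent to list(nltk.util.ngrams(tokens, 2))."""
    return list(zip(tokens, tokens[1:]))

tokens = ["Swimming", "Pool", "in", "the", "roof", "top"]
pairs = bigrams(tokens)
print(pairs)

# Finding bigrams that contain a given word:
matches = [p for p in pairs if "Pool" in p]
print(matches)
```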
2
votes
1 answer

Python 2.x - How to get the result of the NLTK Naive Bayes classification through a trainSet and a testSet

I'm building a text parser to identify the types of crime contained in texts. My class was built to load the texts from 2 CSV files (one file to train and one file to test). The way it was built, the methods in my class are for making a rapid…
Leandro Santos
  • 67
  • 1
  • 1
  • 10
2
votes
3 answers

How to add a custom corpus to the local machine in nltk

I have a custom corpus created with data on which I need to do some classification. I have the dataset in the same format as the movie_reviews corpus. According to the nltk documentation, I use the following code to access the movie_reviews corpus.…
Janitha
  • 65
  • 9
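A sketch of the usual answer, assuming NLTK is installed: point a `CategorizedPlaintextCorpusReader` at your own movie_reviews-style directory layout (`root/pos/*.txt`, `root/neg/*.txt`), so the custom data behaves like the built-in corpus. The temporary directory and file contents below are illustrative:

```python
import os
import tempfile
from nltk.corpus.reader import CategorizedPlaintextCorpusReader

# Build a tiny movie_reviews-style layout for illustration.
root = tempfile.mkdtemp()
for cat, text in [("pos", "great movie"), ("neg", "bad movie")]:
    os.makedirs(os.path.join(root, cat), exist_ok=True)
    with open(os.path.join(root, cat, "1.txt"), "w") as f:
        f.write(text)

reader = CategorizedPlaintextCorpusReader(
    root,
    r"(pos|neg)/.*\.txt",          # which files belong to the corpus
    cat_pattern=r"(pos|neg)/.*",   # derive the category from the path
)
print(reader.categories())
print(list(reader.words(categories="pos")))
```

From here, the same fileid/category-based feature extraction shown in the nltk documentation for movie_reviews applies unchanged.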
2
votes
1 answer

How to remove nltk from Python, from my system, and from the command prompt

I tried downloading nltk data by using these commands at the Python prompt: import nltk; nltk.download() // after this it started downloading. Now I want to delete all the nltk files from my system; please help with uninstalling and removing all the…