Highest Voted 'nltk-book' Questions

7

votes

1 answer

How to handle words which have a space between characters?

I am using nltk.word_tokenize on Dari text. The problem is that some single words contain a space. For example, the word "زنده گی", which means "life". We have many other words like this. All words which end with the character "ه" we have to give a…

asked Sep 20 '17 at 09:29

The Afghan

99
1
7

4

votes

1 answer

How can I find a specific bigram using nltk in python?

I am currently working with nltk.book in Python and would like to find the frequency of a specific bigram. I know there is the bigrams() function that gives you the most common bigrams in the text as in this code: >>> list(bigrams(['more', 'is',…

python nltk frequency nltk-book

asked Nov 14 '20 at 15:13

Jennifer

47
4
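A minimal sketch of counting one specific bigram, using only the standard library as a stand-in for nltk.bigrams and nltk.FreqDist. The sample sentence and the helper name bigram_counts are hypothetical and do not come from the question.

```python
from collections import Counter

def bigram_counts(tokens):
    """Count adjacent token pairs, the same pairs nltk.bigrams would yield."""
    return Counter(zip(tokens, tokens[1:]))

# Hypothetical sample text, not taken from the question.
tokens = "more is said than done and more is done than said".split()
counts = bigram_counts(tokens)

# Look up one specific bigram by its tuple key.
print(counts[("more", "is")])
# A bigram that never occurs simply counts as zero.
print(counts[("is", "not")])
```

Because the counter is keyed by (first, second) tuples, a single lookup answers the question without scanning a list of the most common bigrams.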

4

votes

1 answer

nltk "OMW" wordnet with Arabic language

I'm working in Python/nltk with the (OMW) wordnet, specifically for the Arabic language. All the functions work fine with the English language, yet I can't seem to be able to perform any of them when I use the 'arb' tag. The only thing that works great is…

python python-2.7 nltk wordnet nltk-book

asked Jul 18 '17 at 03:27

user2340286

69
1
6

2

votes

1 answer

How to use edit_distance() from nltk.metrics in this example?

I have a bit of a problem with using edit_distance() in the following example. I need to print words from the languages mentioned in the languages list in 5 columns, which is not a problem. I have done that: from nltk.corpus import swadesh from…

python python-3.x nltk nltk-book

asked Jun 08 '20 at 15:29

White

51
6
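For readers unsure what edit_distance() computes, here is a rough pure-Python sketch of Levenshtein distance, which is what nltk.metrics.edit_distance returns with its default arguments as far as I know. The function name levenshtein and the sample words are illustrative assumptions, not part of the question.

```python
def levenshtein(a, b):
    """Return the minimum number of insertions, deletions and
    substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

# Hypothetical word pairs for illustration.
print(levenshtein("kitten", "sitting"))
print(levenshtein("water", "wasser"))
```

The same function can be applied pairwise to words from two language columns to compare how far apart their spellings are.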

2

votes

1 answer

Porter and Lancaster stemming clarification

I am doing stemming using Porter and Lancaster and I find these observations: Input: replied Porter: repli Lancaster: reply Input: twice porter: twice lancaster: twic Input: came porter: came lancaster: cam Input: In porter: …

nlp nltk stemming porter-stemmer nltk-book

asked Feb 25 '20 at 03:55

floss

2,603
2
20
37

2

votes

1 answer

When building Feature based grammar, why do I get "invalid syntax" error?

Why do I get "invalid syntax" in the line with the % start S? nltk.data.show_cfg('grammars/book_grammars/feat0.fcfg') % start S S -> NP[NUM=?n] VP[NUM=?n] # NP expansion productions NP[NUM=?n] -> PropN[NUM=?n] NP[NUM=?n] -> Det[NUM=?n]…

python-3.x nltk grammar feature-extraction nltk-book

asked Dec 21 '18 at 11:24

nefeli

31
1

2

votes

0 answers

span_tokenize gives generator object as output

I have written a simple piece of code to see exactly how the span_tokenize function works. Documentation for this can be found here: http://www.nltk.org/api/nltk.tokenize.html Here is my piece of code import nltk from nltk.tokenize.api import…

nltk tokenize text-mining stringtokenizer nltk-book

asked Mar 15 '18 at 02:56

Shardul Pendharkar

137
1
11
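A small sketch of why a span tokenizer can print as a generator object: functions written with yield produce their results lazily, so the spans only appear once the generator is consumed, for example with list(). The function whitespace_spans below is a hypothetical standard-library stand-in, not NLTK's own implementation.

```python
import re

def whitespace_spans(text):
    """Yield (start, end) offsets for each run of non-whitespace characters."""
    for match in re.finditer(r"\S+", text):
        yield match.span()

text = "Good muffins cost $3.88"
spans = whitespace_spans(text)

# Printing the generator itself shows only its repr, not the spans.
print(spans)

# Materialise the spans by consuming the generator.
span_list = list(spans)
print(span_list)

# Each span can be used to slice the original string.
print([text[start:end] for start, end in span_list])
```

Wrapping the call in list() is usually enough when the goal is just to inspect the offsets.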

2

votes

0 answers

Setting up nltk proxy

I was following the first chapter of the nltk book. It asks us to install the book corpus by running nltk.download(). I am getting a getaddrinfo failed error while doing nltk.download(). After reading online, I came to know that this has something to do…

python python-3.x nlp nltk nltk-book

asked Mar 06 '18 at 12:29

Mahesha999

22,693
29
116
189

2

votes

1 answer

Best way to understand the input text before applying ngram

Currently I am reading text from an Excel file and applying bigrams to it. finalList, used in the sample code below, holds the list of input words read from the input Excel file. I removed the stopwords from the input with the help of the following library: from…

python-3.x pandas nlp nltk nltk-book

asked Oct 09 '17 at 07:25

Pyd

6,017
18
52
109

1

vote

1 answer

Building a Character-Level Ngram Language Model with NLTK

I'm trying to build a language model on the character level with NLTK's KneserNeyInterpolated function. What I have is a frequency list of words in a pandas dataframe, with the only column being its frequency (the word itself is the index). I've…

python nlp nltk n-gram nltk-book

asked Jul 31 '21 at 20:14

JaP

87
6
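One possible first step, sketched under assumptions: turn a word-frequency table into character n-gram counts, weighting each word by its frequency. This covers only the counting stage, not Kneser-Ney smoothing, and it uses a plain dict in place of the pandas dataframe. The padding symbol "_" and the name char_ngram_counts are hypothetical.

```python
from collections import Counter

def char_ngram_counts(word_freqs, n=3, pad="_"):
    """Count character n-grams over a word -> frequency mapping,
    padding each word so that word boundaries are represented."""
    counts = Counter()
    for word, freq in word_freqs.items():
        chars = [pad] * (n - 1) + list(word) + [pad] * (n - 1)
        for i in range(len(chars) - n + 1):
            # Each occurrence of the word contributes freq to the n-gram.
            counts[tuple(chars[i:i + n])] += freq
    return counts

# Hypothetical frequency list standing in for the dataframe.
word_freqs = {"cat": 3, "car": 2}
counts = char_ngram_counts(word_freqs, n=2)

print(counts[("c", "a")])
print(counts[("a", "t")])
```

Treating each word as its own "sentence" of characters keeps n-grams from spanning two unrelated words.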

1

vote

1 answer

Conditional Frequency Distribution

Hi :) I am really new to Python and NLP and am now trying to go through the NLTK book from O'Reilly. I'm currently at a dead end with the task concerning plotting and tabulating with a Conditional Frequency Distribution. The task is the following: "find…

nlp nltk nltk-book

asked Jul 28 '21 at 22:52

k_bedryk

11
1
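As a rough illustration of the idea behind a conditional frequency distribution: it is one frequency count per condition, built from (condition, sample) pairs, which is also the input form nltk.ConditionalFreqDist accepts. The sketch below uses only the standard library, and the genre and word pairs are made up for the example.

```python
from collections import Counter, defaultdict

def conditional_freq(pairs):
    """Build a mapping condition -> Counter of samples from (condition, sample) pairs."""
    table = defaultdict(Counter)
    for condition, sample in pairs:
        table[condition][sample] += 1
    return table

# Hypothetical (genre, word) pairs, not taken from the book exercise.
pairs = [
    ("news", "the"), ("news", "said"), ("news", "the"),
    ("romance", "the"), ("romance", "love"),
]
cfd = conditional_freq(pairs)

# Tabulate: one row per condition, one column per chosen sample.
samples = ["the", "said", "love"]
for condition in sorted(cfd):
    row = " ".join(f"{cfd[condition][s]:>5}" for s in samples)
    print(f"{condition:<8}{row}")
```

Tabulating or plotting then amounts to choosing which conditions and which samples to show.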

1

vote

0 answers

Change name of any state, county, regions, or their abbreviations to country name in python NLTK or other packages

I have a list of locations that is mixed with states, cities and countries, counties and regions, in abbreviations and some in full. For instance, NY, CA, England, UK, USA, Minnesota, London, Bradford, etc. I want it all to be converted to countries…

python nltk nltk-trainer nltk-book pycountry-convert

asked Jan 22 '21 at 02:40

Julius Sechang Mboli

60
8

1

vote

0 answers

What is the more natural parsing, the one that leads to the preferred reading of the sentence

I have these rules: and these two possible parse trees: I am asked the following question: What is the more natural parsing, the one that leads to the preferred reading of the sentence? Can anyone explain to me, what is more natural in English and…

nlp nltk stanford-nlp linguistics nltk-book

asked Jan 05 '21 at 14:30

Ilya.K.

291
1
13

1

vote

0 answers

How to go from type theory to first-order logic lambda-expressions

As can be seen in the O'Reilly NLTK book, Chapter 10, when I want to model the syntax tree of sentence “Bob loves Alice,” namely into first-order logic lambda-expressions, I get the following: where on the left I have the tree of types and on the…

types nlp nltk formal-semantics nltk-book

asked Mar 04 '20 at 22:11

yannis

819
1
9
26

1

vote

2 answers

'word' not in Vocabulary in a corpus with words shown in a single list only in gensim library

Hello Community Members, At present, I am implementing the Word2Vec algorithm. First, I extracted the data (sentences), split the sentences into tokens (words), removed the punctuation marks and stored the tokens in a single list. The…

python-3.x nltk gensim word2vec nltk-book

asked Aug 21 '18 at 09:23

M S

894
1
13
41
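One common cause of this kind of error, offered as a guess since the excerpt is truncated: gensim's Word2Vec expects an iterable of tokenized sentences (a list of lists), so a single flat list of strings gets each string treated as a sentence and each character as a word. The helper below is a hypothetical standard-library imitation of that iteration, not gensim code.

```python
from collections import Counter

def vocabulary(sentences):
    """Collect the items a sentence-iterating trainer would see as words."""
    vocab = Counter()
    for sentence in sentences:
        for item in sentence:
            vocab[item] += 1
    return vocab

# Hypothetical tokens stored in a single flat list.
flat_tokens = ["natural", "language", "processing"]

# Passing the flat list: each string is iterated character by character.
wrong = vocabulary(flat_tokens)
print("language" in wrong)

# Wrapping it in an outer list: one sentence containing three tokens.
right = vocabulary([flat_tokens])
print("language" in right)
```

If this is the situation, keeping the tokens grouped per sentence, rather than flattening them, avoids the problem.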
