Questions tagged [textacy]

Reference Site: https://textacy.readthedocs.io/en/stable/

Features

  • Stream text, json, csv, and spaCy binary data to and from disk
  • Clean and normalize raw text before analyzing it
  • Explore a variety of included datasets, with both text data and metadata, from Congressional speeches to historical literature to Reddit comments
  • Access and filter basic linguistic elements, such as words and ngrams, noun chunks and sentences
  • Extract named entities, acronyms and their definitions, direct quotations, key terms, and more from documents
  • Compare strings, sets, and documents by a variety of similarity metrics
  • Transform documents and corpora into vectorized and semantic network representations
  • Train, interpret, visualize, and save sklearn-style topic models using LSA, LDA, or NMF methods
40 questions
5 votes, 2 answers

Calculate TF-IDF for a single word in Textacy

I'm trying to use Textacy to calculate the TF-IDF score for a single word across the standard corpus, but am a bit unclear about the result I am receiving. I was expecting a single float which represented the frequency of the word in the corpus. So…
port5432
4 votes, 2 answers

My question is about "module 'textacy' has no attribute 'Doc'"

I get the error "module 'textacy' has no attribute 'Doc'". I am trying to extract verb phrases with spaCy, but there is no such library. Please help me: how can I extract verb phrases or adjective phrases using spaCy? I want to do full shallow…
Gul Jabeen
3 votes, 1 answer

Create subject-verb-object model of complex, fragmented sentences from police reports

I am fairly new to spacy / textacy and I have a complicated task ahead. Your help is much appreciated. In a nutshell, from a sentence like "Did assault paramedic by kicking and pushing him", I want to establish whether the reported abuse was against…
Kristin
3 votes, 1 answer

multiprocessing with textacy or spacy

I am trying to speed up processing of large lists of texts via parallelisation of textacy. When I use Pool from multiprocessing the resulting textacy corpus comes out empty. I am not sure if the problem is in the way I use textacy or multiprocessing…
Diego
3 votes, 1 answer

More efficient implementation of Textacy / spacy 'subject_verb_object_triples'

I'm trying to implement the 'extract.subject_verb_object_triples' function from textacy on my dataset. However, the code I have written is very slow and memory intensive. Is there a more efficient implementation? import spacy import textacy def…
W.R
3 votes, 2 answers

How to initialize a `Doc` in textacy 0.6.2?

Trying to follow the simple Doc initialization in the docs in Python 2 doesn't work: >>> import textacy >>> content = ''' ... The apparent symmetry between the quark and lepton families of ... the Standard Model (SM) are, at the very least,…
arturomp
3 votes, 1 answer

Using spacy and textacy. Need to find tf-idf score across corpus of original tweets but can't import textacy vectorizer

I'm new to these frameworks as well as NLP. I am following an example which gives me the following code snippet to calculate the tf-idf score of all the tokens in the tweets. However I keep getting either import errors or Vectorizer undefined.…
aldmarj
3 votes, 1 answer

Textacy with Jupyter Notebook: How to suppress multiple error warnings?

I am using Textacy (on top of spaCy) to process many snippets of text. Specifically, I use Textacy's readability scores. Since I have a lot of short texts, I get a warning that I need to suppress because it will otherwise crash my notebook. My…
petezurich
2 votes, 0 answers

Find topic weight in part of the corpus

I am doing topic modeling on tweets in Python. I am working on two time periods. I want to extract topics with spaCy's textacy, training the model on the corpus of both time periods. Then, I want to analyse the weight of the topics on the…
s12345
2 votes, 1 answer

How to extract verb phrases today?

For an NLP project I need to extract verb phrases from a list of sentences. I have read some older posts on Stack Overflow and watched this video. All were very helpful in understanding my problem and learning about possible patterns, but all code…
Sam V
2 votes, 1 answer

Spacy/Textacy not reading file contents from .txt (text) file

I am trying to read the contents (blog) from a text file using Python (SpaCy/Textacy/Textblob) but it has been in vain, so far. Following is the code that I have recently tried: import content as content import pattern as pattern import…
2 votes, 0 answers

unable to install textacy in python 3.0

I am trying to install textacy to perform NLP tasks, but getting an error while trying to do: pip install textacy in Anaconda prompt. The error I am getting is error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build…
1 vote, 1 answer

Extract quotations and attribution from text

I am attempting to extract quotations and quotation attributions (i.e., the speaker) from text, but I am not obtaining the desired output. I am using textacy. Here is what I have tried so far: import textacy from textacy import extract from…
jedmund
1 vote, 1 answer

module 'thinc' has no attribute 'layers'

I am following this article for my work and in this article, under heading Verb Phrase Detection, I am following the instructions but after successfully installing the textacy library (It shows in pip list) when I use import textacy in jupyter…
ankit
1 vote, 0 answers

spacy/textacy: subject_verb_object_triples(doc) not returning any triplets

My goal is to extract SVO-triplets from simple sentences. For example for the sentence "A person is standing in a kitchen making a sandwich." I want the output and . I tried to use spacy/textacy…
josch14