Questions tagged [spacy-3]

For questions specific to spaCy version 3, an industrial-strength natural language processing library for Python. Use the more generic tag `spacy` for general questions about spaCy.

334 questions
12
votes
3 answers

Can't load spacy en_core_web_trf

As the self guide says, I've installed it (in a conda environment) with `conda install -c conda-forge spacy` and `python -m spacy download en_core_web_trf`. I have spacy-transformers already installed. But when I simply do: import…
Omar
  • 1,029
  • 2
  • 13
  • 33
9
votes
1 answer

Difference between spacy v3 en_core_web_trf pipeline and en_core_web_lg pipeline

I am running some performance tests with spaCy version 3 to right-size my production instances. I am observing the following. Observation: Model name, Time without NER, Time with NER, Comments: en_core_web_lg, 4.89 seconds, 21.9 seconds, NER…
ryk
  • 133
  • 2
  • 9
7
votes
2 answers

SpaCy: Set entity information for a token which is included in more than one span

I am trying to use SpaCy for entity context recognition in the world of ontologies. I'm a novice at using SpaCy and just playing around for starters. I am using the ENVO Ontology as my 'patterns' list for creating a dictionary for entity…
hrshd
  • 941
  • 2
  • 13
  • 19
7
votes
2 answers

Spacy 3 Confidence Score on Named-Entity recognition

I need to get a confidence score for the tags predicted by the NER model 'de_core_news_lg'. There was a well-known solution to this problem in spaCy 2: nlp = spacy.load('de_core_news_lg') doc = nlp('ich möchte mit frau Mustermann in der Musterbank…
Keyvan Sadri
  • 71
  • 1
  • 2
6
votes
1 answer

ValueError: [E143] Labels for component 'tagger' not initialized

I've been following this tutorial to create a custom NER. However, I keep getting this error: ValueError: [E143] Labels for component 'tagger' not initialized. This can be fixed by calling add_label, or by providing a representative batch of…
Diana
  • 363
  • 2
  • 8
6
votes
2 answers

How to get a description for each Spacy NER entity?

I am using a spaCy NER model to extract from a text some named entities relevant to my problem, such as DATE, TIME, and GPE, among others. For example, I need to recognize the time zone in the following sentence: "Australian Central Time" With Spacy…
Emiliano Viotti
  • 1,619
  • 2
  • 16
  • 30
6
votes
3 answers

AttributeError: module 'spacy' has no attribute 'load'

import spacy nlp = spacy.load('en_core_web_sm') Error: Traceback (most recent call last): File "C:\Users\PavanKumar\.spyder-py3\ExcelML.py", line 27, in <module> nlp = spacy.load('en_core_web_sm') AttributeError: module 'spacy' has no…
Arvind
  • 71
  • 1
  • 4
6
votes
1 answer

Warning: [W108] The rule-based lemmatizer did not find POS annotation for the token 'This'

What is this message about? How do I remove this warning message? import scispacy import spacy import en_core_sci_lg from spacy_langdetect import LanguageDetector from spacy.language import Language from spacy.tokens import Doc def…
Shiva Sharma
  • 145
  • 1
  • 2
  • 5
6
votes
1 answer

SpaCy 3 Transformer Vector Token Alignment

I'm using spaCy 3.0.1 together with the transformer model (en_core_web_trf). When I previously used spaCy transformers, it was possible to get the transformer vectors from a Token or Span. In spaCy 3, however, it seems like you can only access the…
MBT
  • 21,733
  • 19
  • 84
  • 102
5
votes
2 answers

Using spacy v3, which parameter should I change in the config file to resolve a CUDA out of memory problem? batch_size vs max_length vs batcher.size

Using spacy v3, I tried to train a classifier using camemBert and got a CUDA out of memory error. To resolve this issue I read that I should decrease the batch size, but I'm confused about which parameter I should change between: [nlp]…
5
votes
2 answers

Update the built-in NER model of spaCy instead of overwriting it

I am using a built-in spaCy model, en_core_web_lg, and want to train it using my custom entities. While doing that, I am facing two issues: it overwrites the new trained data with the old one and results in not recognizing the other…
sodmzs1
  • 324
  • 1
  • 12
5
votes
0 answers

How to train Spacy3 project with FP16 mixed precision

The goal is to run python -m spacy train with FP16 mixed precision to enable the use of large transformers (roberta-large, albert-large, etc.) in limited VRAM (RTX 2080ti 11 GB). The new Spacy3 project.yml approach to training directly uses…
Ronald Luc
  • 1,088
  • 7
  • 17
4
votes
0 answers

Convert/wrap Spacy model into a Tensorflow model

Is it possible to convert a saved spacy model into a TensorFlow model? Spacy provides a wrapper to work with TensorFlow models. Mentioned here. Can this be done the other way around? I need to convert a Spacy model into a TensorFlow model or wrap it…
4
votes
1 answer

Why doesn't the spaCy morphologizer work when we use a custom tokenizer?

I don't understand why, when I'm doing this: import spacy from copy import deepcopy nlp = spacy.load("fr_core_news_lg") class MyTokenizer: def __init__(self, tokenizer): self.tokenizer = deepcopy(tokenizer) def __call__(self, text): …
Vee
  • 297
  • 1
  • 7
4
votes
1 answer

Spacy v3 - ValueError: [E030] Sentence boundaries unset

I'm training an entity linker model with spacy 3, and am getting the following error when running spacy train: ValueError: [E030] Sentence boundaries unset. You can add the 'sentencizer' component to the pipeline with: nlp.add_pipe('sentencizer').…
Jon Flynn
  • 440
  • 6
  • 15