For questions specific to spaCy version 3, an Industrial-Strength Natural Language Processing tool in Python. Use the more generic tag `spacy` for general questions about spaCy.
Questions tagged [spacy-3]
334 questions
12
votes
3 answers
Can't load spacy en_core_web_trf
As the guide says, I've installed it (in a conda environment) with:
conda install -c conda-forge spacy
python -m spacy download en_core_web_trf
I have spacy-transformers already installed. But when I simply do:
import…

Omar
- 1,029
- 2
- 13
- 33
9
votes
1 answer
Difference between spacy v3 en_core_web_trf pipeline and en_core_web_lg pipeline
I am running some performance tests with spacy version 3 to right-size my production instances. I am observing the following:
Observation:
Model name | Time without NER | Time with NER | Comments
en_core_web_lg | 4.89 seconds | 21.9 seconds | NER…

ryk
- 133
- 2
- 9
7
votes
2 answers
SpaCy: Set entity information for a token which is included in more than one span
I am trying to use SpaCy for entity context recognition in the world of ontologies. I'm a novice at using SpaCy and just playing around for starters.
I am using the ENVO Ontology as my 'patterns' list for creating a dictionary for entity…

hrshd
- 941
- 2
- 13
- 19
7
votes
2 answers
Spacy 3 Confidence Score on Named-Entity recognition
I need to get a confidence score for the tags predicted by the 'de_core_news_lg' NER model. There was a well-known solution to this problem in Spacy 2:
nlp = spacy.load('de_core_news_lg')
doc = nlp('ich möchte mit frau Mustermann in der Musterbank…

Keyvan Sadri
- 71
- 1
- 2
6
votes
1 answer
ValueError: [E143] Labels for component 'tagger' not initialized
I've been following this tutorial to create a custom NER. However, I keep getting this error:
ValueError: [E143] Labels for component 'tagger' not initialized. This can be fixed by calling add_label, or by providing a representative batch of…

Diana
- 363
- 2
- 8
6
votes
2 answers
How to get a description for each Spacy NER entity?
I am using a Spacy NER model to extract named entities relevant to my problem from a text, such as DATE, TIME, and GPE, among others.
For example, I need to recognize the Time Zone in the following sentence:
"Australian Central Time"
With Spacy…

Emiliano Viotti
- 1,619
- 2
- 16
- 30
6
votes
3 answers
AttributeError: module 'spacy' has no attribute 'load'
import spacy
nlp = spacy.load('en_core_web_sm')
Error: Traceback (most recent call last):
  File "C:\Users\PavanKumar\.spyder-py3\ExcelML.py", line 27, in <module>
    nlp = spacy.load('en_core_web_sm')
AttributeError: module 'spacy' has no…

Arvind
- 71
- 1
- 4
6
votes
1 answer
Warning: [W108] The rule-based lemmatizer did not find POS annotation for the token 'This'
What is this message about? How do I remove this warning message?
import scispacy
import spacy
import en_core_sci_lg
from spacy_langdetect import LanguageDetector
from spacy.language import Language
from spacy.tokens import Doc
def…

Shiva Sharma
- 145
- 1
- 2
- 5
6
votes
1 answer
SpaCy 3 Transformer Vector Token Alignment
I'm using SpaCy 3.0.1 together with the transformer model (en_core_web_trf).
When I previously used SpaCy transformers it was possible to get the transformer vectors from a Token or Span.
In SpaCy 3 however it seems like you can only access the…

MBT
- 21,733
- 19
- 84
- 102
5
votes
2 answers
Using spacy v3, which parameter should I change in the config file to resolve a CUDA out-of-memory problem? batch_size vs max_length vs batcher.size
Using spacy v3, I tried to train a classifier using camemBert and got a CUDA out-of-memory error.
To resolve this issue, I read that I should decrease the batch size, but I'm confused about which parameter I should change among:
[nlp]…

Marien
- 117
- 5
5
votes
2 answers
Update the built-in NER model of Spacy instead of overwriting it
I am using a built-in Spacy model, en_core_web_lg, and want to train it using my custom entities. While doing that, I am facing two issues:
It overwrites the new trained data with the old one and results in not recognizing the other…

sodmzs1
- 324
- 1
- 12
5
votes
0 answers
How to train Spacy3 project with FP16 mixed precision
The goal is to run python -m spacy train with FP16 mixed precision to enable the use of large transformers (roberta-large, albert-large, etc.) in limited VRAM (RTX 2080ti 11 GB).
The new Spacy3 project.yml approach to training directly uses…

Ronald Luc
- 1,088
- 7
- 17
4
votes
0 answers
Convert/wrap Spacy model into a Tensorflow model
Is it possible to convert a saved spacy model into a TensorFlow model? Spacy provides a wrapper to work with TensorFlow models. Mentioned here.
Can this be done the other way around? I need to convert a Spacy model into a TensorFlow model or wrap it…

Shashank Yadav
- 174
- 1
- 11
4
votes
1 answer
Why doesn't the spacy morphologizer work when we use a custom tokenizer?
I don't understand why, when I'm doing this
import spacy
from copy import deepcopy
nlp = spacy.load("fr_core_news_lg")
class MyTokenizer:
    def __init__(self, tokenizer):
        self.tokenizer = deepcopy(tokenizer)
    def __call__(self, text):
        …

Vee
- 297
- 1
- 7
4
votes
1 answer
Spacy v3 - ValueError: [E030] Sentence boundaries unset
I'm training an entity linker model with spacy 3, and am getting the following error when running spacy train:
ValueError: [E030] Sentence boundaries unset. You can add the 'sentencizer' component to the pipeline with: nlp.add_pipe('sentencizer').…

Jon Flynn
- 440
- 6
- 15