I'm trying to train an NER model with spaCy v3, and I'm getting an error in the Example.from_dict() method. I followed the answers to this earlier question on how to use the Example class.

Here is the code snippet:

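import random

import spacy
from spacy.training import Example

nlp = spacy.blank('en')
if 'ner' not in nlp.pipe_names:
    # In spaCy v3, add_pipe takes the component name and returns the component
    ner = nlp.add_pipe('ner', last=True)
else:
    ner = nlp.get_pipe('ner')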

for _, annotations in TRAIN_DATA:
    for label in annotations['entities']:
        ner.add_label(label[2])

other_pipes = [pipe for pipe in nlp.pipe_names if pipe != 'ner']
with nlp.select_pipes(disable=other_pipes):
    optimizer = nlp.initialize()  # v3 replacement for begin_training()
    for epoch in range(EPOCHS):
        random.shuffle(TRAIN_DATA)
        losses = {}
        print(f'Epoch {epoch+1} of {EPOCHS}:')
        for text, annotations in TRAIN_DATA:
            doc = nlp.make_doc(text)
            example = Example.from_dict(doc, annotations)
            nlp.update([example], drop=0.2, sgd=optimizer, losses=losses) #SGD
        print(losses) #Print losses after each epoch

Above, TRAIN_DATA is a list with tuples like this:

('El Salvador achieved independence from Spain in 1821 and from the Central American Federation in 1839 .',
 {'entities': [(97, 101, 'tim'),
   (39, 44, 'org'),
   (48, 52, 'tim'),
   (66, 93, 'org'),
   (0, 11, 'geo')]})

And finally, this is the error traceback:

TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_34/168282795.py in <module>
      8         for text, annotations in TRAIN_DATA:
      9             doc = nlp.make_doc(text)
---> 10             example = Example.from_dict(doc, annotations)
     11             nlp.update([example], drop=0.2, sgd=optimizer, losses=losses) #SGD
     12         print(losses) #Print losses after each epoch

/opt/conda/lib/python3.7/site-packages/spacy/training/example.pyx in spacy.training.example.Example.from_dict()

/opt/conda/lib/python3.7/site-packages/spacy/training/example.pyx in spacy.training.example.annotations_to_doc()

/opt/conda/lib/python3.7/site-packages/spacy/training/example.pyx in spacy.training.example._add_entities_to_doc()

/opt/conda/lib/python3.7/site-packages/spacy/training/iob_utils.py in offsets_to_biluo_tags(doc, entities, missing)
    102                     biluo[starts[s]] = "O"
    103         else:
--> 104             for token_index in range(start_char, end_char):
    105                 if token_index in tokens_in_ents.keys():
    106                     raise ValueError(

TypeError: 'numpy.float64' object cannot be interpreted as an integer
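
For reference, the failing line in the traceback is range(start_char, end_char), and range() only accepts integers, so the error message suggests that at least some of the entity offsets in TRAIN_DATA are NumPy floats rather than plain Python ints (the tuple printed above may not be representative of every entry). Below is a minimal sketch of how the offsets could be normalized before calling Example.from_dict(). The helper name normalize_entities is hypothetical and not part of spaCy, and plain Python floats stand in for numpy.float64 values so the snippet runs without NumPy.

```python
def normalize_entities(annotations):
    """Return a copy of `annotations` whose entity offsets are plain ints.

    Hypothetical helper for illustration; it is not part of spaCy.
    """
    cleaned = []
    for start, end, label in annotations["entities"]:
        # Refuse to silently truncate fractional offsets such as 3.5.
        if int(start) != start or int(end) != end:
            raise ValueError(f"non-integral offset in {(start, end, label)!r}")
        cleaned.append((int(start), int(end), str(label)))
    # Keep any other annotation keys unchanged.
    return {**annotations, "entities": cleaned}


# Assumption: float offsets stand in for numpy.float64 values here.
# numpy.float64 subclasses Python's float, and range() rejects both.
sample = {"entities": [(97.0, 101.0, "tim"), (0.0, 11.0, "geo")]}
fixed = normalize_entities(sample)
print(fixed["entities"])  # [(97, 101, 'tim'), (0, 11, 'geo')]
```

In the training loop, this would presumably be used as Example.from_dict(doc, normalize_entities(annotations)). I have not confirmed that this is the only problem with the data.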

This is my first question on Stack Overflow. I hope I've provided all the information needed. Thanks in advance!
