I am getting this error and it is preventing my code from running. I tried to filter the warning, but even so it stops my code. After many hours I still have not figured out how to overcome it.
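This is roughly what I mean by filtering the warning, a sketch of what I tried (the message and category here are my guesses based on the traceback below):

import warnings
# attempt to silence the torch deprecation warning (message/category guessed from the output below)
warnings.filterwarnings("ignore", message="floor_divide is deprecated", category=UserWarning)

Here is the full output and traceback: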
Là où les vêtements de sport connectés actuels sont axés sur la performance des sportifs, ici, on aura l'occasion pour des amateurs de se rassurer que les mouvements que nous effectuons sont justes. Cela nous évitera bien des mauvaises surprises (douleurs et autres...) au lendemain d'une activité.
Traceback (most recent call last):
File "/gpfs7kw/linkhome/rech/genlig01/umg16uw/test/expe_5/substitution/augment.py", line 93, in <module>
gen_eda(args.input, output, alpha_sr=alpha_sr, alpha_ri=alpha_ri, alpha_rs=alpha_rs, alpha_rd=alpha_rd, num_aug=num_aug)
File "/gpfs7kw/linkhome/rech/genlig01/umg16uw/test/expe_5/substitution/augment.py", line 80, in gen_eda
aug_sentences = eda(sentence, alpha_sr=alpha_sr, alpha_ri=alpha_ri, alpha_rs=alpha_rs, p_rd=alpha_rd, num_aug=num_aug)
File "/gpfs7kw/linkhome/rech/genlig01/umg16uw/test/expe_5/substitution/substitution.py", line 229, in eda
words = tokenizer(sentence)
File "/gpfs7kw/linkhome/rech/genlig01/umg16uw/test/expe_5/substitution/substitution.py", line 60, in tokenizer
sent_doc = nlp(sentence)
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/spacy/language.py", line 998, in __call__
doc = self.make_doc(text)
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/spacy/language.py", line 1081, in make_doc
return self.tokenizer(text)
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/spacy_stanza/tokenizer.py", line 83, in __call__
snlp_doc = self.snlp(text)
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/stanza/pipeline/core.py", line 231, in __call__
doc = self.process(doc)
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/stanza/pipeline/core.py", line 225, in process
doc = process(doc)
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/stanza/pipeline/mwt_processor.py", line 33, in process
preds += self.trainer.predict(b)
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/stanza/models/mwt/trainer.py", line 79, in predict
preds, _ = self.model.predict(src, src_mask, self.args['beam_size'])
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/stanza/models/common/seq2seq_model.py", line 296, in predict
is_done = beam[b].advance(log_probs.data[b])
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/stanza/models/common/beam.py", line 86, in advance
prevK = bestScoresId // numWords
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/torch/_tensor.py", line 29, in wrapped
return f(*args, **kwargs)
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/torch/_tensor.py", line 575, in __floordiv__
return torch.floor_divide(self, other)
UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at /opt/conda/conda-bld/pytorch_1623448238472/work/aten/src/ATen/native/BinaryOps.cpp:467.)
Exception ignored in: <_io.FileIO name='Test_dolo_augmented.txt' mode='wb' closefd=True>
ResourceWarning: unclosed file <_io.TextIOWrapper name='Test_dolo_augmented.txt' mode='w' encoding='utf-8'>
These are the imports and the relevant code:
# -*- coding: UTF-8 -*-
#!/usr/bin/env python3
import random, pickle, os, csv
import re, string
# import stanza
import spacy_stanza
import warnings

# turn every warning into an exception
warnings.filterwarnings("error")

from random import shuffle

# stanza.download('fr')
nlp = spacy_stanza.load_pipeline('fr', processors='tokenize,mwt,pos,lemma')
random.seed(1)


def tokenizer(sentence):
    # tokenize the sentence with the spacy_stanza pipeline and drop whitespace tokens
    sent_doc = nlp(sentence)
    wds = [token.text for token in sent_doc if token.pos_ != 'SPACE']
    return wds


def lemmatizer(token):
    # return the lemma of the first token produced by the pipeline
    tok = [token.lemma_ for token in nlp(token)]
    tok_lemme = tok[0]
    # print(tok_lemme)
    return tok_lemme


test = "Là où les vêtements de sport connectés actuels sont axés sur la performance des sportifs, ici, on aura l'occasion pour des amateurs de se rassurer que les mouvements que nous effectuons sont justes. Cela nous évitera bien des mauvaises surprises (douleurs et autres...) au lendemain d'une activité."
tokenizer(test)
The problem seems to be linked to stanza, but I do not know why. I installed it with pip; should I uninstall it?
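For reference, this is roughly how I installed the packages (from memory, so the exact command may not be word for word):

pip install stanza spacy-stanza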