
I am getting this error and it is preventing my code from running. I tried to filter the warning, but even so it stops my code. After many hours I still have not figured out how to get past it.
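
If I understand the warnings module correctly, `warnings.filterwarnings("error")` (which my script calls, see the code below) turns every warning into an exception. A minimal, stanza-independent sketch of that behavior:

import warnings

warnings.filterwarnings("error")  # same call as in my script

# any warning issued after this point is raised as an exception
try:
    warnings.warn("floor_divide is deprecated", UserWarning)
except UserWarning as exc:
    print("raised as an exception:", exc)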

The error is raised while tokenizing the long French sentence stored in the `test` string in the code below. Here is the full traceback:
Traceback (most recent call last):
  File "/gpfs7kw/linkhome/rech/genlig01/umg16uw/test/expe_5/substitution/augment.py", line 93, in <module>
    gen_eda(args.input, output, alpha_sr=alpha_sr, alpha_ri=alpha_ri, alpha_rs=alpha_rs, alpha_rd=alpha_rd, num_aug=num_aug)
  File "/gpfs7kw/linkhome/rech/genlig01/umg16uw/test/expe_5/substitution/augment.py", line 80, in gen_eda
    aug_sentences = eda(sentence, alpha_sr=alpha_sr, alpha_ri=alpha_ri, alpha_rs=alpha_rs, p_rd=alpha_rd, num_aug=num_aug)
  File "/gpfs7kw/linkhome/rech/genlig01/umg16uw/test/expe_5/substitution/substitution.py", line 229, in eda
    words = tokenizer(sentence)
  File "/gpfs7kw/linkhome/rech/genlig01/umg16uw/test/expe_5/substitution/substitution.py", line 60, in tokenizer
    sent_doc = nlp(sentence)
  File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/spacy/language.py", line 998, in __call__
    doc = self.make_doc(text)
  File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/spacy/language.py", line 1081, in make_doc
    return self.tokenizer(text)
  File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/spacy_stanza/tokenizer.py", line 83, in __call__
    snlp_doc = self.snlp(text)
  File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/stanza/pipeline/core.py", line 231, in __call__
    doc = self.process(doc)
  File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/stanza/pipeline/core.py", line 225, in process
    doc = process(doc)
  File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/stanza/pipeline/mwt_processor.py", line 33, in process
    preds += self.trainer.predict(b)
  File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/stanza/models/mwt/trainer.py", line 79, in predict
    preds, _ = self.model.predict(src, src_mask, self.args['beam_size'])
  File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/stanza/models/common/seq2seq_model.py", line 296, in predict
    is_done = beam[b].advance(log_probs.data[b])
  File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/stanza/models/common/beam.py", line 86, in advance
    prevK = bestScoresId // numWords
  File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/torch/_tensor.py", line 29, in wrapped
    return f(*args, **kwargs)
  File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/torch/_tensor.py", line 575, in __floordiv__
    return torch.floor_divide(self, other)
UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448238472/work/aten/src/ATen/native/BinaryOps.cpp:467.)
Exception ignored in: <_io.FileIO name='Test_dolo_augmented.txt' mode='wb' closefd=True>
ResourceWarning: unclosed file <_io.TextIOWrapper name='Test_dolo_augmented.txt' mode='w' encoding='utf-8'>



Here is my code, with the libraries I import:


#!/usr/bin/env python3
# -*- coding: UTF-8 -*-

import random, pickle, os, csv
import re, string
#import stanza
import spacy_stanza
import warnings
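# "error" turns every matching warning into an exception, so the
# UserWarning from torch becomes fatal and aborts the run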
warnings.filterwarnings("error")
from random import shuffle

# stanza.download('fr')
nlp = spacy_stanza.load_pipeline('fr', processors='tokenize,mwt,pos,lemma')
random.seed(1)

def tokenizer(sentence):

    sent_doc = nlp(sentence)
    wds = [token.text for token in sent_doc if token.pos_ != 'SPACE']
    return wds
    
def lemmatizer(token):

    # use a separate loop variable so it does not shadow the `token` argument
    lemmas = [t.lemma_ for t in nlp(token)]
    tok_lemme = lemmas[0]
    #print(tok_lemme)

    return tok_lemme

test = "Là où les vêtements de sport connectés actuels sont axés sur la performance des sportifs, ici, on aura l'occasion pour des amateurs de se rassurer que les mouvements que nous effectuons sont justes. Cela nous évitera bien des mauvaises surprises (douleurs et autres...) au lendemain d'une activité."

tokenizer(test)


It seems the problem is linked to stanza, but I do not know why. I installed it with pip; should I uninstall it?
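
For reference, this is how the installed versions can be checked (the package names are taken from the imports and the traceback; `importlib.metadata` is in the standard library on Python 3.8+):

from importlib.metadata import version

# print the versions of the packages that appear in the traceback
for pkg in ("stanza", "spacy-stanza", "torch"):
    print(pkg, version(pkg))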

  • As for me, you may have two different problems: (1) the warning with `floor_divide`, which shouldn't stop the program, and (2) some problem with the file `test_augmented.txt`, which gives an error and stops the program. So you are trying to resolve the `floor_divide` problem, but that is not the problem - you have to find out why you get the problem with `test_augmented.txt` – furas Jan 15 '22 at 04:11
  • I tried to break the problem down and found that the `tokenizer` function I defined is unable to tokenize long sentences, as the example shows. I have updated the question. – kely789456123 Jan 15 '22 at 05:09
  • The code works for me if I use `"ignore"` instead of `"error"`: `warnings.filterwarnings("ignore")` – furas Jan 15 '22 at 21:29
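
Update: following furas's comment, instead of silencing everything I could target just the deprecation message (a minimal sketch; the message pattern is copied from the warning text in the traceback above):

import warnings

# suppress only the floor_divide deprecation warning; `message` is a
# regular expression matched against the start of the warning text
warnings.filterwarnings(
    "ignore",
    message="floor_divide is deprecated",
    category=UserWarning,
)

With this narrower filter, other warnings would still surface normally.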

0 Answers