I have a model trained and saved to disk, along with its slow tokenizer:
from transformers import convert_slow_tokenizer
from transformers import BertTokenizer, BertForSequenceClassification

mybert = BertForSequenceClassification.from_pretrained(
    PATH,
    local_files_only=True,
)
tokenizer = BertTokenizer.from_pretrained(
    PATH,
    local_files_only=True,
    use_fast=True,
)
I am able to use it to tokenize like so:
tokenized_example = tokenizer(
    mytext,
    max_length=100,
    truncation="only_second",
    return_overflowing_tokens=True,
    stride=50,
)
However, the resulting tokenizer is not fast:
tokenized_example.is_fast
False
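I suspect this is because use_fast is an AutoTokenizer argument that BertTokenizer silently ignores. For reference, here is a minimal sketch of what I would expect to load a fast tokenizer directly, assuming the files saved at PATH are compatible with the Rust-backed implementation (fast_tokenizer is my own name):

from transformers import AutoTokenizer, BertTokenizerFast

# Assumption: the vocabulary files at PATH are compatible with the
# Rust-backed fast implementation.
fast_tokenizer = BertTokenizerFast.from_pretrained(
    PATH,
    local_files_only=True,
)
# Alternatively, AutoTokenizer does honor use_fast:
# fast_tokenizer = AutoTokenizer.from_pretrained(
#     PATH, local_files_only=True, use_fast=True)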
I try to convert the tokenizer I already loaded to a fast one, and the conversion looks successful:
tokenizer = convert_slow_tokenizer.convert_slow_tokenizer(tokenizer)
However, running the same tokenization call now gives me:
tokenized_example = tokenizer(
    mytext,
    max_length=100,
    truncation="only_second",
    return_overflowing_tokens=True,
    stride=50,
)
TypeError: 'tokenizers.Tokenizer' object is not callable
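From the error, convert_slow_tokenizer apparently returns a bare tokenizers.Tokenizer from the tokenizers library, which has no __call__. The workaround I am considering is wrapping it; a sketch, assuming PreTrainedTokenizerFast accepts the converted object via tokenizer_object (fast_tokenizer is my own name):

from transformers import PreTrainedTokenizerFast

# 'tokenizer' is the bare tokenizers.Tokenizer returned by
# convert_slow_tokenizer above.
fast_tokenizer = PreTrainedTokenizerFast(tokenizer_object=tokenizer)

# Assumption: special-token attributes (pad_token, cls_token, ...) may
# need to be set manually after wrapping.
tokenized_example = fast_tokenizer(
    mytext,
    max_length=100,
    truncation="only_second",
    return_overflowing_tokens=True,
    stride=50,
)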
How can I convert this slow tokenizer to a fast one?
I have seen this answer and I have sentencepiece installed, but that did not fix my issue.