You are right that there is no verbose option, but there is a possible workaround.
The CountVectorizer class performs some pre-processing steps: strip_accents (None by default, as in your case), then lowercase (True by default) and tokenization (tokenizer is None by default, which means the built-in tokenizer based on token_pattern is used). Both of the latter apply in your case because you are using the word analyzer with default parameters.
Assuming you don't care about the progress of the individual pre-processing steps, one way to get information about the progress of the method is to pass your own analyzer as a callable: simply a function that does the same work as the default analyzer but also prints the progress, for example at every 10%.
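A minimal sketch of such an analyzer follows. The names make_progress_analyzer, percent_step, report and vectorize_with_progress are my own for illustration and not part of scikit-learn; only CountVectorizer, its analyzer parameter and build_analyzer() come from the library. It assumes you know the number of documents up front and that you call fit_transform once (calling fit and then transform would run the analyzer over the data twice and throw the counter off).

```python
def make_progress_analyzer(base_analyzer, n_docs, percent_step=10, report=print):
    """Wrap `base_analyzer` so that progress is reported every `percent_step` percent.

    `base_analyzer` is any callable mapping one document to a list of tokens.
    """
    every = max(1, n_docs * percent_step // 100)  # documents between two reports
    seen = 0

    def analyzer(doc):
        nonlocal seen
        seen += 1
        if seen % every == 0 or seen == n_docs:
            report(f"{100 * seen // n_docs}% ({seen}/{n_docs} documents)")
        return base_analyzer(doc)

    return analyzer


def vectorize_with_progress(docs):
    """Run CountVectorizer.fit_transform on a list of strings, printing progress."""
    # Imported here so the wrapper above stays usable without scikit-learn.
    from sklearn.feature_extraction.text import CountVectorizer

    # Reuse the default word analyzer (lowercasing + built-in tokenization).
    base = CountVectorizer().build_analyzer()
    vectorizer = CountVectorizer(analyzer=make_progress_analyzer(base, len(docs)))
    return vectorizer, vectorizer.fit_transform(docs)
```

Because the wrapper delegates to the analyzer built by a default CountVectorizer, the resulting matrix should match what you get without the wrapper. If you use non-default options such as stop_words or ngram_range, set them on the vectorizer that builds the base analyzer, since they are ignored once analyzer is a callable.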
But if you just want a simple way to know how long it will take: since the computation time of CountVectorizer grows roughly linearly with the input size, you can take a randomly sampled fraction of your data (say 10%), process that and time it. The full run should then take about 10 times as long.
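The sampling idea can be sketched as below. The helper estimate_total_seconds and its parameters are hypothetical names of mine, and the estimate relies on the linear-scaling assumption stated above, so treat the result as a rough figure rather than a guarantee. It expects docs to be a list (or another sequence), because random sampling needs to index into it.

```python
import random
import time


def estimate_total_seconds(process, docs, fraction=0.1, seed=0):
    """Time `process` on a random sample of `docs` and extrapolate linearly.

    `process` is any callable that takes a list of documents,
    for example the fit_transform method of a vectorizer.
    """
    sample_size = max(1, round(len(docs) * fraction))
    sample = random.Random(seed).sample(docs, sample_size)

    start = time.perf_counter()
    process(sample)
    elapsed = time.perf_counter() - start

    # Scale the measured time by the ratio of full size to sample size.
    return elapsed * len(docs) / sample_size
```

For instance, passing CountVectorizer().fit_transform as process and your full document list as docs would return an estimate, in seconds, for vectorizing everything.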