I know about BERT and other solutions where you mask some words and try to predict them. But let's say I have a text like this:
Transformer have taken the of Natural Processing by storm, transforming the field by leaps and bounds. New, bigger, and better models to crop up almost every , benchmarks in performance across a wide variety of tasks.
I cannot tell BERT in advance where the masks should go. I am looking for an algorithm that can detect where words are missing and then predict them.
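
To make the problem concrete, here is a minimal sketch of the only naive approach I can think of: put a mask at every gap between words and let a masked language model judge each candidate. The function name `mask_candidates` and the `[MASK]` default are my own assumptions for illustration (the token matches BERT's convention), and whitespace splitting is a simplification of real tokenization.

```python
def mask_candidates(text, mask_token="[MASK]"):
    """Return one candidate string per gap, each with a single mask inserted.

    Hypothetical helper: a text with n whitespace-separated words has
    n + 1 gaps (before the first word, between words, after the last).
    """
    words = text.split()
    candidates = []
    for i in range(len(words) + 1):
        # Insert the mask token at gap i, leaving the other words untouched.
        candidates.append(" ".join(words[:i] + [mask_token] + words[i:]))
    return candidates


for candidate in mask_candidates("taken the of Natural"):
    print(candidate)
```

Each candidate could then be scored by a masked language model (for example via a fill-mask pipeline), keeping only the positions where the model is confident. That scoring step is not shown here, it needs one model call per gap, and I do not know how to choose a reliable confidence threshold, which is why I am asking whether a better algorithm exists.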