
I know about BERT and other solutions where you mask some words and try to predict them. But let's say I have a text:

Transformer have taken the of Natural Processing by storm, transforming the field by leaps and bounds. New, bigger, and better models to crop up almost every , benchmarks in performance across a wide variety of tasks.

And I cannot tell BERT in advance where the masks are. I am looking for an algorithm that can figure out where the missing words are and then predict them.


1 Answer


What you can do is, for each position in the text (I'd recommend starting from position 2), check whether the word actually present at that position is among the most probable next words according to the model, like so (a code sketch follows the iterations below):

"Transformer have taken the of Natural Processing by storm [...]"

  1. First iteration:

Input: "Transformer MASK"

Compare: MASK / "have"

  2. Second iteration:

Input: "Transformer have MASK"

Compare: MASK / "taken"

  3. Third iteration:

Input: "Transformer have taken MASK"

Compare: MASK / "the"

  4. Fourth iteration:

Input: "Transformer have taken the MASK"

Compare: MASK / "of" - here the model would probably assign "of" a very low probability. That drop is the signal that this could be the place of a missing word.
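Here is a minimal sketch of that loop using the Hugging Face `transformers` library. The checkpoint (`bert-base-uncased`), the `top_k` cutoff, and the helper name `find_missing_word_positions` are my own assumptions for illustration, not part of the approach itself:

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM

# Assumption: any masked-LM checkpoint works; bert-base-uncased is just an example.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def find_missing_word_positions(text, top_k=50):
    """Mask each position in turn (left context only, as in the
    iterations above) and flag positions where the word actually
    present in the text is not among the model's top_k predictions."""
    words = text.split()
    flagged = []
    for i in range(1, len(words)):  # start from position 2, as recommended
        masked = " ".join(words[:i] + [tokenizer.mask_token])
        inputs = tokenizer(masked, return_tensors="pt")
        # Locate the [MASK] token in the encoded sequence
        mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
        with torch.no_grad():
            logits = model(**inputs).logits
        top_ids = logits[0, mask_pos].topk(top_k, dim=-1).indices[0].tolist()
        # Crude normalization; real text needs proper token alignment
        actual_id = tokenizer.convert_tokens_to_ids(words[i].lower().strip(".,"))
        if actual_id not in top_ids:
            # The word actually present is improbable here: a missing
            # word may precede it.
            flagged.append((i, words[i]))
    return flagged

print(find_missing_word_positions(
    "Transformer have taken the of Natural Processing by storm."))
```

Note that whitespace splitting is crude: a word BERT splits into several WordPiece tokens maps to [UNK] here, so in practice you would compare at the token level, and you could threshold on the probability the model assigns the actual word instead of its top-k rank.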

This post can help you achieve it programmatically: Predicting Missing Words in a sentence - Natural Language Processing Model
