I have a matrix, and I am trying to generate a text corpus from it:
           chewbacca  darth  han  leia  luke  obi
chewbacca  0          0      0    0     0.66  0.33
darth      0          0      0    1     0     0
han        0          0      0    0     1     0
leia       0          0      0    0     1     0
luke       0          0      0    0     0     0
obi        0          0      0    0     0     0
I selected the word chewbacca as my first word.
Now I am trying to find a word to pair with chewbacca, based on the probabilities in its row. There are two candidates: luke (0.66) and obi (0.33).
The second word must be chosen according to these weighted probabilities.
For instance, if "luke" pairs with "chewbacca" with probability 0.66 and "obi" pairs with "chewbacca" with probability 0.33, then "luke" must be twice as likely to be selected as "obi".
How should I approach this? I would appreciate any tips!
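
To make the requirement concrete, here is a minimal sketch of the kind of weighted selection I have in mind, assuming Python and the standard library's `random.choices`. The names `WORDS`, `MATRIX`, and `next_word` are just placeholders I made up for illustration, and I am assuming each row lists the weights of the words that may follow the row's word. `random.choices` treats the weights as relative, so it should not matter that 0.66 and 0.33 do not sum exactly to 1.

```python
import random

# Column order of the matrix (same as the row order).
WORDS = ["chewbacca", "darth", "han", "leia", "luke", "obi"]

# Each row holds the weights of the candidate next words, in WORDS order.
MATRIX = {
    "chewbacca": [0, 0, 0, 0, 0.66, 0.33],
    "darth":     [0, 0, 0, 1, 0, 0],
    "han":       [0, 0, 0, 0, 1, 0],
    "leia":      [0, 0, 0, 0, 1, 0],
    "luke":      [0, 0, 0, 0, 0, 0],
    "obi":       [0, 0, 0, 0, 0, 0],
}


def next_word(current, rng=random):
    """Pick a follower of `current` in proportion to its row weights.

    Returns None when the row is all zeros (no follower is defined),
    because random.choices rejects weights whose total is zero.
    """
    weights = MATRIX[current]
    if sum(weights) == 0:
        return None
    return rng.choices(WORDS, weights=weights, k=1)[0]


def generate(start, max_len=10, rng=random):
    """Walk the matrix from `start` until a dead end or `max_len` words."""
    sentence = [start]
    while len(sentence) < max_len:
        word = next_word(sentence[-1], rng)
        if word is None:
            break
        sentence.append(word)
    return sentence
```

Calling `generate("chewbacca")` would then return either `["chewbacca", "luke"]` or `["chewbacca", "obi"]`, with the first roughly twice as frequent as the second, since the rows for luke and obi are all zeros and the walk stops there.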