I need help doing something specific with a String in Java. The easiest way for me to explain it is with an example.
I want to extract skip bi-grams from two sentences (user input) and then compare the two results to see how similar the sentences are.
Sentence #1: "I love green apples."
Sentence #2: "I love red apples."
There is also a variable named "distance" that controls the distance allowed between the two words of a pair. (It is not very important at the moment.)
Results
The skip bi-grams extracted from Sentence #1 using a distance of 3 would be:
{I love}, {I green}, {I apples}, {love green}, {love apples}, {green apples}
(Total of 6 bi-grams)
The skip bi-grams extracted from Sentence #2 using a distance of 3 would be:
{I love}, {I red}, {I apples}, {love red}, {love apples}, {red apples}
(Total of 6 bi-grams)
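For reference, the pairing rule implied by the two lists above could be sketched roughly as follows. This is only a sketch under assumptions: it treats "distance" as the maximum number of word positions between the two words of a pair (which reproduces the six pairs for a four-word sentence with a distance of 3), and it drops punctuation before splitting. The class name SkipBigrams and the method name extract are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SkipBigrams {
    // Assumption: "distance" is the maximum gap, in word positions,
    // between the first and second word of a pair.
    static List<String> extract(String sentence, int distance) {
        // Assumption: punctuation is not part of a word, so keep only
        // letters, digits, and whitespace before splitting.
        String cleaned = sentence.replaceAll("[^\\p{L}\\p{N}\\s]", "").trim();
        List<String> bigrams = new ArrayList<>();
        if (cleaned.isEmpty()) {
            return bigrams; // nothing to pair
        }
        String[] words = cleaned.split("\\s+");
        // Pair each word with every later word that is at most "distance" positions away.
        for (int i = 0; i < words.length; i++) {
            for (int j = i + 1; j < words.length && j - i <= distance; j++) {
                bigrams.add(words[i] + " " + words[j]);
            }
        }
        return bigrams;
    }
}
```

Under these assumptions, SkipBigrams.extract("I love green apples.", 3) returns the six pairs listed for Sentence #1, in the same order. Note that stripping punctuation this way also removes apostrophes inside words, which may or may not be wanted.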
So far, I have thought of using a String[] to hold the words of each sentence after splitting it.
So my question is: what code would extract those bi-grams from the sentences?
Thanks in advance!