I am working on an NLP problem, and in one part of it I need to count co-occurrences of words that are adjacent to, or within two positions of, each other. That is, for a selected word, I want to check the two words after it and the two words before it. For example, suppose we have this sentence: "depending on the weather, i will go to park if the weather is sunny tomorrow."
Then we have a list: x = [',', 'i', 'to']. For example, for the word 'will' I assume a window of size 4 (two tokens on each side), so the selected words are: ',', 'i', 'go', 'to'. Then I check whether these words are in my list x. I should do this for every word in the sentence except the words that are themselves in the list. This concept is very close to finding n-grams and to TrigramCollocationFinder in nltk, but it is not exactly what I want. The picture below shows what I actually want.
Is there any library or function that I can use to solve my problem (from NLTK or Scikit)? Can anyone suggest an idea for implementing it?
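To make the intended behaviour concrete, here is a minimal plain-Python sketch of the counting described above. It is only an illustration under some assumptions: the sentence is already tokenized by hand (punctuation as separate tokens), the function name `window_cooccurrence` and the parameter `k` are made up for this example, and windows are simply truncated at the sentence boundaries.

```python
from collections import Counter


def window_cooccurrence(tokens, targets, k=2):
    """For each token NOT in `targets`, count how often each target word
    appears within k positions before or after it (hypothetical helper)."""
    targets = set(targets)
    counts = {}
    for i, word in enumerate(tokens):
        if word in targets:
            continue  # words from the list are never used as the centre word
        left = tokens[max(0, i - k):i]    # up to k tokens before position i
        right = tokens[i + 1:i + 1 + k]   # up to k tokens after position i
        hits = [w for w in left + right if w in targets]
        if hits:
            counts.setdefault(word, Counter()).update(hits)
    return counts


# Hand-tokenized version of the example sentence (an assumption, not real tokenizer output).
tokens = ["depending", "on", "the", "weather", ",", "i", "will", "go", "to",
          "park", "if", "the", "weather", "is", "sunny", "tomorrow", "."]
x = [",", "i", "to"]

counts = window_cooccurrence(tokens, x)
print(counts["will"])
```

For 'will' the window is ',', 'i', 'go', 'to', so the three list words ',', 'i' and 'to' are each counted once. Repeated centre words such as 'weather' accumulate counts over all of their occurrences.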