I am currently reading text from an Excel file and generating bigrams from it. In the sample code below, finalList holds the list of input words read from the Excel file.
I removed the stopwords from the input with the help of the following library:
from nltk.corpus import stopwords
The bigram logic is then applied to the list of input words:
from nltk.util import ngrams
bigram = ngrams(finalList, 2)
Input text: I completed my end-to-end process.
Current output: Completed end, end end, end process.
Desired output: completed end-to-end, end-to-end process.
In other words, certain hyphenated groups of words such as "end-to-end" should be treated as a single word when the bigrams are built, instead of being split at the hyphens.
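One possible direction, shown only as a minimal sketch: the current output suggests the text is split at hyphens before it reaches the bigram step, so keeping hyphenated words intact during tokenization may be enough. The sketch below assumes a regex-based tokenizer in place of whatever splitting is done today. The names STOP_WORDS, TOKEN_PATTERN, tokenize_keep_hyphens and bigrams are hypothetical, the tiny stopword set is a stand-in for stopwords.words("english"), and zip(tokens, tokens[1:]) is used so the example runs without NLTK (it yields the same pairs as ngrams(tokens, 2)).

```python
import re

# Hypothetical stand-in for nltk's stopwords.words("english"); kept tiny
# so the sketch runs without downloading any NLTK data.
STOP_WORDS = {"i", "my", "to", "the", "a", "an"}

# A token is a run of word characters, optionally continued by "-word"
# parts, so "end-to-end" is matched as one token instead of three.
TOKEN_PATTERN = re.compile(r"\w+(?:-\w+)*")


def tokenize_keep_hyphens(text):
    """Lowercase the text, keep hyphenated words whole, drop stopwords."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bigrams(tokens):
    """Pair each token with its successor (same pairs as ngrams(tokens, 2))."""
    return list(zip(tokens, tokens[1:]))


final_list = tokenize_keep_hyphens("I completed my end-to-end process.")
print(bigrams(final_list))
# [('completed', 'end-to-end'), ('end-to-end', 'process')]
```

Because the stopword filter runs on whole tokens, the "to" inside "end-to-end" is not removed. If the words arrive already split from the Excel file, the same pattern would have to be applied to the raw cell text before any other splitting.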