I want to generate n-grams from a sequence of tokens. For example:
bigrams: "1 3 4 5" --> { (1,3), (3,4), (4,5) }
After searching, I found this thread, which used:
def find_ngrams(input_list, n):
    # Zip n copies of the list, each shifted by one more position
    return zip(*[input_list[i:] for i in range(n)])
If I call this function during training, I think it hurts performance noticeably, so I am looking for a faster option.
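For comparison, here is a minimal sketch of one variant that avoids building n sliced copies of the list by shifting iterators instead. The name find_ngrams_lazy is made up for illustration, and I have not measured whether this is actually faster for my data; for short lists the slicing version may well win, so it would need to be timed (for example with timeit).

```python
from itertools import islice, tee

def find_ngrams_lazy(tokens, n):
    # Hypothetical alternative to find_ngrams: same output, no list slices.
    # tee() gives n independent iterators over the same tokens.
    iterators = tee(tokens, n)
    # Advance the i-th iterator by i positions so the iterators line up
    # as (t0, t1, ...), (t1, t2, ...), and so on.
    shifted = [islice(it, i, None) for i, it in enumerate(iterators)]
    # zip stops at the shortest iterator, so only complete n-grams are produced.
    return zip(*shifted)

print(list(find_ngrams_lazy([1, 3, 4, 5], 2)))  # [(1, 3), (3, 4), (4, 5)]
```

Like the original, this returns a lazy zip object in Python 3, so the n-grams are only materialized when it is iterated over or passed to list().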