NLTK has BigramAssocMeasures, TrigramAssocMeasures, and QuadgramAssocMeasures.
But if I have 5-grams or 6-grams, are there 5gramAssocMeasures or 6gramAssocMeasures in NLTK?
Can someone help?
You have to create them yourself.
Have a look at the source code of the association module. You can find it under <nltk>/metrics/association.py (<nltk> stands for your NLTK path).
Start with
class QuingramAssocMeasures(NgramAssocMeasures):
    """
    A collection of 5-gram association measures.
    ...
    """
or whatever you like to call 5-grams.
Then you need to define the methods specific to the n-gram order, i.e. the ones that raise a NotImplementedError in the abstract class: ._contingency() and ._marginals().
You can peek at the implementations for 3- and 4-grams and build the methods by analogy. It's going to be a huge bulk of local variables though...
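If writing out all 32 cells of a 5-gram table by hand is unappealing, the contingency step can also be done generically with inclusion-exclusion. The following is only a sketch under my own assumptions, not NLTK's code: the function name `contingency_from_marginals` and the bitmask-keyed dict of marginals are hypothetical, and NLTK's nested marginals tuple would first have to be unpacked into that form (check its ordering in association.py rather than trusting this).

```python
def contingency_from_marginals(n, marginals):
    """Turn marginal counts into the 2**n contingency cells of an n-gram.

    Assumed input format (hypothetical, not NLTK's): `marginals` maps a
    bitmask to the number of n-grams that match the target n-gram at every
    position whose bit is set; positions with a clear bit are wildcards.
    So marginals[0] is the total n-gram count and marginals[2**n - 1] is
    the count of the target n-gram itself.

    Returned list: cell index `absent` has bit j set when position j does
    NOT match the target. For n=2 that gives [n_ii, n_oi, n_io, n_oo].
    """
    full = (1 << n) - 1
    cells = []
    for absent in range(1 << n):
        present = full & ~absent
        total = 0
        # Inclusion-exclusion over every subset `sub` of the absent positions:
        # cell = sum over sub of (-1)**|sub| * marginals[present | sub]
        sub = absent
        while True:
            sign = -1 if bin(sub).count("1") % 2 else 1
            total += sign * marginals[present | sub]
            if sub == 0:
                break
            sub = (sub - 1) & absent  # next smaller subset of `absent`
        cells.append(total)
    return cells


# Bigram check: n_ii=3, w1 matches 5 times, w2 matches 7 times, 20 bigrams total.
bigram_cells = contingency_from_marginals(2, {0b11: 3, 0b01: 5, 0b10: 7, 0b00: 20})
print(bigram_cells)  # [3, 4, 2, 11]
```

The same call with n=5 or n=6 needs no extra local variables. The cost is that each call walks all subsets, which is fine for small n but grows quickly.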