
I am currently analyzing a very large amount of text. I would like to compute a log-likelihood ratio on two word lists to identify terms whose frequency in the foreground corpus deviates from their frequency in the normative corpus. I have coded the log-likelihood calculation in Python, but running it on bigrams slows my computer down and takes a long time. I just read that I can index my corpora with PyLucene, which should speed up tasks run on the corpora. There is plenty of documentation on how to index, but I read somewhere that a log-likelihood ratio function exists in PyLucene. Does anybody know anything about this function? Thanks in advance.
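
For reference, here is a minimal sketch of the kind of calculation in question, assuming the common two-term log-likelihood formulation for comparing a foreground and a normative corpus, with frequency lists held in plain `collections.Counter` objects. The function names (`log_likelihood`, `keyness`, `bigrams`) are hypothetical, and nothing here uses PyLucene.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Two-term log-likelihood score for a single term.

    a, b: frequency of the term in the foreground / normative corpus
    c, d: total token counts of the foreground / normative corpus
    """
    # Expected frequencies if the term were equally likely in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    # By convention 0 * ln(0) is treated as 0, so zero counts are skipped.
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2.0 * ll


def keyness(foreground, normative):
    """Score every foreground term against the normative frequency list."""
    c = sum(foreground.values())
    d = sum(normative.values())
    return {
        term: log_likelihood(freq, normative.get(term, 0), c, d)
        for term, freq in foreground.items()
    }


def bigrams(tokens):
    """Count adjacent token pairs in a list of tokens."""
    return Counter(zip(tokens, tokens[1:]))


if __name__ == "__main__":
    fg = bigrams("the cat sat on the mat the cat ran".split())
    bg = bigrams("the dog sat on the rug the dog ran".split())
    scores = keyness(fg, bg)
    # Show the bigrams that deviate most from the normative corpus.
    for term, score in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
        print(term, round(score, 3))
```

Computing the corpus totals once and looking each term up in a dictionary keeps the scoring pass linear in the number of distinct terms. If the slowdown comes from recounting bigrams or rescanning the text for every term, that may help before any indexing is introduced.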
