Recently I started getting this error whenever I run my notebook:

ModuleNotFoundError: No module named 'pytextrank'

Here is the link to my notebook: https://colab.research.google.com/github/neomatrix369/awesome-ai-ml-dl/blob/master/examples/better-nlp/notebooks/jupyter/better_nlp_summarisers.ipynb#scrollTo=-dJrJ54a3w8S

Although checks show that the library is installed, the Python import fails. I ran into this once before in a different scenario and fixed it with:

python -m pip install pytextrank

But this has no effect here; the error persists.
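
One way to narrow down an "installed but not importable" situation is to ask the running kernel which interpreter it uses and whether that interpreter can locate the module. The following is a minimal sketch, not code from the notebook; `describe_module` is a hypothetical helper name introduced only for illustration.

```python
import importlib.util
import sys


def describe_module(name):
    """Report whether `name` is importable by the interpreter running this cell.

    `describe_module` is a hypothetical helper, not part of pytextrank or Colab.
    """
    # find_spec returns None when a top-level module cannot be located.
    spec = importlib.util.find_spec(name)
    return {
        "module": name,
        "interpreter": sys.executable,  # the Python binary behind this kernel
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


print(describe_module("pytextrank"))
```

If `found` is `False` while `pip` reports the package as installed, the package was probably installed for a different interpreter than the one shown under `interpreter`.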

This wasn't a problem in the past, and the same notebook used to work, so I think it might be a regression.

Any thoughts? Any useful feedback will be highly appreciated.

Here is the code that I invoke:

import pytextrank
import sys
import networkx as nx
import pylab as plt

And I get this in the Colab cell:

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-7f30423e40f2> in <module>()
      3 sys.path.insert(0, './awesome-ai-ml-dl/examples/better-nlp/library')
      4 
----> 5 from org.neomatrix369.better_nlp import BetterNLP

1 frames
/content/awesome-ai-ml-dl/examples/better-nlp/library/org/neomatrix369/summariser_pytextrank.py in <module>()
----> 1 import pytextrank
      2 import sys
      3 import networkx as nx
      4 import pylab as plt
      5 

ModuleNotFoundError: No module named 'pytextrank'

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
louis_guitton
Mani Sarkar
  • I managed to get around this issue. I had to change my shell script and rebuild the notebook, though strangely I didn't have to do that on Jupyter Notebooks or Kaggle, where the original kernel/notebook just worked out of the box without any changes. With my new changes it now seems to work on all three platforms (Jupyter, Colab and Kaggle), at least the last time I checked. – Mani Sarkar Oct 19 '19 at 18:02
  • This makes me think something on Colab caused it to **NOT** work, and that I have to use a fixed way of installing packages to get this to work. See the change I made: https://github.com/neomatrix369/awesome-ai-ml-dl/commit/430abf9d520ed66b8ad9641593f152c90eddd61f#diff-6846f30138dec5443ea2312e21bc21a1. The latest version looks like this: https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/examples/better-nlp/build/install-dependencies.sh. _I'll encourage the engineering team to reproduce the issue using my git history._ – Mani Sarkar Oct 19 '19 at 18:03

1 Answer

You can run a cell to use pip to install pytextrank explicitly in Colab:

!pip install pytextrank

After that, assuming a data file mih.json has been created or uploaded, the following code runs in Colab:

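import pytextrank

path_stage0 = "mih.json"  # input: the uploaded JSON document
path_stage1 = "o1.json"   # output of stage 1: one parsed paragraph per line

# Stage 1: parse the document and write each parsed paragraph as JSON.
with open(path_stage1, 'w') as f:
    for graf in pytextrank.parse_doc(pytextrank.json_iter(path_stage0)):
        f.write("%s\n" % pytextrank.pretty_print(graf._asdict()))

# Stage 2: build the graph from the parsed text and rank it with TextRank.
graph, ranks = pytextrank.text_rank(path_stage1)

# Print the ranked key phrases.
for rl in pytextrank.normalize_key_phrases(path_stage1, ranks):
    print(pytextrank.pretty_print(rl))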

This generates the following output:

["systems", 0.11805817287949238, [2], "np", 1]
["mixed types", 0.09727027009440953, [31, 24], "np", 1]
["minimal set", 0.0656181165649375, [19, 5], "np", 1]
["considered", 0.05314007683878527, [15], "vbn", 2]
["strict inequations", 0.05291409382374228, [11, 12], "np", 1]
["natural numbers", 0.0502966356772243, [6, 7], "np", 1]
["types", 0.048635135047204764, [24], "nns", 3]
["be", 0.04857443543935028, [14], "vb", 3]
["set", 0.03280905828246875, [5], "nn", 4]
["minimal generating sets", 0.03280905828246875, [19, 23, 5], "np", 1]
["solving", 0.03256884170337274, [30], "vbg", 1]
["solutions", 0.030576007873278972, [20], "nns", 3]
["linear constraints", 0.0271278909527188, [3, 4], "np", 1]
["linear diophantine equations", 0.0271278909527188, [3, 9, 10], "np", 1]
["inequations", 0.02645704691187114, [12], "nns", 2]
["nonstrict inequations", 0.02645704691187114, [13, 12], "np", 1]
["numbers", 0.02514831783861215, [7], "nns", 1]
["used", 0.021117092168160108, [29], "vbn", 1]
["given", 0.020891260341666464, [25], "vbn", 1]
["supporting", 0.014650048747874869, [28], "vbg", 1]
["constraints", 0.0135639454763594, [4], "nns", 1]
["diophantine", 0.0135639454763594, [9], "nnp", 1]
["generating", 0.013543878573169788, [23], "nn", 1]
["algorithms", 0.013397910676526051, [21], "nns", 2]
["equations", 0.013056662983606804, [10], "nns", 1]
["constructing", 0.012594570053681782, [27], "vbg", 1]
["upper bounds", 0.012248038294705636, [16, 17], "np", 1]
["nonstrict", 0.011793629610984586, [13], "nn", 1]
["components", 0.0113294598001366, [18], "nns", 1]
["construction", 0.009991849727289892, [22], "nn", 1]
["compatibility", 0.006124019147352818, [1], "nn", 2]
["bounds", 0.006124019147352818, [17], "nns", 1]
["corresponding", 0.006124019147352818, [26], "vbg", 1]
["criteria", 0.004297554552892375, [8], "nns", 2]
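
If the `!pip` cell appears to succeed but the import still fails, one possible cause is that `pip` installed into a different Python than the one running the kernel. The sketch below works under that assumption: it calls pip through `sys.executable`, so the package lands in the kernel's own interpreter. `pip_install_command` and `ensure_installed` are hypothetical helper names, not part of pytextrank or Colab.

```python
import importlib
import importlib.util
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", "install", package]


def ensure_installed(package, module=None):
    """Import `module`, installing `package` first if it cannot be found.

    Hypothetical helper: assumes the module name equals the package name
    unless `module` is given explicitly.
    """
    module = module or package
    if importlib.util.find_spec(module) is None:
        # Raises CalledProcessError if pip exits with a non-zero status.
        subprocess.check_call(pip_install_command(package))
        # Make the freshly installed package visible to the import system.
        importlib.invalidate_caches()
    return importlib.import_module(module)


# Example usage in a notebook cell (needs network access on first run):
# pytextrank = ensure_installed("pytextrank")
```

Because the install and the import happen in the same process, this avoids any mismatch between the shell's `pip` and the notebook's Python.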
Paco
  • What you are sharing works in all instances, and thanks for that, but if you take an older version of my Google Colab notebook you will hit the same issue I did. The issue is specific to the instances on Google Colab; I don't have it on Kaggle Kernels or on my local machine! – Mani Sarkar Dec 06 '19 at 12:17