
I'm learning about crawlers, and after writing a few basic ones I downloaded the google-scholar-crawler master branch from GitHub to see how it runs. After fixing a few errors, I ran into ModuleNotFoundError: No module named 'proxy', raised by the from proxy import PROXIES line in middleware.py.

The code has several other problems stemming from practices that are no longer supported or advised in Python 3.x, including modules that have since been renamed or moved, but I couldn't determine whether that is also the cause here. I would appreciate any help.

Ryan Schaefer
Z. Black

1 Answer


Assuming you're talking about this crawler: https://github.com/geekan/google-scholar-crawler

I just tried running it on Python 2.7 and had no problems with it. A brief look at the misc module suggests there is a likely problem with relative imports (some background can be found in this question: Relative imports in Python 3).
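For completeness: if you do want to stay on Python 3, that ModuleNotFoundError is the classic symptom of a Python 2-style implicit relative import. A minimal sketch of the usual fix, assuming proxy.py sits in the same package directory as middleware.py (names taken from your question, not verified against the repo's layout):

    # middleware.py
    # Python 2 resolved this as an implicit relative import; Python 3
    # treats it as an absolute import and raises ModuleNotFoundError:
    # from proxy import PROXIES

    # Explicit relative import, which works on both Python 2.7 and 3.x:
    from .proxy import PROXIES

Note that this only works when middleware.py is imported as part of a package (the directory has an __init__.py and the crawler is run through Scrapy), not when the file is executed directly as a script.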

So the short answer is simply to use Python 2.7, which lets you concentrate on understanding how Scrapy crawlers work rather than on language version differences.

UPD: also make sure to remove all of the import pdb; pdb.set_trace() breakpoints in the code.

MrLokans