I'm trying to use Scrapy to build a web scraper, but I'm running into many problems because it uses Python 2. Is it possible to run the 2to3 command on all the files in the tarball at once? Would that cause unforeseen errors? Is there an alternative web-scraping framework that is more up to date and more functional that you would recommend instead?

I ask because there doesn't seem to be much recent activity on forums about the problems inherent in running version 0.24 of Scrapy, i.e. the fact that it's written in Python 2.

If Scrapy is the best choice and porting is a bad idea, what's the best way to run it on my Python 3 oriented machine? Is there a command to run it only with Python 2, or something I can change in a config file?

UPDATE

If you have this problem, run the setup.py script with Python 2, i.e.,

python2 setup.py install

After that, Scrapy works.

(as indicated by @alecxe)
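
For anyone who wants to script that step, here is a minimal sketch (not part of the original advice) that looks for a Python 2 interpreter on PATH and builds the same `python2 setup.py install` invocation. The helper names `find_python2`, `build_install_command` and `install_with_python2` are hypothetical, and the candidate interpreter names are assumptions about how Python 2 is installed on your machine.

```python
import shutil
import subprocess


def find_python2(candidates=("python2.7", "python2")):
    """Return the path of the first candidate interpreter found on PATH, or None.

    The candidate names are assumptions; adjust them for your system.
    """
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_install_command(interpreter, setup_script="setup.py"):
    """Build the argument list equivalent to `python2 setup.py install`."""
    return [interpreter, setup_script, "install"]


def install_with_python2(source_dir):
    """Run setup.py from the unpacked source directory using Python 2."""
    interpreter = find_python2()
    if interpreter is None:
        raise RuntimeError("No Python 2 interpreter found on PATH")
    # cwd must be the directory that contains setup.py.
    subprocess.check_call(build_install_command(interpreter), cwd=source_dir)
```

Calling `install_with_python2` with the path of the unpacked Scrapy tarball should have the same effect as running the command above by hand from that directory. It fails early with a clear error if no Python 2 interpreter is found.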

smatthewenglish
  • Are you saying that you are making an attempt to port Scrapy to Python3? The problem is that `Scrapy` is based on `twisted` and the latter is not there yet. – alecxe Feb 08 '15 at 04:00
  • yeah, exactly. would that work? is there just a newer more reliable scraper i can use instead? – smatthewenglish Feb 08 '15 at 04:03

1 Answer

The problem with porting Scrapy to Python 3 is that Scrapy is built on top of the Twisted event-driven framework, which does not yet fully support Python 3.

There is no web-scraping framework on Python 3 as big and mature as Scrapy. pyspider looks promising, though it works a bit differently, see:

Also, there are other libraries related to web-scraping and html-parsing that support Python 3:

alecxe
  • so, how could i run scrapy on my machine? – smatthewenglish Feb 08 '15 at 04:30
  • @flavius_valens well, follow the [installation guide](http://scrapy.readthedocs.org/en/latest/intro/install.html) or am I missing something? Thanks. – alecxe Feb 08 '15 at 04:31
  • yeah i did that, but it's giving me all kinds of problems, related to python 3 things, is there a command to run it only with python 2 or something? – smatthewenglish Feb 08 '15 at 04:33
  • @flavius_valens you need to have Python 2.7 installed and install Scrapy into that Python2.7 environment. – alecxe Feb 08 '15 at 04:35