Highest Voted 'scrapy-shell' Questions

27

votes

3 answers

Scrapy Shell and Scrapy Splash

We've been using scrapy-splash middleware to pass the scraped HTML source through the Splash JavaScript engine running inside a Docker container. If we want to use Splash in the spider, we configure several required project settings and yield a…

asked Feb 11 '16 at 23:56

alecxe

462,703
120
1,088
1,195

20

votes

1 answer

Set headers for scrapy shell request

I know that you can run scrapy shell -s USER_AGENT='custom user agent' 'http://www.example.com' to change the USER_AGENT, but how do you add request headers?

scrapy scrapy-shell

asked May 03 '16 at 17:23

Computer's Guy

5,122
8
54
74

11

votes

3 answers

Why am I getting this error in scrapy - python3.7 invalid syntax

I've had a heck of a time installing Scrapy. I have it installed on my Mac, but I am getting this error when running the tutorial: Virtualenvs/scrapy_env/lib/python3.7/site-packages/twisted/conch/manhole.py", line 154 def write(self, data,…

python python-3.x macos scrapy-shell

asked Feb 19 '18 at 07:25

user3408397

523
1
5
14

11

votes

2 answers

How to disable robots.txt when you launch scrapy shell?

I use the Scrapy shell without problems on several websites, but I run into problems when a site's robots.txt does not allow access. How can I disable Scrapy's robots.txt handling (that is, ignore the file's existence)? Thank you in advance. I'm not talking…

python scrapy web-crawler robots.txt scrapy-shell

asked Nov 26 '16 at 21:49

DARDAR SAAD

392
1
3
17

10

votes

2 answers

How can use scrapy shell with url and basic auth credentials?

I want to use scrapy shell to test the response data for a URL that requires basic auth credentials. I checked the scrapy shell documentation but couldn't find it there. I tried scrapy shell 'http://user:pwd@abc.com' but it didn't work. Does…

python-2.7 scrapy web-crawler basic-authentication scrapy-shell

asked Mar 16 '17 at 02:26

Rohanil

1,717
5
22
47
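
For the basic-auth question above, one possible approach is to build the Authorization header by hand rather than embedding the credentials in the URL. This is only a sketch and is not taken from the question's answers: the helper name make_basic_auth_header is hypothetical, and "user" and "pwd" are the placeholder credentials from the excerpt.

```python
import base64


def make_basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header (hypothetical helper)."""
    # Basic auth is the base64 encoding of "username:password".
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials copied from the question excerpt.
headers = make_basic_auth_header("user", "pwd")
print(headers)
```

Inside the Scrapy shell, a Request created with these headers can then be passed to fetch(), assuming the target site uses HTTP Basic authentication.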

9

votes

3 answers

Scrapy shell against a local file

Before Scrapy 1.0, I could've run the Scrapy Shell against a local file quite simply: $ scrapy shell index.html After upgrading to 1.0.3, it started to throw an error: $ scrapy shell index.html 2015-10-12 15:32:59 [scrapy] INFO: Scrapy 1.0.3…

python shell web-scraping scrapy scrapy-shell

asked Oct 12 '15 at 19:36

alecxe

462,703
120
1,088
1,195
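
For the local-file question above, a workaround that may help is to give the shell an explicit file:// URL instead of a bare file name. The sketch below only builds such a URL with the standard library; the helper name to_file_uri is hypothetical, and index.html is the file name from the excerpt.

```python
from pathlib import Path


def to_file_uri(path: str) -> str:
    """Turn a local path into an absolute file:// URI (hypothetical helper)."""
    # as_uri() requires an absolute path, so anchor the path at the
    # current working directory before resolving it.
    return (Path.cwd() / path).resolve().as_uri()


uri = to_file_uri("index.html")
print(uri)  # pass this value to: scrapy shell <uri>
```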

6

votes

1 answer

Scrapy shell return without response

I have a little problem using Scrapy to crawl a website. I followed the Scrapy tutorial to learn how to crawl a website and wanted to test it on the site 'https://www.leboncoin.fr', but the spider doesn't work. So I tried: scrapy shell…

python python-3.x attributeerror scrapy-shell

asked May 15 '17 at 07:41

Chris PERE

722
7
13

5

votes

1 answer

Scrapy ImagesPipeline WARNING: File (unknown-error): Error downloading image from

I am learning Python and Scrapy, and I am learning how to download images with it. I am kind of stuck right now and I can't figure out what the real problem is. I am getting this error message when I run the spider: Unsupported URL scheme '':…

python scrapy scrapy-shell

asked Mar 21 '15 at 03:27

user1404801

4

votes

1 answer

Scrapy - 301 redirect in shell

I cannot find a solution to the following problem. I am using Scrapy (latest version) and am trying to debug a spider. Using scrapy shell https://jigsaw.w3.org/HTTP/300/301.html -> it does not follow the redirect (it is using a default spider to…

python web-scraping scrapy scrapy-shell

asked Jul 31 '16 at 11:14

Pixelartist

378
5
17

3

votes

1 answer

How can I use scrapy middleware in the scrapy Shell?

In a Scrapy project one uses middleware quite often. Is there a generic way to enable middleware in the Scrapy shell during interactive sessions as well?

scrapy middleware scrapy-shell

asked Jul 10 '22 at 10:41

thinwybk

4,193
2
40
76

3

votes

2 answers

Scrapy: why I can't extract my targeted data from weather underground?

I am new to Python and web scraping, and this is my first ever question on Stack Overflow. I watched several tutorials and then tried to extract data from the table on this page: https://www.wunderground.com/hourly/ir/tehran/date/2021-04-14. The…

python web-scraping scrapy scrapy-shell

asked Apr 13 '21 at 13:22

Neil

49
6

3

votes

1 answer

scrapy downloads the html page but couldn't get data using xpaths or css

I am trying to scrape this page. When I run scrapy shell "https://redsea.com/en/apple-iphone-x-64gb-silver.html", it downloads the HTML page and I can view the downloaded HTML with view(response) in the browser. But when I try to get any data -product…

scrapy scrapy-shell

asked Nov 07 '17 at 17:01

Javed

5,904
4
46
71

3

votes

1 answer

Scrapy Error: 'NotSupported: Unsupported URL scheme '': no handler available for that scheme'

I am trying to scrape a site, but while running the script I'm getting the following error: 'NotSupported: Unsupported URL scheme '': no handler available for that scheme'. If the rule is not wrong, why does it occur and what's your suggestion, please…

web-scraping scrapy scrapy-shell

asked Apr 03 '17 at 20:38

Samsul Islam

2,581
2
17
23
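
The empty scheme ('') in the error above suggests that the URL handed to the downloader had no scheme at all, which typically happens when a relative link is requested as-is. That diagnosis is an assumption, since the excerpt is truncated. The sketch below shows how a relative link can be made absolute with the standard library; the helper name absolutize and the example.com URLs are illustrative only.

```python
from urllib.parse import urljoin, urlparse


def absolutize(base_url: str, link: str) -> str:
    """Resolve a possibly relative link against the page URL (hypothetical helper)."""
    return urljoin(base_url, link)


# Illustrative URLs, not taken from the question.
base = "http://www.example.com/catalog/page.html"
for link in ["../images/photo.jpg", "/about", "http://www.example.com/x"]:
    scheme = urlparse(link).scheme or "(none)"
    print(f"{link!r}: scheme={scheme} -> {absolutize(base, link)}")
```

Within a spider, response.urljoin(link) performs the same kind of resolution against the response URL.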

3

votes

1 answer

python convert chinese characters in url

I have a URL like href="../job/jobarea.asp?C_jobtype=經營管理主管&peoplenumber=151", as shown in Inspect Element. But when opened in a new tab it shows as ../job/jobarea.asp?C_jobtype=%B8g%C0%E7%BA%DE%B2z%A5D%BA%DE&peoplenumber=151 How do I…

python scrapy scrapy-shell

asked Apr 07 '15 at 07:40

Dev Pandu

121
2
12
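
The percent-encoded value in the question above looks like the Big5 byte encoding of the Chinese text rather than UTF-8. That is an assumption about the page's charset, and the excerpt is cut off, so the exact goal of the question is not known. Under that assumption, the conversion in both directions can be sketched with the standard library:

```python
from urllib.parse import quote, unquote

# Percent-encoded query value copied from the question excerpt.
encoded = "%B8g%C0%E7%BA%DE%B2z%A5D%BA%DE"

# Decode the percent-escapes, interpreting the bytes as Big5 (assumed charset).
decoded = unquote(encoded, encoding="big5")
print(decoded)

# Re-encode the text with the same charset to get the browser-style form back.
reencoded = quote(decoded, encoding="big5")
print(reencoded)
```

Bytes that happen to be unreserved ASCII characters, such as g, z, and D here, are left unescaped by quote(), which is why they appear literally in the encoded form.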

2

votes

0 answers

I cant open scrapy shell at anaconda shell

I was trying to start the Scrapy shell on Anaconda, but this error occurred multiple times: [scrapy.core.downloader.handlers] ERROR: Loading "scrapy.core.downloader.handlers.http.HTTPDownloadHandler" for scheme "http" Traceback (most recent call…

python scrapy scrapy-shell

asked Nov 02 '22 at 21:24

DiedierSteeerckx

31
1
