Questions tagged [zyte]
13 questions
2
votes
2 answers
Why isn't Puppeteer page.click waiting (maybe Browserless?)
Goal:
I have a page that I need to get the HTML from after first clicking something on the page.
Issue:
The HTML that comes back does not wait for the result of that element click.
Here's one way that I've tried to do it.
await page.setViewport({width: 1400, height:…

dizzy
- 1,177
- 2
- 12
- 34
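The usual fix for this class of problem is to wait explicitly for the post-click state (a selector or a navigation) before reading the page HTML, rather than assuming page.click resolves only once the page has updated. A minimal sketch of that pattern, shown here in Python with Playwright as an analogue to the Puppeteer code above since the rest of this page's examples are Python; the URL and selectors are placeholders.

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page(viewport={"width": 1400, "height": 900})
    page.goto("https://example.com")        # placeholder URL

    # Click, then explicitly wait for whatever the click is supposed to
    # reveal before reading the page source.
    page.click("#load-more")                # placeholder selector
    page.wait_for_selector("#results")      # placeholder selector

    html = page.content()                   # now includes the post-click DOM
    browser.close()
```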
1
vote
1 answer
Sending a request through a proxy: the request library works, axios does not
I am trying to update some old code to get rid of the request package since it is no longer maintained. I attempted to replace a proxy request with axios, but it doesn't work (I just get a timeout). Am I missing an axios config somewhere? The…

Rilcon42
- 9,584
- 18
- 83
- 167
1
vote
2 answers
Requests fail with 504: Gateway Time-out when using scrapy-splash in docker compose with Zyte
I'm trying to scrape a site that partially renders its content using JS.
I went ahead and found this project: https://github.com/scrapinghub/sample-projects/tree/master/splash_smart_proxy_manager_example, which quite neatly explains how to set things…

Odif Yltsaeb
- 5,575
- 12
- 49
- 80
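For reference, a minimal sketch of the scrapy-splash wiring this setup relies on, assuming Splash runs as a `splash` service on the same docker-compose network and that the 504 is a timeout somewhere between Scrapy, Splash and Smart Proxy Manager; the service name and timeout value are assumptions, not taken from the question.

```python
# settings.py (sketch)
SPLASH_URL = "http://splash:8050"   # docker-compose service name, not localhost

DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}
SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"

# Splash + Smart Proxy Manager round trips are slow; a short timeout at any
# hop surfaces as a 504 before the render finishes.
DOWNLOAD_TIMEOUT = 180
```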
1
vote
0 answers
I get the error "ImportError: libtk8.6.so: cannot open shared object file: No such file or directory" while deploying my python app to Zyte
I searched for this question on the internet, and most of the solutions suggest installing tkinter. Tkinter has been installed, but the error still persists. Please could someone guide me on this?

Rija Shaheed
- 69
- 2
- 7
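If the tkinter import is being pulled in indirectly, most often through matplotlib on a headless container, one workaround is to force a non-GUI backend so libtk is never loaded. This is only a hedged sketch of that assumption; if the code imports tkinter directly, a different fix (such as a custom image with the Tk system libraries) is needed.

```python
# Force matplotlib's headless Agg backend before pyplot is imported, so the
# Tk backend (and therefore libtk8.6.so) is never touched.
import matplotlib
matplotlib.use("Agg")

import matplotlib.pyplot as plt  # safe on a headless host now
```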
0
votes
1 answer
Scrapy spider working locally but resulting in a 403 error when running on Zyte
The spider is set up so that it reads the links to scrape, makes a POST request, and then the data is parsed.
The spider collects data locally, but when deployed to Zyte it results in the error shown below.
…

FalloutATS21
- 53
- 7
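A 403 that only appears on Zyte usually means the target site is rejecting the datacenter request fingerprint rather than anything in the spider logic. A hedged first-step sketch for settings.py; the header values are illustrative and may not be enough on their own (the POST body and cookies from the local run are worth comparing too).

```python
# settings.py (sketch) -- send browser-like headers instead of Scrapy defaults
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)

DEFAULT_REQUEST_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}

DOWNLOAD_DELAY = 1.0   # slow down; burst traffic from cloud IPs is an easy block
```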
0
votes
1 answer
I'm having an issue while deploying my scraper to Zyte (formerly Scrapinghub)
My spider has to read some data from an input.csv file. It runs fine locally, but when I deploy it to Zyte with shub deploy, input.csv is not included in the build.
So when I try to run it on the server, it produces the following error.
Traceback (most…

Muhammad Ahmad
- 11
- 4
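shub deploy builds a Python egg from setup.py, so only files declared as package data end up on the server; a bare input.csv sitting next to the spider is silently dropped. A sketch of the usual fix, with the package and path names as placeholders:

```python
# setup.py (sketch) -- package names and paths are placeholders
from setuptools import setup, find_packages

setup(
    name="project",
    version="1.0",
    packages=find_packages(),
    # Ship the CSV inside the package so it survives the egg build.
    package_data={"myproject": ["resources/input.csv"]},
    include_package_data=True,
    zip_safe=False,
    entry_points={"scrapy": ["settings = myproject.settings"]},
)
```

Inside the spider the file is then read with `pkgutil.get_data("myproject", "resources/input.csv")` rather than `open("input.csv")`, since the working directory on Scrapy Cloud is not the project root.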
0
votes
1 answer
How to save Scrapy Broad Crawl Results?
Scrapy has a built-in way of persisting results to AWS S3 using the FEEDS setting,
but for a broad crawl over different domains this would create a single file where the results from all domains are saved.
How could I save the results of each…

NightOwl
- 1,069
- 3
- 13
- 23
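One way to do this in recent Scrapy (2.6+) is to declare one feed per domain and attach an item_filter that only accepts that domain's items. The sketch below assumes each item carries the domain it came from in a dict-like "domain" field; the bucket name and domain list are placeholders, and the custom `domain` feed option is an assumption about how to pass the value through, not documented Scrapy behaviour.

```python
# settings.py (sketch, Scrapy >= 2.6)
from scrapy.extensions.feedexport import ItemFilter

class DomainFilter(ItemFilter):
    """Accept only items whose 'domain' field matches this feed's domain."""
    def accepts(self, item):
        # Assumes dict-like items with a 'domain' field set by the spider.
        return item.get("domain") == self.feed_options.get("domain")

DOMAINS = ["example.com", "example.org"]   # placeholder domain list

FEEDS = {
    f"s3://my-bucket/{domain}/%(time)s.json": {
        "format": "json",
        "item_filter": DomainFilter,
        "domain": domain,   # custom key read back inside DomainFilter
    }
    for domain in DOMAINS
}
```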
0
votes
1 answer
Why is there an error installing csv when it's part of the Python core packages in Scrapinghub?
I have 3 spiders defined.
All the related requirements are mentioned in requirements.txt
scrapy
pandas
pytest
requests
google-auth
functions-framework
shub
msgpack-python
Also, scrapinghub.yml is defined to use scrapy 2.5:
project:…

Avirup Das
- 189
- 1
- 3
- 15
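The usual cause of this error is that `csv` was listed in requirements.txt: it is part of the Python standard library, so pip has no package to install and the build step fails. A sketch of the requirements file with only the installable third-party packages from the question left in (versions unpinned here for brevity):

```
# requirements.txt (sketch) -- do NOT list standard-library modules such as
# csv or json here; pip cannot install them and the Zyte build will fail.
scrapy
pandas
pytest
requests
google-auth
functions-framework
shub
msgpack-python
```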
0
votes
1 answer
401 Client Error: Unauthorized for url: https://storage.scrapinghub.com/collections
When I run a spider in Scrapy Cloud Projects I get this error:
401 Client Error: Unauthorized for url: https://storage.scrapinghub.com/collections/569447/s/casti
Do you have any idea why?
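A 401 from storage.scrapinghub.com almost always means the Collections request is going out without a valid project API key. A hedged sketch using the python-scrapinghub client, with the key as a placeholder and the project id and store name taken from the error URL above:

```python
from scrapinghub import ScrapinghubClient

# The key can also be supplied via the SH_APIKEY environment variable
# instead of being passed explicitly.
client = ScrapinghubClient("YOUR_API_KEY")      # placeholder key
project = client.get_project(569447)            # project id from the error URL
store = project.collections.get_store("casti")  # store name from the error URL

store.set({"_key": "example", "value": 42})     # write one record
print(store.get("example"))                     # read it back
```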
0
votes
2 answers
Scrapinghub scrapy: ModuleNotFoundError: No module named 'pandas'
I have tried deploying to Zyte via the command line and GitHub, but I have been stuck with the above error.
I have tried different Scrapy versions, from 1.5 to 2.5, but the error still persists.
I have also tried setting my Scrapinghub.yml to the…

chuky pedro
- 756
- 1
- 8
- 26
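This error normally means the deploy is not picking up a requirements file at all, so pandas never gets installed on the Scrapy Cloud image. A sketch of the scrapinghub.yml that ties the two together; the project id and stack name are placeholders:

```yaml
# scrapinghub.yml (sketch) -- project id and stack are placeholders
project: 123456
stacks:
  default: scrapy:2.5
requirements:
  file: requirements.txt   # must list pandas (and every other import)
```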
0
votes
1 answer
Scrapinghub/Zyte: Unhandled error in Deferred: No module named 'scrapy_user_agents'
I'm deploying my Scrapy spider from my local machine to Zyte Cloud (formerly Scrapinghub). This is successful. When I run the spider I get the output below.
I already checked here. The Zyte team does not seem very responsive on their own site, but…

Adam
- 6,041
- 36
- 120
- 208
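The missing module is the scrapy-user-agents PyPI package: it has to appear in the requirements file that scrapinghub.yml points at, not just in settings. For reference, the settings side as documented by that package, with everything else assumed unchanged:

```python
# settings.py (sketch) -- also add `scrapy-user-agents` to requirements.txt,
# otherwise the import fails on Scrapy Cloud exactly as in the log above.
DOWNLOADER_MIDDLEWARES = {
    # disable the built-in user agent middleware
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    # rotate user agents via scrapy-user-agents
    "scrapy_user_agents.middlewares.RandomUserAgentMiddleware": 400,
}
```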
-1
votes
1 answer
Is it possible to create a proxy failover with Python Scrapy?
Is it possible to create a proxy failover within Scrapy, so that when one proxy fails another will take over scraping the rest of the requests? I would have thought that it would be done using the retry middleware, but I don't really have a clue how to…

webbie1985
- 1
- 2
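The retry middleware is indeed the usual hook: subclassing it and swapping the proxy before the request is re-scheduled gives a simple failover. A hedged sketch; the proxy URLs are placeholders, and `retry_http_codes` / `_retry` are Scrapy's existing retry internals rather than a public failover API.

```python
# middlewares.py (sketch)
import random

from scrapy.downloadermiddlewares.retry import RetryMiddleware
from scrapy.utils.response import response_status_message

BACKUP_PROXIES = [
    "http://user:pass@proxy-a.example.com:8000",   # placeholder
    "http://user:pass@proxy-b.example.com:8000",   # placeholder
]

class ProxyFailoverMiddleware(RetryMiddleware):
    def process_response(self, request, response, spider):
        if response.status in self.retry_http_codes:
            # Point the retried request at a different proxy before requeueing.
            request.meta["proxy"] = random.choice(BACKUP_PROXIES)
            reason = response_status_message(response.status)
            return self._retry(request, reason, spider) or response
        return response

# settings.py:
# DOWNLOADER_MIDDLEWARES = {
#     "scrapy.downloadermiddlewares.retry.RetryMiddleware": None,
#     "myproject.middlewares.ProxyFailoverMiddleware": 550,
# }
```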
-1
votes
1 answer
How can I add a new spider arg to my own template in Scrapy/Zyte
I am working on a paid proxy spider template and would like the ability to pass in a new argument on the command line for a Scrapy crawler. How can I do that?

Gregory Williams
- 453
- 5
- 18
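Spider arguments are the standard mechanism here: anything passed with `-a` on the command line (or in the job's spider arguments on Zyte) arrives as a keyword argument to the spider's `__init__`. A small sketch with placeholder names:

```python
import scrapy

class PaidProxySpider(scrapy.Spider):
    """Template spider that takes the proxy URL as a command-line argument."""
    name = "paid_proxy"   # placeholder

    def __init__(self, proxy_url=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.proxy_url = proxy_url   # e.g. http://user:pass@host:port

    def start_requests(self):
        meta = {"proxy": self.proxy_url} if self.proxy_url else {}
        yield scrapy.Request("https://example.com", meta=meta)   # placeholder URL

    def parse(self, response):
        yield {"url": response.url, "status": response.status}
```

Run it locally with `scrapy crawl paid_proxy -a proxy_url=http://user:pass@host:8000`; on Zyte the same `proxy_url=...` pair goes into the job's spider arguments.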