Does Yahoo Finance ban web scraping or not?

Question

User-agent: *
Sitemap: https://finance.yahoo.com/sitemap_en-us_desktop_index.xml
Sitemap: https://finance.yahoo.com/sitemaps/finance-sitemap_index_US_en-US.xml.gz
Disallow: /r/
Disallow: /__rapidworker-1.2.js
Disallow: /__blank
Disallow: /_td_api
Disallow: /_remote

Does Yahoo Finance ban web scraping or not?
What does the Yahoo Finance website disallow?
What can we infer from Yahoo's robots.txt file?

score 2 · Accepted Answer · answered Oct 25 '17 at 01:25

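The robots.txt file only disallows a handful of internal paths (/r/, /__blank, /_td_api, /_remote and one script file); nothing in it expressly prevents you from scraping Yahoo Finance. However, Yahoo Finance is governed by Yahoo's Terms of Service.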
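To check this reading mechanically, here is a minimal sketch that feeds the robots.txt text from the question into Python's standard urllib.robotparser. The helper names (build_parser, is_allowed) and the example URLs, including the /quote/AAPL path, are my own assumptions for illustration and do not come from Yahoo.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content exactly as quoted in the question.
ROBOTS_TXT = """\
User-agent: *
Sitemap: https://finance.yahoo.com/sitemap_en-us_desktop_index.xml
Sitemap: https://finance.yahoo.com/sitemaps/finance-sitemap_index_US_en-US.xml.gz
Disallow: /r/
Disallow: /__rapidworker-1.2.js
Disallow: /__blank
Disallow: /_td_api
Disallow: /_remote
"""


def build_parser(text):
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, url, agent="*"):
    """Return True if robots.txt lets `agent` fetch `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # Hypothetical example URLs, chosen only to exercise the rules.
    for url in (
        "https://finance.yahoo.com/quote/AAPL",
        "https://finance.yahoo.com/r/anything",
        "https://finance.yahoo.com/_td_api/resource",
    ):
        print(url, "->", "allowed" if is_allowed(parser, url) else "disallowed")
```

This only tells you what robots.txt says. It says nothing about the Terms of Service, which apply separately.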

The most pertinent part of that document says, in essence, that you must not do anything that would interfere with their services. In practice, this means that if you plan to scrape Yahoo Finance for data, you should do so responsibly: sending many thousands of requests will quickly get you banned.
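One way to keep request volume down is to enforce a minimum delay between requests. The sketch below is a small throttle built on the standard library. The Throttle class and the 2-second interval are assumptions for illustration. Yahoo does not publish a safe request rate, so pick a conservative value yourself.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests
        self._clock = clock               # injectable for testing
        self._sleep = sleep               # injectable for testing
        self._last = None                 # time of the previous call

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle(min_interval=2.0)  # assumed value, not a Yahoo limit
    for symbol in ("AAPL", "MSFT", "GOOG"):
        throttle.wait()
        # Place the actual HTTP request for `symbol` here.
        print("would fetch", symbol)
```

Call throttle.wait() immediately before every request, so the delay holds no matter how long the parsing in between takes.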

That said, web scraping is generally inefficient, since you are loading an entire HTML page just to extract a few values programmatically. I would look into using an API instead (like those discussed here), as this will be a) more reliable, b) faster, and c) definitely legal.

score 1 · Answer 2 · edited May 03 '21 at 14:32


They don't disallow it, but my scraper fetches hundreds of companies every 30 seconds, and ever since I started, their website has kept changing formats. I also noticed something new: they will in fact block your router's IP for a little while by replacing some of the values with N/A, which feeds your program bad data. So they don't state that they disallow it, but they definitely don't like you doing it. All I'm saying is: be sneaky.
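If values really are swapped for N/A while you are blocked, it helps to detect that before the bad data reaches the rest of your program. Here is a minimal sketch, assuming each scraped company is a dict of field name to string. The function names, the field names and the 0.5 threshold are hypothetical, and the N/A behavior is only what I observed, not something Yahoo documents.

```python
def na_fraction(record):
    """Return the share of fields whose value is the placeholder 'N/A'."""
    if not record:
        return 0.0
    na_count = sum(
        1 for value in record.values() if str(value).strip().upper() == "N/A"
    )
    return na_count / len(record)


def looks_degraded(record, threshold=0.5):
    """Flag a record when at least `threshold` of its fields are 'N/A'."""
    return na_fraction(record) >= threshold


if __name__ == "__main__":
    # Hypothetical scraped records, for illustration only.
    good = {"name": "Acme", "price": "101.5", "volume": "N/A", "pe": "12"}
    bad = {"name": "Acme", "price": "N/A", "volume": "N/A"}
    for record in (good, bad):
        if looks_degraded(record):
            print("suspicious record, pause and retry later:", record)
        else:
            print("record looks usable:", record)
```

When a record is flagged, the safe reaction is to stop and wait for a while rather than retry immediately.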

edited May 03 '21 at 14:32 by DisappointedByUnaccountableMod · answered Mar 07 '20 at 00:30 by Dylan Isaac


How many requests per minute or per hour is ok? – Serhii Kushchenko Jan 06 '21 at 19:04
