
I've recently discovered web scraping. I'm pretty new to it, but it amazes me.

After reading up on it, I decided to go ahead and create my own project. I wanted something simple, so I tried writing a Python script that tells you whether a given Instagram account has posted a story today.

Looking into the HTML code, I found that all accounts with active stories share the same attribute: aria-disabled="false".

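So, all I had to do was use bs4 to check for that attribute. I wrote the following code:

import bs4
import requests

# Fetch the public profile page and fail loudly on any HTTP error status
res = requests.get('https://www.instagram.com/cristiano/')
res.raise_for_status()

# Parse the returned HTML and select every div carrying the attribute
soup = bs4.BeautifulSoup(res.text, 'html.parser')
aux = soup.select('div[aria-disabled="false"]')

print(aux)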

That should do the trick. However, raise_for_status() raises the following error:

raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: - for url: https://www.instagram.com/accounts/login/

Does anyone know what I am doing wrong? Thanks in advance :)
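
The URL in the error is the login page, not the profile I asked for, so the request seems to have been redirected before the 429 was returned. Here is a minimal sketch for inspecting that, assuming a browser-like User-Agent header makes a difference (the header string, `HEADERS`, `redirected_to_login` and `fetch_profile` are illustrative names of my own, and Instagram may still demand a login or render the story markup with JavaScript, in which case it would not be in `res.text` at all):

```python
from urllib.parse import urlparse

# Assumption: an illustrative browser-like User-Agent; the exact string is not special.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/91.0 Safari/537.36"
    )
}


def redirected_to_login(requested_url, final_url):
    """Return True if the final URL is a login page instead of the page requested."""
    final_path = urlparse(final_url).path
    return final_url != requested_url and final_path.startswith("/accounts/login")


def fetch_profile(url):
    """Fetch url with the headers above and print the redirect chain."""
    # Imported here so the helper above can be used without requests installed.
    import requests

    res = requests.get(url, headers=HEADERS, timeout=10)

    # res.history holds the intermediate redirect responses, oldest first.
    for hop in res.history:
        print("redirect:", hop.status_code, hop.url)
    print("final:", res.status_code, res.url)

    if redirected_to_login(url, res.url):
        print("Ended up on the login page, so the HTML is not the profile.")
    return res
```

Calling `fetch_profile('https://www.instagram.com/cristiano/')` before `raise_for_status()` would at least show whether the 429 comes from the profile URL itself or from the login redirect.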

  • Does this answer your question? [How to avoid HTTP error 429 (Too Many Requests) python](https://stackoverflow.com/questions/22786068/how-to-avoid-http-error-429-too-many-requests-python) – dm2 Jun 30 '21 at 22:21
  • Not at all, or at least I don't fully understand it. They say it's caused by too many requests to the site, and I haven't made more than 10 since I wrote the code (I don't even have the while loop yet). Thanks anyway :) – thegreenhoodie Jun 30 '21 at 22:25
  • Read the comments in the linked duplicate. Especially those regarding the User-Agent header. –  Jun 30 '21 at 22:37
