I'm trying to build a short Python program that extracts PewDiePie's subscriber count, which is updated every second on Social Blade, and shows it in the terminal. I want to refresh this number roughly every 30 seconds.
I've tried PyQt, but it's slow. I then turned to dryscrape, which is slightly faster but still doesn't behave the way I want. I've just found Invader and written some short code that has the same problem: the number returned is the one from before the JavaScript on the page runs:
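from invader import Invader
# Real-time counter page for the channel
url = 'https://socialblade.com/youtube/user/pewdiepie/realtime'
# js=True is meant to enable JavaScript rendering
invader = Invader(url, js=True)
# Grab the text of the element with id "rawCount"
subscribers = invader.take(['#rawCount', 'text'])
print(subscribers.text)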
I know that this data is accessible via the site's API, but that doesn't always work; sometimes it just redirects to this.
Is there a way to get this number after the JavaScript on the page has modified the counter, and not before? And which approach seems best to you? Extracting it:
- from the original page, which keeps returning the same number for hours?
- from the API page, which fails when the code doesn't send cookies, and again after a certain amount of time?
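For reference, the 30-second refresh itself is the easy part. Here is a minimal sketch of the loop I have in mind, independent of which source ends up being used. The names `poll`, `parse_count` and `fetch_count` are my own placeholders, not part of Invader or any Social Blade API; `fetch_count` stands for whatever function eventually returns the counter.

```python
import time
from typing import Callable, Iterator, Optional


def parse_count(text: str) -> Optional[int]:
    """Turn counter text such as '61,234,567' into an int, or None if it is not a number."""
    digits = text.replace(",", "").replace(" ", "").strip()
    return int(digits) if digits.isdigit() else None


def poll(fetch_count: Callable[[], Optional[int]],
         interval: float = 30.0,
         max_polls: Optional[int] = None,
         sleep: Callable[[float], None] = time.sleep) -> Iterator[int]:
    """Yield a count after each successful fetch, waiting `interval` seconds between fetches.

    fetch_count is a hypothetical callable supplied by the caller; it may
    return None or raise (for example on an unexpected redirect), in which
    case that poll is skipped.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        try:
            count = fetch_count()
        except Exception as exc:  # network error, redirect, parse failure, ...
            print(f"fetch failed: {exc}")
            count = None
        if count is not None:
            yield count
        polls += 1
        # Do not sleep after the final poll.
        if max_polls is None or polls < max_polls:
            sleep(interval)


if __name__ == "__main__":
    # Stand-in data instead of a real page, so the sketch runs offline.
    fake_texts = iter(["61,234,567", "", "61,234,570"])
    for subscribers in poll(lambda: parse_count(next(fake_texts)),
                            interval=0.0, max_polls=3):
        print(subscribers)
```

What I'm missing is a `fetch_count` that returns the value after the page's JavaScript has updated it.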
Thanks for your advice!