I'm trying to log in to SureTrader ActiveWeb (a broker website for stock trading) and then fetch real-time stock data from the site. I have tried urllib, urllib2, mechanize, BeautifulSoup and requests, but I can't find a way to do it. The website I want to log in to is https://activeweb.suretrader.com/, which then redirects to the members website (I do have an account). I am a beginner and have followed tutorials, but I've had no luck. Here are a few of the things I've tried:

https://www.youtube.com/watch?v=Igvf5C7qwO0 How can I input data into a webpage to scrape the resulting output using Python?

and others, but I can't post any more links :P.

I have a few Python projects, but none of them work. Importantly, the tutorials work with other websites but not with the one I want; maybe that's because it uses HTTPS?

I am new to the forum; any help or recommendations are welcome.

EDIT

I guessed the website had anti-scraping measures; I just wanted to make sure. Also, I have an account, and this is just for a little summer project I am working on; my intentions are not unethical.

Cezzxar

2 Answers

requests supports Sessions, which means you can send a POST to the login endpoint (you can find it in the Network tab of your browser's developer tools) and keep the cookie the server sends back for later requests. More information on Sessions: http://docs.python-requests.org/en/latest/user/advanced/#session-objects
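As a rough sketch of that idea, the snippet below logs in with a `requests.Session` and reuses it for a second request. The URLs and the form field names (`username`, `password`) are placeholders, not the site's real ones; the actual values have to be copied from the login POST shown in the Network tab.

```python
import requests

# Hypothetical values: read the real login URL and form field names from the
# POST request your browser sends when you log in manually.
LOGIN_URL = "https://example.com/login"
DATA_URL = "https://example.com/members/quotes"


def log_in(session, username, password):
    """POST the credentials; the session keeps any cookies the server sets."""
    payload = {"username": username, "password": password}  # assumed field names
    response = session.post(LOGIN_URL, data=payload, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response


def fetch_members_page(username, password):
    """Log in, then reuse the same session (and its cookies) for a later GET."""
    with requests.Session() as session:
        log_in(session, username, password)
        page = session.get(DATA_URL, timeout=10)
        page.raise_for_status()
        return page.text
```

Calling `fetch_members_page("me", "secret")` would return the HTML of the second page as the server sends it, provided the login really is a plain form POST; sites that add hidden tokens or log in through JavaScript need more than this.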

Since this is a stock data website, they may have some anti-scraping measures in place. You may need to change your request rate and user agent, for example. In that case your job is much harder and you are getting dangerously close to unethical behavior.
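To make "request rate and user agent" concrete, here is a minimal sketch of a wrapper that sets a fixed User-Agent header and enforces a pause between requests. `PoliteClient`, the two-second default and the header value are made up for illustration; `session` is assumed to be a `requests.Session` (anything with a `headers` mapping and a `get` method would do).

```python
import time


class PoliteClient:
    """Wrap a session so requests carry one User-Agent and are spaced out."""

    def __init__(self, session, user_agent, min_interval=2.0):
        self.session = session
        # Session-level headers are sent with every request made through it.
        self.session.headers.update({"User-Agent": user_agent})
        self.min_interval = min_interval  # seconds between requests (assumed value)
        self._last_request = None

    def get(self, url, **kwargs):
        # Sleep just long enough to keep min_interval between two requests.
        if self._last_request is not None:
            wait = self.min_interval - (time.monotonic() - self._last_request)
            if wait > 0:
                time.sleep(wait)
        self._last_request = time.monotonic()
        return self.session.get(url, **kwargs)
```

With requests installed, usage would look like `client = PoliteClient(requests.Session(), "my-summer-project/0.1")` followed by `client.get(url)`. A slow, clearly identified client is the conservative reading of this advice.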

If the problem is somewhere else in the request (for example, the scraper always gets a 400 response), I suggest you post more detail about it, along with a sample of the code you used.

BoppreH

Wow... someone needs to contact them and explain what 'usemin', 'uglify', 'concat', 'require' and other modern tools are.

The likelihood that you will scrape anything from that site is minimal. From what I can tell at first glance, the DOM is being heavily manipulated in JavaScript. Since BeautifulSoup et al. are not JavaScript interpreters, you will only get the underlying HTML, which is likely just structure with no content. That also explains why your code works on other sites.
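A small illustration of that point, using the standard library's `html.parser` in place of BeautifulSoup so it runs without extra installs. The page below is invented: the server sends an empty `<div id="quotes">` and a script fills it in later, which is roughly what a JavaScript-heavy site does.

```python
from html.parser import HTMLParser

# Made-up page: the container is empty in the HTML the server sends;
# the quote only appears once a browser runs the script.
STATIC_HTML = """
<html><body>
  <div id="quotes"></div>
  <script>
    document.getElementById("quotes").textContent = "ACME 12.34";
  </script>
</body></html>
"""


class DivTextParser(HTMLParser):
    """Collect text inside the <div> with a given id, without running scripts."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # nesting level of <div> tags inside the target
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.text.append(data)


parser = DivTextParser("quotes")
parser.feed(STATIC_HTML)
quotes_text = "".join(parser.text).strip()
print(repr(quotes_text))  # prints '' because the script never ran
```

The string "ACME 12.34" is present in the download, but only as script source, not as content of the element a scraper would look in. On a real site the values presumably arrive through a later request made by the script, which the browser's Network tab would show.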

Sina Khelil