I am trying to scrape data from this link https://www.flatstats.co.uk/racing-system-builder.php using scrapy.
I want to automate the ajax call using scrapy.
When I click "Full SP" Button (inspect in Firebug) the post parameter has the sql string which is "strange"
race|2|eq|Ordinary|0|~tRIDER_TYPE
What dialect is this?
My code :
import scrapy
import urllib
class FlatStat(scrapy.Spider):
name= "flatstat"
allowed_domains = ["flatstats.co.uk"]
start_urls = ["https://www.flatstats.co.uk/racing-system-builder.php"]
def parse(self, response):
query_lst = response.xpath('//table[@id="system"]//tr/td[last()]/text()').extract()
query_str = ' '.join(query_lst)
url = 'https://www.flatstats.co.uk/ajax/sb_report.php'
body_dict = {'a_e_max': '9.99',
'a_e_min': '0',
'arch_min': '0',
'exp_min': '0',
'report_type':'S',
# copied from the Post parameters by inspecting. Actually I tried everything.
'sqlFullString' : u'''Type%20(Rider)%7C%3D%7COrdinary%20(Exclude%20Amatr%2C%20App%2C%20Lady%20Races
)%7CAND%7Crace%7C2%7C0%7COrdinary%7C0%7C~tRIDER_TYPE%7C-t%7Ceq''',
#I tried copying this from the post parameters as well but no success.
#I also tried sql from the table //td text() which is "normal" sql but no success
'sqlString': query_str}
#here i tried everything FormRequest as well though there is no form.
return scrapy.Request(url, method="POST", body=urllib.urlencode(body_dict), callback=self.parse_page)
def parse_page(self, response):
with open("response.html", "w") as f:
f.write(response.body)
So questions are:
- What is this sql.
- Why isn't it returning me the required page. How can I run the right query?
- I tried Selenium as well to click the button and let it do the stuff it self but that is another unsuccessful story. :(