1

I am trying to write a Python 2.7 script that retrieves betting odds from different betting websites (such as Betfair, Ladbrokes, etc.) for statistical analysis. I am fairly new to Python (I struggle with all the IT jargon), but I have done some research and come up with the following structure.

from urllib import urlopen
import re

response = urlopen('http://beta.betfair.com/football/event?id=26821411')
html = response.read()

jay = re.compile(b'.*id="m57290-sel1_105142518-58805-0-back"><span class="price">(.*)</span></button>')

jay2 = re.findall(jay,html)

print(jay2)

This was supposed to go to the Betfair website, pull certain odds and print them, but I get nothing!

I have also tried to use BeautifulSoup, but it does not seem to install properly on my Mac. I keep getting

"ImportError: No module named beautifulsoup"

when I try to import BeautifulSoup from BeautifulSoup. I have tried installing it with easy_install, and I have also run the setup.py script.
I have the same problem with Scrapy.
I have done some further research, and Java/JavaScript comes up quite frequently...
Can someone please help?

Thanks in advance.

LU RD
Pyfon
  • Your inclination is correct. You need something more suited to the task than regexp. BeautifulSoup is -- or at least has been in the past -- solely Python code. That means that, to install it, you put the file anywhere on your Python path. – mechanical_meat Mar 19 '12 at 18:12
  • Did you make any attempt to figure out where things are going wrong? – Karl Knechtel Mar 19 '12 at 18:26
  • @bernie thanks. This is probably a really silly question, but given that I have literally just read sections from one or two Python programming books, a lot of this is new to me. How do you manually "put the file anywhere on your Python path"? @Karl Knechtel thanks for taking the time to help out. It is most probably to do with this line: "jay = re.compile(b'.*id="m57290-sel1_105142518-58805-0-back">(.*)')", because I am able to scrape other sections of the same website with the same code (i.e. titles, headlines, etc.). Skizz's answer at the bottom seems to support this... – Pyfon Mar 22 '12 at 03:26

3 Answers

1

I've found that when I have multiple versions of Python on my Mac, it is tricky to target which version I want the module to be installed under. I get around it by using virtualenv, and then installing exactly the modules I need one-by-one using pip. Here's an introduction to virtualenv: http://simononsoftware.com/virtualenv-tutorial/

Basically, once you have virtualenv installed, you can create a stand-alone python environment that is isolated from everything else. The process goes like this in a terminal window:

Create a virtual Python environment

$ virtualenv --python=python2.7 env

Activate it (so it's now the default "python" in your PATH)

$ source env/bin/activate

Install something (note that you don't need "sudo" for this, because this is a local python installation in whatever directory you're working in)

$ pip install scrapy

Once you're done with your virtual Python environment for the time being, deactivate like so:

$ deactivate
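
Once the environment is active, a quick way to confirm which interpreter you are actually running, and whether it can see a given module, is a short check like the one below. This is only a sketch: module_available is a helper name I made up for illustration, and "bs4" and "scrapy" are just the import names I would expect for BeautifulSoup 4 and Scrapy.

```python
import importlib
import sys


def module_available(name):
    """Return True if the running interpreter can import `name`."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


# The path printed here should point inside env/ while the virtualenv is active.
print(sys.executable)

# "bs4" and "scrapy" are assumed import names; adjust to whatever you installed.
for name in ("bs4", "scrapy"):
    print("%s importable: %s" % (name, module_available(name)))
```

If the printed path is not inside your env directory, the modules are probably being installed under a different Python than the one you are running.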
Brendan Wood
  • Thanks. I've managed to install virtualenv, but when I try to create a virtual Python environment I get "-bash: $: command not found"... – Pyfon Mar 22 '12 at 03:10
  • The "$" indicates the terminal prompt, not something that you should type. So, for the first command, enter exactly the following without the quotes and then press return/enter: "virtualenv --python=python2.7 env" – Brendan Wood Mar 22 '12 at 12:39
  • I actually tried both, with and without the "$". Without it I end up with "...File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1201, in _execute_child raise child_exception OSError: [Errno 2] No such file or directory", which doesn't look right to me... – Pyfon Mar 22 '12 at 15:36
  • Yeah, that doesn't look right to me either. Do you have Xcode installed? This topic might help: http://stackoverflow.com/questions/5658504/when-running-virtualenv-1-6-on-mac-os-x-10-6-7-python-2-7-1 – Brendan Wood Mar 22 '12 at 18:55
0

Most betting websites (especially the good ones) offer decent XML services. I suggest parsing the betting odds XML instead of scraping the website. This tutorial is a useful introduction to XML parsing for beginners: http://docs.python.org/2/library/xml.etree.elementtree.html
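
As a rough illustration of what parsing such a feed with ElementTree could look like, here is a minimal sketch. The XML layout (an event element with selection children carrying name and price attributes) is entirely made up for the example, as are SAMPLE_XML and parse_odds; a real bookmaker's feed will have its own structure, so check its documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content; real feeds will use different tags and attributes.
SAMPLE_XML = """
<event name="Team A v Team B">
  <selection name="Team A" price="2.5"/>
  <selection name="Draw" price="3.2"/>
  <selection name="Team B" price="2.9"/>
</event>
"""


def parse_odds(xml_text):
    """Map each selection name to its price as a float."""
    root = ET.fromstring(xml_text.strip())
    odds = {}
    for selection in root.findall("selection"):
        odds[selection.get("name")] = float(selection.get("price"))
    return odds


odds = parse_odds(SAMPLE_XML)
print(odds)
```

In practice you would download the XML with urlopen first and pass the response body to parse_odds instead of the sample string.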

Alkindus
0

The "back-cell" id changes each time the page is called, so your existing regex is always going to fail no matter what framework you use.
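
One possible way round this is to match only the part of the markup that stays the same. The sketch below assumes that the id always ends in "-back" and that the price sits in a span with class "price", as in the line from the question; I have not verified that against the live page, and SAMPLE_HTML, PRICE_RE and back_prices are illustrative names only.

```python
import re

# Hypothetical markup modelled on the pattern in the question; the ids differ
# between the two buttons to mimic an id that changes on every request.
SAMPLE_HTML = (
    '<button id="m57290-sel1_105142518-58805-0-back">'
    '<span class="price">2.5</span></button>'
    '<button id="m99999-sel2_1234-5678-0-back">'
    '<span class="price">3.4</span></button>'
)

# Accept any id that ends in "-back", then capture the text up to the next tag.
PRICE_RE = re.compile(r'id="[^"]*-back"><span class="price">([^<]*)</span>')


def back_prices(html):
    """Return every back price found in `html`, as strings."""
    return PRICE_RE.findall(html)


prices = back_prices(SAMPLE_HTML)
print(prices)
```

Note that if the prices are filled in by JavaScript after the page loads, they will not be in the HTML that urlopen downloads, and no pattern will find them.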

Skizz
  • Thanks for the input. I'm assuming you're referring to this part: "id="m57290-sel1_105142518-58805-0-back". Any ideas as to how to get round the problem? – Pyfon Mar 22 '12 at 03:11