A portion of my code that parses a web site has stopped working. I have traced the problem to the .read() method of the response object returned by urllib2.urlopen:
page = urllib2.urlopen('http://magiccards.info/us/en.html')
data = page.read()
Until yesterday this worked fine, but now the length of the data is always 69496 instead of 122989. Smaller pages still download completely.
I have tested this on Ubuntu, Linux Mint and Windows 7; all show the same behaviour.
I assume something has changed on the web server, but the page loads completely in a web browser. I have also captured the traffic with Wireshark, and the page arrives complete on the wire.
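One thing I plan to try (a sketch, not my actual parsing code): instead of trusting a single .read() call, accumulate the response in a loop until read() returns an empty string, in case the connection is being cut short after a partial read. The read_all helper and the BytesIO stand-in below are my own names, used here only so the loop can be demonstrated without the network; in the real code the object returned by urllib2.urlopen would be passed in instead.

```python
import io

def read_all(fileobj, chunk_size=8192):
    # Read fixed-size chunks until read() returns an empty result,
    # rather than assuming one read() call returns the whole body.
    chunks = []
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b''.join(chunks)

# Stand-in for the urlopen response object, just to exercise the loop:
fake_page = io.BytesIO(b'x' * 122989)
data = read_all(fake_page)
assert len(data) == 122989
```

If the loop still stops at 69496 bytes, that would at least confirm the truncation happens at the socket level rather than in a single short read.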
Does anybody know why this might be happening, or what I could try to determine the cause?