
I want to open a big file from the web and start processing it before it has downloaded entirely. What I want is something like urllib2, but with a second thread that downloads the file in the background. It would have the same interface as a file: if I read more bytes than have been downloaded so far, the main thread blocks until they arrive; if the bytes are already there, read returns immediately. When everything has been downloaded, the downloader thread dies, and when everything has been read, read signals EOF.

Is there some builtin module that does this?
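There is no built-in class that does exactly this, but the described interface can be sketched with the standard `threading` module. Below is a minimal, illustrative wrapper (the class name, chunk size, and use of `io.BytesIO` as a stand-in for a `urlopen` response are my own choices, not a stdlib API): a background thread appends downloaded chunks to a buffer, and `read()` blocks on a condition variable only when the caller outruns the download.

```python
import threading


class BackgroundReader:
    """File-like object fed by a background download thread.

    `source` is any object with a blocking read(n) method, e.g. the
    response returned by urlopen(url). This is a sketch, not a
    production-ready implementation.
    """

    def __init__(self, source, chunk_size=8 * 1024):
        self._data = bytearray()   # bytes downloaded so far
        self._pos = 0              # how far the caller has read
        self._eof = False          # set when the download finishes
        self._cond = threading.Condition()
        t = threading.Thread(target=self._download,
                             args=(source, chunk_size))
        t.daemon = True
        t.start()

    def _download(self, source, chunk_size):
        while True:
            chunk = source.read(chunk_size)
            with self._cond:
                if not chunk:
                    self._eof = True          # downloader thread dies here
                    self._cond.notify_all()
                    return
                self._data.extend(chunk)
                self._cond.notify_all()       # wake any blocked read()

    def read(self, n=-1):
        with self._cond:
            if n < 0:
                # read() with no size: wait for the whole download.
                while not self._eof:
                    self._cond.wait()
                n = len(self._data) - self._pos
            else:
                # Block until n bytes are buffered or the download ends.
                while not self._eof and len(self._data) - self._pos < n:
                    self._cond.wait()
            chunk = bytes(self._data[self._pos:self._pos + n])
            self._pos += len(chunk)
            return chunk                      # b"" signals EOF
```

Usage would look like `reader = BackgroundReader(urllib2.urlopen(url))`, after which `reader.read(...)` behaves as described in the question. Note that this buffers the entire file in memory; a bounded buffer would need extra bookkeeping.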

mshell_lauren
koyaya
    Please construct the question correctly, with a better punctuation, sentence etc. This is unreadable ! – tito Dec 13 '11 at 19:01

1 Answer


I looked into this: http://mail.python.org/pipermail/python-bugs-list/2007-April/038250.html and this: https://stackoverflow.com/a/1517728/498782

and came up with a buffered reader for you:

import urllib2  # Python 2; on Python 3 use urllib.request.urlopen

url_resource = urllib2.urlopen(url)
CHUNK = 8 * 1024  # read 8 KiB at a time
while True:
    chunk_data = url_resource.read(CHUNK)
    if not chunk_data:  # an empty string signals EOF
        break
    process(chunk_data)

But keep in mind that the read() calls above are blocking. For asynchronous work you can look into this:

http://docs.python.org/library/asyncore.html
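Note that asyncore was deprecated and later removed from the standard library (in Python 3.12). A simpler way to get the same effect today, keeping the main thread free while the download proceeds, is to run the blocking loop in a thread pool via `concurrent.futures`. The sketch below is illustrative: `fetch` and its arguments are my own names, and `io.BytesIO` stands in for the response object returned by `urlopen`.

```python
import concurrent.futures
import io


def fetch(source, chunk_size=8 * 1024):
    """Drain a file-like source (e.g. a urlopen response) in chunks."""
    parts = []
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # EOF
            return b"".join(parts)
        parts.append(chunk)


with concurrent.futures.ThreadPoolExecutor() as pool:
    # Submitting returns immediately; the download runs in the background.
    future = pool.submit(fetch, io.BytesIO(b"payload" * 3))
    # ... the main thread can do other work here ...
    data = future.result()  # blocks only if the fetch is not finished yet
```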

Saurav