1

Is there a standard way to lazily fetch the next chunk of data and yield it element by element?

Currently I'm fetching all of the chunks up front and chaining them with itertools:

import itertools

def list_blobs(container_name: str, prefix: str):
    chunks = []
    next_marker=None
    while True:
        blobs = blob_service.list_blobs(container_name, prefix=prefix, num_results=100, marker=next_marker)
        next_marker = blobs.next_marker
        chunks.append(blobs)
        if not next_marker:
            break

    return itertools.chain.from_iterable(chunks)

What would a "lazy" version of this list_blobs fetcher look like?

Mazdak
Andrew Matiuk
  • This question would be much easier to answer with a [MCVE]. We can't run the code you posted or verify our solutions after modifying it. Since the solution should presumably work for any iterable, why not post a sample that runs without further context such as global variables? – timgeb Jan 02 '19 at 11:57

3 Answers

1

You can just use yield from:

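def list_blobs(container_name: str, prefix: str):
    next_marker = None  # the first request must not pass a marker
    while True:
        blobs = blob_service.list_blobs(container_name, prefix=prefix, num_results=100, marker=next_marker)
        yield from blobs  # yield this chunk's items one at a time
        next_marker = blobs.next_marker
        if not next_marker:  # no marker means this was the last chunk
            break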
Mazdak
1

Replace chunks.append(blobs) with yield from blobs, and get rid of the return and chunks list entirely:

def generate_blobs(container_name:str, prefix:str):
    next_marker = None
    while True:
        blobs = blob_service.list_blobs(container_name, prefix=prefix, num_results=100, marker=next_marker)
        next_marker = blobs.next_marker
        yield from blobs
        if not next_marker:
            break

That converts the function into a generator function that yields a single item at a time and requests the next chunk only after the current one has been consumed.
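To check the lazy behavior without an Azure account, here is a minimal sketch that swaps blob_service for an in-memory stand-in. FakePage and FakeBlobService are hypothetical names invented for this illustration and are not part of the Azure SDK; the generator also takes the service and a page_size as parameters, which the original does not. The stand-in records every marker it is asked for, so the number of requests can be inspected.

```python
from typing import Iterator, List, Optional


class FakePage(list):
    """Hypothetical stand-in for one page of results: a list with a next_marker."""

    def __init__(self, items, next_marker: Optional[str]):
        super().__init__(items)
        self.next_marker = next_marker


class FakeBlobService:
    """Hypothetical stand-in for blob_service; serves names in pages and logs calls."""

    def __init__(self, names: List[str]):
        self.names = names
        self.calls: List[Optional[str]] = []  # markers received, in call order

    def list_blobs(self, container_name, prefix="", num_results=100, marker=None):
        self.calls.append(marker)
        matching = [n for n in self.names if n.startswith(prefix)]
        start = int(marker) if marker else 0
        end = start + num_results
        next_marker = str(end) if end < len(matching) else None
        return FakePage(matching[start:end], next_marker)


def generate_blobs(service, container_name: str, prefix: str,
                   page_size: int = 100) -> Iterator[str]:
    next_marker = None
    while True:
        blobs = service.list_blobs(container_name, prefix=prefix,
                                   num_results=page_size, marker=next_marker)
        next_marker = blobs.next_marker
        yield from blobs  # hand out this page's items one by one
        if not next_marker:
            break


service = FakeBlobService([f"AA{i}" for i in range(5)])
gen = generate_blobs(service, "container", "AA", page_size=2)

first = next(gen)                        # triggers only the first page request
calls_after_first = list(service.calls)  # snapshot of requests made so far
rest = list(gen)                         # draining fetches the remaining pages
```

Creating the generator makes no request at all; the first call to next() issues exactly one, and further pages are requested only as iteration reaches them.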

ShadowRanger
0

@ShadowRanger, @Kasrâmvd thank you very much

@timgeb, here is the full code for a lazy iterable over Azure Blob Storage:

from azure.storage.blob import BlockBlobService
from azure.storage.blob import Blob
from typing import Iterable, Tuple

def blob_iterator(account:str, account_key:str, bucket:str, prefix:str)-> Iterable[Tuple[str, str]]:
    blob_service = BlockBlobService(account_name=account, account_key=account_key) 

    def list_blobs(bucket: str, prefix: str) -> Iterable[Blob]:
        next_marker = None
        while True:
            blobs = blob_service.list_blobs(bucket, prefix=prefix, num_results=100, marker=next_marker)
            yield from blobs
            next_marker = blobs.next_marker
            if not next_marker:
                break

    def get_text(bucket:str, name:str)->str:
        return blob_service.get_blob_to_text(bucket, name).content

    return ( (blob.name, get_text(bucket, blob.name)) for blob in list_blobs(bucket, prefix) )


it = blob_iterator('account', 'account_key', 'container_name', prefix='AA')
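Since blob_iterator returns a generator expression, nothing should be listed or downloaded until the result is iterated. Here is a minimal sketch of consuming only the first few entries with itertools.islice. It uses a hypothetical stand-in, fake_blob_iterator (invented here so the example runs without Azure credentials), whose get_text records each "download" instead of calling the network:

```python
import itertools
from typing import Iterable, List, Tuple

downloaded: List[str] = []  # records which blobs were "downloaded"


def fake_blob_iterator(names: Iterable[str]) -> Iterable[Tuple[str, str]]:
    """Hypothetical stand-in for blob_iterator: yields (name, text) pairs lazily."""

    def get_text(name: str) -> str:
        downloaded.append(name)  # side effect stands in for the network call
        return f"contents of {name}"

    return ((name, get_text(name)) for name in names)


it = fake_blob_iterator(f"AA{i}" for i in range(1000))
first_three = list(itertools.islice(it, 3))  # only three blobs are ever fetched
```

With the real blob_iterator the same pattern applies, although each listing request would still fetch a full page of up to 100 blob names; only the text downloads happen strictly one at a time.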
Andrew Matiuk