
I'm querying Azure for "blobs" using FindBlobsByTagsAsync(), which returns something called Azure.AsyncPageable<Azure.Storage.Blobs.Models.TaggedBlobItem>

I need to run two separate queries (as there is no OR operator for this in the Azure blob query syntax), combine the results and return distinct results.

The code below works, but I'd like to run the two queries in parallel and then combine the results once both are complete. I'm struggling with this because FindBlobsByTagsAsync() doesn't return a Task.

These queries typically return very quickly, but let's pretend they don't.

Thanks for any help!

Non-parallel code:

List<String> myBlobs = new List<String>();

await foreach (var taggedBlobItem in blobServiceClient.FindBlobsByTagsAsync(query1))
{ myBlobs.Add(taggedBlobItem.BlobName); }

await foreach (var taggedBlobItem in blobServiceClient.FindBlobsByTagsAsync(query2))
{ myBlobs.Add(taggedBlobItem.BlobName); }

return Ok(myBlobs.Distinct());
  • Are you looking to [merge two IAsyncEnumerables](https://stackoverflow.com/questions/66152698/how-to-query-two-iasyncenumerables-asynchronously) ? – YK1 Oct 06 '22 at 10:49

1 Answer

One way is to wrap each of the two blob queries in its own Task. You can then await both tasks together, so the two queries run concurrently:

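// Requires: using System.Collections.Concurrent; using System.Linq;
//           using System.Threading.Tasks; using Azure.Storage.Blobs;
var blobServiceClient = new BlobServiceClient("abc");
var myBlobs = new ConcurrentBag<string>();

// Task.Run unwraps the async lambda, so each task completes only when its loop has finished.
// Task.Factory.StartNew would return a Task<Task> whose outer task completes at the first await,
// and Task.WhenAll would then return before the results are in.
var tagQuery1 = "some tag query 1";
var query1Task = Task.Run(async () =>
{
    await foreach (var taggedBlobItem in blobServiceClient.FindBlobsByTagsAsync(tagQuery1))
        myBlobs.Add(taggedBlobItem.BlobName);
});

var tagQuery2 = "some tag query 2";
var query2Task = Task.Run(async () =>
{
    await foreach (var taggedBlobItem in blobServiceClient.FindBlobsByTagsAsync(tagQuery2))
        myBlobs.Add(taggedBlobItem.BlobName);
});

// Wait for both queries to finish before reading the combined results.
await Task.WhenAll(query1Task, query2Task).ConfigureAwait(false);

var distinctBlobNames = myBlobs.Distinct();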

Side note: a thread-safe collection is needed because the two tasks may add items at the same time from different threads. ConcurrentBag<T> satisfies this.
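An alternative that avoids the shared collection is to give each query its own list and merge the lists after both have finished. Below is a minimal sketch of that idea, not a definitive implementation. The names BlobQueryHelpers, CollectAsync, CollectDistinctAsync and FakeQuery are hypothetical helpers made up for this example. It relies on AsyncPageable<T> implementing IAsyncEnumerable<T>, and FakeQuery only stands in for FindBlobsByTagsAsync so the sketch runs without an Azure account.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BlobQueryHelpers
{
    // Hypothetical helper: drains an async sequence into its own private list,
    // so no thread-safe collection is required.
    public static async Task<List<string>> CollectAsync<T>(
        IAsyncEnumerable<T> source, Func<T, string> selector)
    {
        var items = new List<string>();
        await foreach (var item in source.ConfigureAwait(false))
            items.Add(selector(item));
        return items;
    }

    // Hypothetical helper: starts both enumerations before awaiting either one,
    // then merges the two lists and removes duplicates.
    public static async Task<List<string>> CollectDistinctAsync<T>(
        IAsyncEnumerable<T> first, IAsyncEnumerable<T> second, Func<T, string> selector)
    {
        var firstTask = CollectAsync(first, selector);
        var secondTask = CollectAsync(second, selector);

        var results = await Task.WhenAll(firstTask, secondTask).ConfigureAwait(false);
        return results.SelectMany(list => list).Distinct().ToList();
    }

    // Assumption: a fake async source standing in for FindBlobsByTagsAsync.
    public static async IAsyncEnumerable<string> FakeQuery(params string[] names)
    {
        foreach (var name in names)
        {
            await Task.Delay(10).ConfigureAwait(false);
            yield return name;
        }
    }
}
```

With the real client, the call would presumably look like BlobQueryHelpers.CollectDistinctAsync(blobServiceClient.FindBlobsByTagsAsync(query1), blobServiceClient.FindBlobsByTagsAsync(query2), b => b.BlobName). Because the work is I/O-bound, both requests should be in flight at the same time without Task.Run, since each async method returns to the caller at its first incomplete await.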

ajawad987