
I am trying to find an efficient way to list a million or more blobs in a container. All I need from the list is the name of each blob; the program then goes through the list and performs some actions on each blob. I think I have that part down.

My current problem is that because there are so many blobs in the container I am looking in, my program consumes an exorbitant amount of RAM and eventually the page file grows in size.

My code is similar to this answer: https://stackoverflow.com/a/49161513/6715951

var backupBlobClient = backupStorageAccount.CreateCloudBlobClient();
var backupContainer = backupBlobClient.GetContainerReference("CONTAINER");

var blobs = backupContainer.ListBlobs().OfType<CloudBlockBlob>().ToList();

I tested on a small scale with only a few blobs, but scaling up doesn't work well in my case. I am trying to figure out whether I can limit the number of blobs in each list. I don't necessarily need the whole list at once: I could build a partial list, have the program work on those blobs, then generate the next list starting where the previous one left off.

I'm not sure what the best approach would be, since the function I'm currently using to generate the list doesn't seem to have a way to limit the number of results.
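For reference, here is a minimal sketch of the "partial list, then continue where it left off" idea, assuming the segmented listing call in WindowsAzure.Storage (ListBlobsSegmentedAsync with a BlobContinuationToken) behaves as I understand it. The class name BlobNamePager, the method ForEachPageOfNamesAsync, and the processPage callback are hypothetical names made up for illustration; I have not run this against a container with millions of blobs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Blob;

// Hypothetical helper, not part of the storage library.
static class BlobNamePager
{
    // Walks the container one page at a time and hands each page of blob
    // names to processPage, so only one page is held in memory at once.
    public static async Task ForEachPageOfNamesAsync(
        CloudBlobContainer container,
        Func<IReadOnlyList<string>, Task> processPage,
        int pageSize = 5000) // the service caps a single listing call at 5000 results
    {
        BlobContinuationToken token = null; // null means "start from the beginning"
        do
        {
            BlobResultSegment segment = await container.ListBlobsSegmentedAsync(
                null,                     // prefix: no name filter
                true,                     // useFlatBlobListing: include blobs in virtual folders
                BlobListingDetails.None,  // skip metadata, snapshots, and other extras
                pageSize,                 // maxResults for this page
                token,                    // where the previous page stopped
                null,                     // options: library defaults
                null);                    // operationContext: none

            // Keep only the names; the blob objects for this page can then be collected.
            List<string> names = segment.Results
                .OfType<CloudBlockBlob>()
                .Select(blob => blob.Name)
                .ToList();

            await processPage(names);

            // A null token signals that the listing is complete.
            token = segment.ContinuationToken;
        } while (token != null);
    }
}

// Usage, with backupContainer from the snippet above:
//   await BlobNamePager.ForEachPageOfNamesAsync(backupContainer, async names =>
//   {
//       foreach (string name in names) { /* per-blob work goes here */ }
//       await Task.CompletedTask;
//   });
```

As far as I can tell, the listing response still carries basic properties for every blob even with BlobListingDetails.None, so this does not fetch names only. What it should change is that memory stays bounded by the page size rather than by the size of the container.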

I'm also using WindowsAzure.Storage v9.3.3. I'm not sure whether upgrading to a newer version of the Azure storage libraries would significantly improve performance, or whether that is something I should also do in this case.

Thanks in advance.

Rocketboy235
  • You should look at the answer provided by Jonathan Allen here: https://stackoverflow.com/a/49161513/6715951. – Gaurav Mantri Mar 07 '21 at 06:17
  • Thanks. Apparently I didn't look long or hard enough. I also realized someone asked a similar question (though with fewer blobs) here: https://stackoverflow.com/questions/42647954/getting-only-all-names-of-azure-blob-files-in-container My program did end up running, and it found millions upon millions of blobs, but at the expense of eating most of my RAM. The code I used apparently grabbed all of the information for each blob when I only care about the name, so that's one thing I definitely need to fix. – Rocketboy235 Mar 07 '21 at 16:11

0 Answers