25

Say I generate a couple of JSON files each day in my blob storage. I want to get the most recently modified file across all of my directories. So I'd have something like this in my blob container:

2016/01/02/test.json
2016/01/02/test2.json
2016/02/03/test.json

I want to get 2016/02/03/test.json. One way is to take the full path of each file and use a regex to find the most recently created directory, but this doesn't work if I have more than one JSON file in a directory. Is there anything like File.GetLastWriteTime to get the latest modified file? This is the code I am using to get all the files, by the way:

public static CloudBlobContainer GetBlobContainer(string accountName, string accountKey, string containerName)
{
    CloudStorageAccount storageAccount = new CloudStorageAccount(new StorageCredentials(accountName, accountKey), true);
    // blob client
    CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
    // container
    CloudBlobContainer blobContainer = blobClient.GetContainerReference(containerName);
    return blobContainer;
}

public static IEnumerable<IListBlobItem> GetBlobItems(CloudBlobContainer container)
{
    IEnumerable<IListBlobItem> items = container.ListBlobs(useFlatBlobListing: true);
    return items;
}

public static List<string> GetAllBlobFiles(IEnumerable<IListBlobItem> blobs)
{
    var listOfFileNames = new List<string>();

    foreach (var blob in blobs)
    {
        var blobFileName = blob.Uri.Segments.Last();
        listOfFileNames.Add(blobFileName);
    }
    return listOfFileNames;
}
juvchan
Yar
  • In the end, how did you achieve this scenario for multiple folder paths with the last-modified property? – Neo Jan 29 '18 at 17:16
  • All the current answers are out of date for the new V12 azure blob nuget package – rollsch May 26 '21 at 01:29

8 Answers

33

Each IListBlobItem is going to be a CloudBlockBlob, a CloudPageBlob, or a CloudBlobDirectory.

After casting to block or page blob, or their shared base class CloudBlob (preferably by using the as keyword and checking for null), you can access the modified date via blockBlob.Properties.LastModified.

Note that your implementation will do an O(n) scan over all blobs in the container, which can take a while if there are hundreds of thousands of files. There is currently no way to run a more efficient query against blob storage, though (unless you abuse the file naming and encode the date so that newer dates come first alphabetically). Realistically, if you need better query performance, I'd recommend keeping a database table that has one row per blob, with an indexed DateModified column to search by and a column holding the blob path for easy access to the file.
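As a rough sketch of the naming trick mentioned above, assuming you control blob names at upload time and that the listing returns names in lexicographic order: prefix each name with an inverted, zero-padded tick count so that newer blobs sort first. The class and method names here are hypothetical, and the encoded timestamp is whatever you pass in (typically the upload time), so a later overwrite of the same blob would not be reflected.

```csharp
using System;
using System.Globalization;

public static class NewestFirstNaming
{
    // Hypothetical helper: builds a blob name whose ordinal sort order is newest-first.
    public static string ToNewestFirstName(DateTimeOffset timestamp, string fileName)
    {
        // Newer timestamps give smaller inverted values.
        long inverted = DateTimeOffset.MaxValue.UtcTicks - timestamp.UtcTicks;

        // Zero-pad to 19 digits so that string order matches numeric order.
        return inverted.ToString("D19", CultureInfo.InvariantCulture) + "_" + fileName;
    }
}
```

With names built this way, the first blob returned by a listing would be the newest one, so a listing limited to a single result could replace the full scan.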

UPDATE (2022) It appears that Microsoft now offers customizable Blob Index Tags. This should allow for adding a custom DateModified property or similar on blob metadata, and performing efficient "greater than" / "less than" queries against your blobs without the need for a separate database. (NOTE: It apparently only supports string values, so for date values you would need to make sure to save them as a lexicographically-sortable format like "yyyy-MM-dd".)
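To illustrate why the date format matters, here is a small sketch that assumes tag values are compared as plain strings, as described above. The helper names are hypothetical and nothing here calls the storage service; it only shows that "yyyy-MM-dd" strings order the same way as the dates they represent, while a format such as "M/d/yyyy" does not.

```csharp
using System;
using System.Globalization;

public static class SortableTagDates
{
    // Formats a timestamp (converted to UTC) as a lexicographically sortable string.
    public static string ToTagValue(DateTimeOffset timestamp) =>
        timestamp.UtcDateTime.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);

    // Mimics a string-based "greater than or equal" comparison between two tag values.
    public static bool IsOnOrAfter(string tagValue, string threshold) =>
        string.CompareOrdinal(tagValue, threshold) >= 0;
}
```

For example, IsOnOrAfter("2016-10-01", "2016-09-01") is true, while the same comparison on "10/1/2016" and "9/1/2016" is false because the strings are compared character by character.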

Mike Asdf
  • What about directories? How to access their last modified time? – Shmil The Cat Feb 26 '18 at 17:54
  • Directories? Do you mean containers? Or do you mean the artificial path delimiter construct that blob names can have? – Mike Asdf Feb 26 '18 at 18:06
  • The latter one (the artificial path delimiter construct that blob names can have) – Shmil The Cat Feb 26 '18 at 20:44
  • So a "directory" is really a collection of blobs that share a certain string prefix, so you'd have to enumerate those blobs and aggregate the blob timestamps (by min, max, or whatever makes sense for your situation). Note that filtering blobs by prefix *is* supported by the API. – Mike Asdf Feb 26 '18 at 22:44
  • All the other answers don't mention casting - this was immensely helpful. – VSO Jan 24 '19 at 14:31
  • This is no longer up to date for azure V12 nuget package – rollsch May 26 '21 at 01:28
17

Like Yar said, you can use the LastModified property of an individual blob object. Here is a code snippet that shows how to do that, once you have a reference to the correct container:

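// A flat listing descends into virtual directories such as 2016/02/03/.
var latestBlob = container.ListBlobs(useFlatBlobListing: true)
    .OfType<CloudBlockBlob>()
    .OrderByDescending(m => m.Properties.LastModified)
    .FirstOrDefault(); // null when the container holds no block blobs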

Note: The blob type may not be <CloudBlockBlob>. Be sure to change that if necessary.

hbd
3
// Connection string
string storageAccount_connectionString = "**NOTE: CONNECTION STRING**";

// Retrieve storage account from connection string.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(storageAccount_connectionString);

// Create the blob client.
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

// Retrieve reference to a previously created container.
CloudBlobContainer container = blobClient.GetContainerReference("**NOTE:NAME OF CONTAINER**");

try
{
    // Root directory
    CloudBlobDirectory dira = container.GetDirectoryReference(string.Empty);
    // true lists blobs in all subdirectories (flat listing), false lists only the top level.
    // Note: with a null continuation token this returns only the first segment of results.
    var rootDirFolders = dira.ListBlobsSegmentedAsync(true, BlobListingDetails.Metadata, null, null, null, null).Result;

    foreach (var blob in rootDirFolders.Results)
    {
        if (blob is CloudBlockBlob blockBlob)
        {
            var time = blockBlob.Properties.LastModified;
            // Include placeholders so the values are actually printed.
            Console.WriteLine("{0}: {1}", blockBlob.Name, time);
        }
    }
}
catch (Exception e)
{
    // Fails here if, for example, the specified container does not exist.
    Console.WriteLine("Error: {0}", e);
}
ASHISH R
3

The previous answers are out of date for the new V12 NuGet package. I used the following guide to help upgrade from version 9 to version 12: https://elcamino.cloud/articles/2020-03-30-azure-storage-blobs-net-sdk-v12-upgrade-guide-and-tips.html

The new NuGet package is Azure.Storage.Blobs, and I used version 12.8.4.

The following code gets the last-modified date of a single blob. An async version of this code could be written as well.

using System;
using Azure.Storage.Blobs;

DateTimeOffset? GetLastModified()
{
    BlobServiceClient blobServiceClient = new BlobServiceClient("connectionstring");
    // The argument here is the container name, not a blob name.
    BlobContainerClient blobContainerClient = blobServiceClient.GetBlobContainerClient("containername");
    BlobClient blobClient = blobContainerClient.GetBlobClient("file.txt");
    // GetBlobClient never returns null; Exists() asks the service whether the blob is there.
    if (!blobClient.Exists().Value) return null;
    DateTimeOffset lastModified = blobClient.GetProperties().Value.LastModified;
    return lastModified;
}
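To find the most recently modified blob rather than the date of one known blob, one option is a single pass over the listing that keeps the newest item seen so far, which avoids sorting the whole list. This is only a sketch: LatestBlobFinder and LatestBy are hypothetical names, and the helper is generic so that it does not depend on any particular SDK version.

```csharp
using System;
using System.Collections.Generic;

public static class LatestBlobFinder
{
    // Returns the item with the greatest timestamp, or null if no item has one.
    public static T LatestBy<T>(IEnumerable<T> items, Func<T, DateTimeOffset?> lastModified)
        where T : class
    {
        T latest = null;
        DateTimeOffset? latestTime = null;

        foreach (T item in items)
        {
            DateTimeOffset? time = lastModified(item);

            // Skip items without a timestamp; keep the newest one seen so far.
            if (time.HasValue && (!latestTime.HasValue || time.Value > latestTime.Value))
            {
                latest = item;
                latestTime = time;
            }
        }

        return latest;
    }
}
```

With the V12 package this could be called as LatestBlobFinder.LatestBy(blobContainerClient.GetBlobs(), b => b.Properties.LastModified), and the Name of the returned item would then be the blob path. It is still an O(n) scan over the container.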
rollsch
  • OP question is to get the latest modified file, here you're just getting the last modified date of a specific file? not sure how it solves OP question – user1075613 Jun 02 '21 at 16:39
  • Loop over the blobs. I was mainly trying to show the new API. – rollsch Jun 03 '21 at 21:41
2

Use the Azure WebJobs SDK. The SDK has options to monitor for new or updated blobs.

viperguynaz
  • i want to use the same for azure file share storage , how can i use azure web jobs sdk ? – Neo Aug 27 '18 at 12:23
1

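If that causes issues, you can use blockBlob.Container.Properties.LastModified instead, but note that this is the last-modified time of the container itself, not of the individual blob.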

Suraj Rao
Prashant N
0

With Microsoft.Azure.Storage.Blob you can get it as follows:

using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.Storage;
using Microsoft.Azure.Storage.Blob;

namespace ListLastModificationOnBlob
{
    class Program
    {
        static void Main(string[] args)
        {
            MainAsync().Wait();
        }

        static async Task MainAsync()
        {
            string storageAccount_connectionString = @"Your connection string";

            // Retrieve storage account from connection string.
            CloudStorageAccount storageAccount = CloudStorageAccount.Parse(storageAccount_connectionString);

            // Create the blob client.
            CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

            var containers = await ListContainersAsync(blobClient);

            foreach (var container in containers)
            {
                Console.WriteLine(container.Name);

                try
                {
                    //root directory
                    CloudBlobDirectory dira = container.GetDirectoryReference(string.Empty);
                    //true for all sub directories else false 
                    var rootDirFolders = await dira.ListBlobsSegmentedAsync(true, BlobListingDetails.Metadata, null, null, null, null);

                    using (var w = new StreamWriter($"{container.Name}.csv"))
                    {
                        foreach (var blob in rootDirFolders.Results)
                        {
                            if (blob is CloudBlob blockBlob)
                            {
                                var time = blockBlob.Properties.LastModified;
                                var created = blockBlob.Properties.Created;

                                var line = $"{blockBlob.Name},{created},{time}";
                                await w.WriteLineAsync(line);
                                await w.FlushAsync();
                            }
                        }
                    }
                }
                catch (Exception e)
                {
                    //  Block of code to handle errors
                    Console.WriteLine("Error: {0}", e);

                }
            }
        }

        private static async Task<IEnumerable<CloudBlobContainer>> ListContainersAsync(CloudBlobClient cloudBlobClient)
        {
            BlobContinuationToken continuationToken = null;
            var containers = new List<CloudBlobContainer>();

            do
            {
                ContainerResultSegment response = await cloudBlobClient.ListContainersSegmentedAsync(continuationToken);
                continuationToken = response.ContinuationToken;
                containers.AddRange(response.Results);

            } while (continuationToken != null);

            return containers;
        }
    }
}

For a given storage account, the code above:

  • gets all containers in the account
  • lists the blobs in each container
  • saves Created and LastModified along with the blob name in a CSV file (named after the container)
Krzysztof Madej
0

Using rollsch's and hbd's methods, I was able to get the latest image like so:

public string File;

public async Task OnGetAsync()
{
    var gettingLastModified = _blobServiceClient
        .GetBlobContainerClient("images")
        .GetBlobs()
        .OrderByDescending(m => m.Properties.LastModified)
        .First();

    LatestImageFromAzure = gettingLastModified.Name;

    File = await _blobService.GetBlob(LatestImageFromAzure, "images");
}

I was also using these methods https://www.youtube.com/watch?v=B_yDG35lb5I&t=1864s

Peter Hedberg
SeanMcP