Major storage types are block storage, object storage and file storage.
Block Storage
- Examples are hard disk drives and SSDs, on which operating systems and databases can run. With this type of storage, the OS accesses the disk block by block.
- Equivalent managed cloud offerings are Azure managed disks and AWS EBS.
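
To make "block by block" concrete, here is a minimal sketch that treats a temporary file as a stand-in for a disk and addresses it by block number. The 4096-byte block size and the helper names `read_block` / `write_block` are assumptions for illustration only; a real OS talks to the device through its driver, not through a Python file object.

```python
import tempfile

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def read_block(device, block_no):
    """Return the bytes of one fixed-size block, addressed by its number."""
    device.seek(block_no * BLOCK_SIZE)
    return device.read(BLOCK_SIZE)


def write_block(device, block_no, payload):
    """Overwrite exactly one block in place."""
    if len(payload) != BLOCK_SIZE:
        raise ValueError("payload must be exactly one block long")
    device.seek(block_no * BLOCK_SIZE)
    device.write(payload)


# A temporary file plays the role of a tiny 4-block "disk".
with tempfile.TemporaryFile() as device:
    device.write(bytes(BLOCK_SIZE * 4))            # zero-fill 4 blocks
    write_block(device, 2, b"\x01" * BLOCK_SIZE)   # update only block 2
    block1 = read_block(device, 1)                 # untouched, still zeros
    block2 = read_block(device, 2)                 # the block just written
```

The point of the sketch is that any block can be read or rewritten in place without touching its neighbours, which is what lets operating systems and databases run on this kind of storage.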
Azure Blob Storage
- It's object storage with a flat namespace (no hierarchy), comparable to AWS S3.
- An object is a bundle of the file's data together with an identifier and metadata.
- There is no concept of folders.
- It's usually accessed over HTTPS and costs much less than other types of storage.
- Best used to store structured, semi-structured and unstructured (binary) data of all types for long durations.
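
The object model described above can be sketched with a toy in-memory store. This is not the Azure SDK; `BlobObject`, `FlatObjectStore` and the sample keys are hypothetical names used only to show that an object is data plus identifier plus metadata, and that "folders" are just a prefix convention on a flat key space.

```python
from dataclasses import dataclass, field


@dataclass
class BlobObject:
    """One object: the payload plus its identifier and metadata."""
    key: str
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Toy in-memory object store with a single flat key space."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        self._objects[key] = BlobObject(key, data, dict(metadata))

    def get(self, key):
        return self._objects[key]

    def list_prefix(self, prefix):
        # There are no folder objects: listing a "folder" is a scan
        # over all keys that happen to start with the prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = FlatObjectStore()
store.put("logs/2024/app.log", b"started", content_type="text/plain")
store.put("logs/2024/db.log", b"connected", content_type="text/plain")
store.put("images/cat.png", b"\x89PNG", content_type="image/png")

print(store.list_prefix("logs/"))
```

Note that `logs/2024/` never exists as an entity of its own here; it only appears as the shared beginning of two object keys.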
Azure File Storage
- It's file storage that can be accessed over NFS or SMB, similar to the Windows shares available in office/workstation environments.
- Here there is a concept of files and folders, but operations on them are not very performant.
- Object storage is now more widely used than file storage.
Azure Data Lake Storage Gen2 - Azure's managed distributed file system for Hadoop compute and storage, with semantics compatible with HDFS.
Blob storage vs ADLS is described here - https://stackoverflow.com/a/76038745/6563567
Although the two might seem much the same, analytical workloads like Databricks can in some cases work far more efficiently with ADLS. Apart from this, it's 100% HDFS compatible and provides Linux-like ACLs on files and folders.
The real benefit of ADLS is that it's very efficient to move files,
rename files, move folders, rename folders, etc. ADLS's efficient
directory manipulation is beneficial for analytics workloads like
Databricks/Spark, which operate best on file systems.
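
Why folder renames differ so much can be sketched with two toy models. This is a simplified illustration, not how either service is implemented: it assumes that in a flat namespace a "folder rename" means copying and deleting every object under the prefix, while in a hierarchical namespace it is a single change to one directory entry. The function names and sample paths are hypothetical.

```python
def rename_folder_flat(objects, old_prefix, new_prefix):
    """Flat namespace: each object under the prefix gets a new key,
    so the work grows with the number of objects."""
    touched = 0
    for key in [k for k in objects if k.startswith(old_prefix)]:
        objects[new_prefix + key[len(old_prefix):]] = objects.pop(key)
        touched += 1
    return touched


def rename_folder_hierarchical(tree, parent_path, old_name, new_name):
    """Hierarchical namespace: re-link one directory entry in its parent;
    the children are not touched."""
    parent = tree
    for part in parent_path:
        parent = parent[part]
    parent[new_name] = parent.pop(old_name)
    return 1


# 1000 files in one "folder", modelled both ways.
flat = {f"raw/2024/part-{i:04d}.parquet": b"" for i in range(1000)}
tree = {"raw": {"2024": {f"part-{i:04d}.parquet": b"" for i in range(1000)}}}

flat_ops = rename_folder_flat(flat, "raw/2024/", "raw/archive/")
tree_ops = rename_folder_hierarchical(tree, ["raw"], "2024", "archive")

print(flat_ops, tree_ops)  # 1000 1
```

Jobs that write output to a temporary directory and then rename it into place, a common pattern in Spark-style engines, hit this difference on every commit.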