Azure Storage provides cloud storage that is highly available, secure, durable, scalable and redundant. It includes:
- Azure Blobs (objects)
- Azure Data Lake Storage Gen2
- Azure Files
- Azure Queues
- Azure Tables
Blobs - Scalable object storage in Azure
Blob storage is massively scalable object storage for unstructured data such as images, videos, audio and documents. These objects can be stored in the hot, cool or archive tier depending on how frequently they are accessed.
- Hot: for frequently accessed data
- Cool: for infrequently accessed data
- Archive: for rarely accessed data
- Premium (new): for data that needs low-latency access
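As a rough illustration of picking a tier by access frequency, here is a minimal Python sketch. The function name and the 30-day and 180-day cut-offs are assumptions made for this example only; they are not an Azure API, and the right thresholds depend on your own access patterns and pricing.

```python
from enum import Enum


class AccessTier(Enum):
    HOT = "Hot"
    COOL = "Cool"
    ARCHIVE = "Archive"


# Hypothetical cut-offs, chosen only for illustration.
COOL_AFTER_DAYS = 30
ARCHIVE_AFTER_DAYS = 180


def suggest_tier(days_since_last_access: int) -> AccessTier:
    """Suggest an access tier from how long ago the data was last read."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must not be negative")
    if days_since_last_access >= ARCHIVE_AFTER_DAYS:
        return AccessTier.ARCHIVE  # rarely accessed
    if days_since_last_access >= COOL_AFTER_DAYS:
        return AccessTier.COOL  # infrequently accessed
    return AccessTier.HOT  # frequently accessed
```

For example, `suggest_tier(45)` suggests the cool tier under these assumed thresholds.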
Storage account
All access to data objects in Azure Storage happens through a storage account. It contains all your blobs, files, queues, tables and disks. The available account kinds include General-purpose v1, General-purpose v2 and Blob storage. Microsoft recommends General-purpose v2 for most scenarios, and you can upgrade a v1 account without any downtime.
All data in a storage account is encrypted on the server side using 256-bit AES encryption. Encryption does not affect Azure Storage performance. By default, access is allowed only to the account owner. You can control access with Azure AD, shared key authorization, etc.
You can define CORS rules for a storage account, for example to list the origins (URLs) that browsers are allowed to call it from.
Container
A container organizes a set of blobs, similar to a folder in a file system. All blobs reside within a container. A storage account can include an unlimited number of containers, and a container can store an unlimited number of blobs.
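The account, container and blob hierarchy shows up directly in a blob's address. The sketch below builds that URL and checks the container name against the naming rules (3 to 63 characters; lowercase letters, digits and single hyphens; starting and ending with a letter or digit). The helper names are made up for this example, and it assumes the default public-cloud endpoint `blob.core.windows.net`.

```python
import re
from urllib.parse import quote

# Lowercase letters and digits, with single hyphens allowed only between them.
_CONTAINER_NAME = re.compile(r"[a-z0-9](?:-?[a-z0-9])*")


def is_valid_container_name(name: str) -> bool:
    """Check a container name against the length and character rules."""
    return 3 <= len(name) <= 63 and _CONTAINER_NAME.fullmatch(name) is not None


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the URL of a blob; '/' in the blob name is kept as a virtual folder."""
    if not is_valid_container_name(container):
        raise ValueError(f"invalid container name: {container!r}")
    return f"https://{account}.blob.core.windows.net/{container}/{quote(blob_name)}"
```

Although containers are flat, a blob name such as `photos/cat 1.png` behaves like a path inside the container, which is why the slash is left unescaped.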
Blob
- Block blobs store text and binary data
- Append blobs are made up of blocks like block blobs, but are optimized for append operations. Append blobs are ideal for scenarios such as logging data from virtual machines
- Page blobs store random access files up to 8 TB in size. Page blobs store the VHD files that back VMs
There may be times when large datasets and network constraints make uploading data to Blob storage over the wire unrealistic. In that case you can use Azure Data Box Disk to request solid-state disks (SSDs) from Microsoft, copy your data to those disks and ship them back to Microsoft to be uploaded into Blob storage. (Covered below)
To keep your data durable, Azure Storage replicates it into multiple copies. You can choose from the following replication options:
- Locally redundant storage (LRS)
- Zone-redundant storage (ZRS)
- Geo-redundant storage (GRS)
- Read-access geo-redundant storage (RA-GRS)
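To make the differences between these options concrete, the following sketch encodes them as a small lookup table and filters it by requirement. The class, dictionary and function names are invented for this example. The SKU strings and copy counts reflect my understanding of the standard options (three copies within one region for LRS and ZRS, three more in a paired region for GRS and RA-GRS), so check them against the current documentation before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redundancy:
    sku: str                      # SKU name used when creating the account
    copies: int                   # total number of copies kept
    survives_region_outage: bool  # data is also stored in a second region
    secondary_readable: bool      # second region can be read at any time


OPTIONS = {
    "LRS": Redundancy("Standard_LRS", 3, False, False),
    "ZRS": Redundancy("Standard_ZRS", 3, False, False),
    "GRS": Redundancy("Standard_GRS", 6, True, False),
    "RA-GRS": Redundancy("Standard_RAGRS", 6, True, True),
}


def options_for(needs_region_failover: bool, needs_secondary_reads: bool) -> list:
    """Return the option names that satisfy the given requirements."""
    return [
        name
        for name, opt in OPTIONS.items()
        if (opt.survives_region_outage or not needs_region_failover)
        and (opt.secondary_readable or not needs_secondary_reads)
    ]
```

For instance, `options_for(True, False)` narrows the list to the two geo-redundant options.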
Getting started is very easy. As described in the Microsoft docs tutorial, clone the Git repo, set the connection string of your storage account and run the sample. It will:
- Create a blob container under the selected account
- Get a reference to the blob container
- Set its access level to public
- Create a file in the local environment
- Get a reference to a block blob
- Upload the created file to the block blob
- List all block blobs
- Download the uploaded file
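The steps above can be sketched in Python roughly as follows. This is not the tutorial's own code: it is a minimal sketch that assumes the `azure-storage-blob` (v12) package is installed, and the function name, container name prefix and file names are invented for illustration. Making the container public is left optional because many storage accounts disallow anonymous access, in which case that call fails.

```python
import os
import tempfile
import uuid
from typing import List


def run_quickstart(connection_string: str, make_public: bool = False) -> List[str]:
    """Walk through the quickstart steps and return the blob names found."""
    # Imported here so this module still loads when the SDK is not installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)

    # Create a container and keep the client that references it.
    container = service.create_container(f"quickstart-{uuid.uuid4().hex}")

    # Optionally allow anonymous read access to the blobs in the container.
    if make_public:
        container.set_container_access_policy(
            signed_identifiers={}, public_access="blob"
        )

    with tempfile.TemporaryDirectory() as workdir:
        # Create a file in the local environment.
        local_path = os.path.join(workdir, "hello.txt")
        with open(local_path, "w", encoding="utf-8") as f:
            f.write("Hello, Blob storage!")

        # Get a reference to a block blob and upload the file to it.
        blob = container.get_blob_client("hello.txt")
        with open(local_path, "rb") as data:
            blob.upload_blob(data, overwrite=True)

        # List all blobs in the container.
        names = [item.name for item in container.list_blobs()]

        # Download the uploaded blob to a second local file.
        download_path = os.path.join(workdir, "hello-downloaded.txt")
        with open(download_path, "wb") as out:
            out.write(blob.download_blob().readall())

    return names
```

Calling `run_quickstart(os.environ["AZURE_STORAGE_CONNECTION_STRING"])` would run it against a real account; the environment variable name is only a common convention. The container is not deleted afterwards, so clean it up yourself if you try this.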
Best Practices
There are alternative cloud storage providers as well, such as Amazon Simple Storage Service (S3). See a comparison of Azure Blob vs Amazon S3.
A little more on Amazon S3:
Many large companies use Amazon S3. One of them is Netflix (Why Netflix migrated to AWS), which even keeps its servers geographically redundant. Other well-known customers such as Airbnb also use Amazon S3 for storage.
Azure Files
Azure Files offers file shares in the cloud that are accessible via the standard Server Message Block (SMB) protocol. Azure Files can be used to:
- Replace on-premises file servers
- Lift and shift applications (a strategy for moving an application or operation from one environment to another without redesigning it)
Azure Data Lake
A data lake is a storage repository, usually in Hadoop, that holds a vast amount of raw data in its native format until it is needed (Why use a data lake). It is a central repository that allows you to store all your structured and unstructured data at any scale. You can then run analytics on it, from dashboards and visualizations to big data processing and machine learning, to make better decisions. (Best Practices, Azure Data Lake Storage Gen1 vs Azure Blob Storage, Azure vs AWS)
Azure Queue Storage
Azure Queue Storage provides asynchronous cloud messaging between app components. A single queue message can be up to 64 KB in size, and a queue can contain millions of messages.
Common uses include:
- Creating a backlog of work to process asynchronously
- Passing messages from an Azure web role to an Azure worker role
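Because of the 64 KB message limit mentioned above, it can help to check a payload before enqueueing it. The sketch below is an assumption-laden illustration: the function and constant names are invented, it treats 64 KB as 64 * 1024 bytes, and it accounts for Base64 encoding (which some clients apply to message bodies) growing the payload by about a third.

```python
import base64

# Assumed interpretation of the 64 KB limit: 64 * 1024 bytes.
MAX_MESSAGE_BYTES = 64 * 1024


def fits_in_queue_message(payload: bytes, base64_encode: bool = False) -> bool:
    """Return True if the payload, as it would be sent, fits in one message."""
    body = base64.b64encode(payload) if base64_encode else payload
    return len(body) <= MAX_MESSAGE_BYTES
```

With Base64 encoding, the largest raw payload that still fits under this assumption is 48 KB. For anything larger, a common pattern is to store the data as a blob and put only a reference to it in the queue message.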
An alternative to Queue storage is Service Bus (Comparison), which is aimed more at enterprise messaging. Amazon has no direct equivalent of Service Bus, but it offers Simple Queue Service (SQS) for the same purpose. Google Cloud has Cloud Pub/Sub.
Azure CosmosDB (Alternative to Table Storage)
Table storage is a NoSQL key-value store for rapid development using large semi-structured datasets. It uses JSON to serialize data and supports OData-based queries. As an alternative, Azure Cosmos DB can be used for the same purpose. It offers throughput-optimized tables, global distribution and automatic secondary indexes. (Azure table storage vs CosmosDB Table API). Azure Cosmos DB also comes with built-in security features such as encryption at rest.
The AWS alternative to Cosmos DB is DynamoDB (See a comparison).
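To give a feel for the OData-based queries mentioned above, here is a small sketch that builds a filter string for looking up a single entity by its `PartitionKey` and `RowKey`, the two key properties every table entity has. The helper names are invented for this example. It only builds the string and does not call any Azure SDK.

```python
def odata_quote(value: str) -> str:
    """Quote a string literal for an OData filter (single quotes are doubled)."""
    return "'" + value.replace("'", "''") + "'"


def point_query_filter(partition_key: str, row_key: str) -> str:
    """Build a filter that matches exactly one entity by its two keys."""
    return (
        f"PartitionKey eq {odata_quote(partition_key)} "
        f"and RowKey eq {odata_quote(row_key)}"
    )
```

For example, `point_query_filter("customers", "O'Brien")` produces `PartitionKey eq 'customers' and RowKey eq 'O''Brien'`. Filtering on both keys is generally the cheapest kind of query because it targets a single entity.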
Azure Data Box
Azure Data Box is a physical device that you use on your own premises to transfer terabytes of data between your site and the Azure cloud offline, which is easier when a busy network isn't an option.
Data Box accommodates both offline and online (it creates a link between your site and Azure) scenarios.
Case Study
Cloud Storage options in AWS
AWS also has many cloud storage options, which are highlighted in the following article: Cloud Storage options in AWS.
Cloud Storage AWS vs Azure
It's natural to ask how Azure's options compare with the alternatives. Having gone through the options available in both AWS and Azure, here's a comparison of the two platforms.