How to Upload Files to Azure Storage Blobs Using Python

Azure Blob Storage offers a cost-effective and fast storage solution for unstructured data. It is an ideal choice if you want to serve content such as images. You can also speed up content delivery by putting the Azure CDN service in front of it.

Before running the following programs, make sure the prerequisites are in place. The sample Python programs below use the Python SDK v12 for Azure Blob Storage.

Install Python 3.6 or above. On macOS, use Homebrew to install Python 3:

brew install python3

Install the Azure Blob Storage client library for Python:

pip3 install azure-storage-blob --user

Using the Azure portal, create an Azure Storage v2 account and a container before running the following programs. You will also need to copy the connection string for your storage account from the Azure portal. If you want the uploaded images to be publicly readable, set the container's public access level to "Blob (anonymous read access for blobs only)".
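
The connection string contains the account key, so it may be safer to keep it out of the source file. The following is a minimal sketch, not part of the original programs, that reads it from an environment variable and checks its shape before use. The variable name AZURE_STORAGE_CONNECTION_STRING and both helper functions are assumptions made for illustration, and the check assumes an account-key connection string as copied from the portal.

```python
import os

# Assumed variable name; any name works as long as the script and shell agree.
ENV_VAR_NAME = "AZURE_STORAGE_CONNECTION_STRING"


def parse_connection_string(connection_string):
    """Split 'Key=Value;Key=Value' into a dict, for a quick sanity check."""
    parts = {}
    for segment in connection_string.split(";"):
        if not segment.strip():
            continue
        # Split on the first '=' only: account keys are base64 and may end in '='.
        key, _, value = segment.partition("=")
        parts[key.strip()] = value.strip()
    return parts


def load_connection_string(env=os.environ):
    """Read the connection string from the environment and check its shape."""
    connection_string = env.get(ENV_VAR_NAME, "")
    parts = parse_connection_string(connection_string)
    # Assumption: an account-key connection string, as shown in the portal.
    missing = [k for k in ("AccountName", "AccountKey") if not parts.get(k)]
    if missing:
        raise ValueError(f"{ENV_VAR_NAME} is missing: {', '.join(missing)}")
    return connection_string
```

With this in place, MY_CONNECTION_STRING = load_connection_string() could replace the hard-coded value in the programs below.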

How to Upload Files to Azure Storage Blobs Using Python

The following program demonstrates a typical use case: bulk uploading a set of jpg images from a local folder to an Azure Blob Storage container. Note that this program uploads the images one at a time, so it may be slow for a large number of files. Replace MY_CONNECTION_STRING, MY_IMAGE_CONTAINER and LOCAL_IMAGE_PATH before running the program.

# upload_blob_images.py
# Python program to bulk upload jpg image files as blobs to Azure storage
# Uses the Python SDK v12 for Azure Blob Storage
# Requires Python 3.6 or above
import os
from azure.storage.blob import BlobServiceClient, ContentSettings

# IMPORTANT: Replace connection string with your storage account connection string
# Usually starts with DefaultEndpointsProtocol=https;...
MY_CONNECTION_STRING = "REPLACE_THIS"

# Replace with blob container. This should be already created in azure storage.
MY_IMAGE_CONTAINER = "myimages"

# Replace with the local folder which contains the image files for upload
LOCAL_IMAGE_PATH = "REPLACE_THIS"

class AzureBlobFileUploader:
  def __init__(self):
    print("Initializing AzureBlobFileUploader")

    # Initialize the connection to Azure storage account
    self.blob_service_client = BlobServiceClient.from_connection_string(MY_CONNECTION_STRING)

  def upload_all_images_in_folder(self):
    # Get all files with a .jpg extension (case-insensitive) and exclude directories
    all_file_names = [f for f in os.listdir(LOCAL_IMAGE_PATH)
                    if os.path.isfile(os.path.join(LOCAL_IMAGE_PATH, f))
                    and f.lower().endswith(".jpg")]

    # Upload each file
    for file_name in all_file_names:
      self.upload_image(file_name)

  def upload_image(self, file_name):
    # Create blob with same name as local file name
    blob_client = self.blob_service_client.get_blob_client(container=MY_IMAGE_CONTAINER,
                                                          blob=file_name)
    # Get full path to the file
    upload_file_path = os.path.join(LOCAL_IMAGE_PATH, file_name)

    # Create blob on storage
    # Overwrite if it already exists!
    image_content_setting = ContentSettings(content_type='image/jpeg')
    print(f"uploading file - {file_name}")
    with open(upload_file_path, "rb") as data:
      blob_client.upload_blob(data, overwrite=True, content_settings=image_content_setting)


# Initialize class and upload files
azure_blob_file_uploader = AzureBlobFileUploader()
azure_blob_file_uploader.upload_all_images_in_folder()
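
Selecting files by extension is easy to get subtly wrong: a substring test would also match a name like notes.jpg.txt, and a case-sensitive test would skip PHOTO.JPG. The following is a minimal sketch of a stricter filter using pathlib. The function name list_jpeg_files and the decision to also accept .jpeg are assumptions for illustration, not part of the program above.

```python
from pathlib import Path

# Assumed set of extensions that should be treated as JPEG images.
JPEG_SUFFIXES = {".jpg", ".jpeg"}


def list_jpeg_files(folder):
    """Return sorted names of regular files in `folder` with a JPEG extension."""
    return sorted(
        entry.name
        for entry in Path(folder).iterdir()
        # suffix is only the final extension, so "notes.jpg.txt" is rejected
        if entry.is_file() and entry.suffix.lower() in JPEG_SUFFIXES
    )
```

The returned list could be passed to the upload loop in place of the names collected with os.listdir.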

Parallel Bulk Upload of Files to Azure Storage Blobs Using Python

The following Python program is an improved version of the program above. It uses a thread pool to upload a fixed number of images in parallel. The program uses a thread pool size of 10, but you can increase it for faster uploads if you have sufficient network bandwidth. If you don't specify a content type, the blob's content type defaults to application/octet-stream.

# upload_blob_images_parallel.py
# Python program to bulk upload jpg image files as blobs to Azure storage
# Uses ThreadPool for faster parallel uploads!
# Uses the Python SDK v12 for Azure Blob Storage
# Requires Python 3.6 or above
import os
from multiprocessing.pool import ThreadPool
from azure.storage.blob import BlobServiceClient, ContentSettings

# IMPORTANT: Replace connection string with your storage account connection string
# Usually starts with DefaultEndpointsProtocol=https;...
MY_CONNECTION_STRING = "REPLACE_THIS"

# Replace with blob container
MY_IMAGE_CONTAINER = "myimages"

# Replace with the local folder which contains the image files for upload
LOCAL_IMAGE_PATH = "REPLACE_THIS"

class AzureBlobFileUploader:
  def __init__(self):
    print("Initializing AzureBlobFileUploader")

    # Initialize the connection to Azure storage account
    self.blob_service_client = BlobServiceClient.from_connection_string(MY_CONNECTION_STRING)

  def upload_all_images_in_folder(self):
    # Get all files with a .jpg extension (case-insensitive) and exclude directories
    all_file_names = [f for f in os.listdir(LOCAL_IMAGE_PATH)
                    if os.path.isfile(os.path.join(LOCAL_IMAGE_PATH, f))
                    and f.lower().endswith(".jpg")]

    result = self.run(all_file_names)
    print(result)

  def run(self, all_file_names):
    # Upload 10 files at a time!
    with ThreadPool(processes=10) as pool:
      return pool.map(self.upload_image, all_file_names)

  def upload_image(self, file_name):
    # Create blob with same name as local file name
    blob_client = self.blob_service_client.get_blob_client(container=MY_IMAGE_CONTAINER,
                                                          blob=file_name)
    # Get full path to the file
    upload_file_path = os.path.join(LOCAL_IMAGE_PATH, file_name)

    # Create blob on storage
    # Overwrite if it already exists!
    image_content_setting = ContentSettings(content_type='image/jpeg')
    print(f"uploading file - {file_name}")
    with open(upload_file_path, "rb") as data:
      blob_client.upload_blob(data, overwrite=True, content_settings=image_content_setting)
    return file_name

# Initialize class and upload files
azure_blob_file_uploader = AzureBlobFileUploader()
azure_blob_file_uploader.upload_all_images_in_folder()
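
One thing to keep in mind with pool.map is that it re-raises the first exception raised in a worker, so the caller does not get a per-file report of which uploads succeeded. The following is a minimal sketch of an alternative that records failures and carries on, using concurrent.futures from the standard library instead of multiprocessing.pool. The function name upload_many and its upload_fn parameter are assumptions for illustration; upload_fn stands for any callable that uploads one file, such as the upload_image method above.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def upload_many(file_names, upload_fn, max_workers=10):
    """Call upload_fn(name) for each name in parallel and collect failures.

    Returns (sorted list of names that succeeded, dict of name -> exception).
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the file name it is uploading.
        futures = {executor.submit(upload_fn, name): name for name in file_names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                future.result()
                succeeded.append(name)
            except Exception as error:  # keep going and report at the end
                failed[name] = error
    return sorted(succeeded), failed
```

A call such as upload_many(all_file_names, self.upload_image) inside the class would then return both lists, and the failed names could be retried or logged.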

Using the ContentSettings object, you can set the content type, content encoding, content MD5 or cache control for the blobs. See the ContentSettings reference in the Azure SDK for Python documentation for the full set of attributes.
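
As an illustration, the sketch below derives the content type from the file name with the standard mimetypes module instead of hard-coding image/jpeg, and optionally adds a cache control value. Both function names and the example value "public, max-age=86400" are assumptions for illustration; content_type and cache_control are keyword arguments of ContentSettings.

```python
import mimetypes


def build_content_settings_kwargs(file_name, cache_control=None):
    """Return keyword arguments suitable for ContentSettings(**kwargs)."""
    content_type, _encoding = mimetypes.guess_type(file_name)
    # Fall back to the same default the service uses when no type is given.
    kwargs = {"content_type": content_type or "application/octet-stream"}
    if cache_control:
        kwargs["cache_control"] = cache_control
    return kwargs


def make_content_settings(file_name, cache_control=None):
    """Build a ContentSettings object for the given file name."""
    # Imported here so the helper above can be tried without the SDK installed.
    from azure.storage.blob import ContentSettings
    return ContentSettings(**build_content_settings_kwargs(file_name, cache_control))
```

For example, make_content_settings("photo.png", cache_control="public, max-age=86400") could be passed as the content_settings argument of upload_blob.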