How to Calculate the MD5 Hash of a File in Python

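MD5 is a message digest algorithm that produces a fixed-size 128-bit value from input data of any length. MD5 is commonly used to check whether a file was corrupted during transfer (in this case the hash value is known as a checksum). Any accidental change in the file will almost certainly lead to a different MD5 hash value. The value is not strictly unique, however: different inputs can be deliberately crafted to share the same MD5 hash, so it should not be relied on to detect intentional tampering.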
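
To illustrate how sensitive the hash is to changes in the input, the short sketch below hashes two in-memory byte strings that differ by a single character. The sample strings and variable names are made up for this example and are not part of the programs that follow.

```python
import hashlib

# Two hypothetical inputs that differ only by the trailing period
original = b"The quick brown fox jumps over the lazy dog"
modified = b"The quick brown fox jumps over the lazy dog."

# hexdigest() returns the 128-bit hash as a hexadecimal string
digest_original = hashlib.md5(original).hexdigest()
digest_modified = hashlib.md5(modified).hexdigest()

print(digest_original)
print(digest_modified)
print(digest_original != digest_modified)  # True when the two digests differ
```

Each printed digest is 32 hexadecimal characters long, since every hexadecimal character encodes 4 of the 128 bits.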

The following Python program computes the MD5 hash of a given file. The computed 128-bit MD5 hash is converted to a readable hexadecimal form.

# Python program to find MD5 hash value of a file
import hashlib

filename = input("Enter the file name: ")
with open(filename,"rb") as f:
    data = f.read() # read the entire file as bytes
    readable_hash = hashlib.md5(data).hexdigest()
    print(readable_hash)

Note that the above program may fail for large input files, since it reads the entire file into memory before computing the MD5 hash. The following Python program is an improved version that reads the file in chunks and can therefore handle large files:

# Python program to find MD5 hash value of a file
import hashlib

filename = input("Enter the file name: ")
md5_hash = hashlib.md5()
with open(filename,"rb") as f:
    # Read the file and update the hash in 4 KB chunks
    for byte_block in iter(lambda: f.read(4096),b""):
        md5_hash.update(byte_block)
    print(md5_hash.hexdigest())

Here is the above program in action:

python3 md5hash2.py
Enter the file name: md5hash2.py
0101ae2ac06b8a52154100e37d8bafea
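
For the checksum use case mentioned at the start, the chunked approach can be wrapped in a reusable function and compared against a known value. The following is a minimal sketch under stated assumptions: the function names file_md5 and matches_checksum, the chunk_size parameter, and the temporary demo file are all hypothetical and chosen only for illustration.

```python
import hashlib
import os
import tempfile

def file_md5(path, chunk_size=4096):
    """Return the hexadecimal MD5 hash of the file at path, read in chunks."""
    md5_hash = hashlib.md5()
    with open(path, "rb") as f:
        # f.read() returns b"" at end of file, which stops the iteration
        for byte_block in iter(lambda: f.read(chunk_size), b""):
            md5_hash.update(byte_block)
    return md5_hash.hexdigest()

def matches_checksum(path, expected_hex):
    """Compare the file's MD5 hash with a published checksum, ignoring case."""
    return file_md5(path) == expected_hex.strip().lower()

# Demo with a temporary file (hypothetical content, larger than one chunk)
sample = b"example data " * 1000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(sample)
    demo_path = tmp.name

# Here the expected checksum is computed from the in-memory copy of the data
expected = hashlib.md5(sample).hexdigest()
print(matches_checksum(demo_path, expected))  # chunked and whole-data hashes agree
os.remove(demo_path)
```

In practice the expected value would come from wherever the file was published, rather than being computed locally as in this demo.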