How to Find the Largest Document in MongoDB

MongoDB collections are usually schemaless, which means that a collection can contain documents with different structures. Even when documents share the same structure, their sizes can vary substantially because the data stored in them differs. Sometimes you may want to find the largest document in a MongoDB collection. This is needed, for example, if you are planning to migrate your MongoDB database to a cloud database such as Azure Cosmos DB, which restricts the maximum size of a single document in a collection (currently 2 MB).

How to Find the Largest Document in a MongoDB Collection?

The following script finds the largest document in a collection. The example uses the “store” collection in my database. Replace it with your collection name.

var max = 0, id = null;

// iterate through each document
db.store.find().forEach(doc => {
    var size = Object.bsonsize(doc);
    if(size > max) {
        max = size;
        id = doc._id;
    }
});

// document id and size of the largest doc in bytes
print(""+id+", "+max);

Note that the above script loads every document in the collection to compute its size, so it can be very slow on large collections.

How to Find the Largest Document in a MongoDB Database Across Collections?

The following script finds the largest document across all collections in the MongoDB database. It prints the collection name, document id and size in bytes as output. For large databases, this may take minutes or even hours. Also note that it loads every document to compute its size and hence can incur a lot of network traffic.

var max = 0, id = null, c = null;

// Iterate through all the collections!
db.getCollectionNames().forEach(function(cName) {

    // Now iterate through each document in the collection
    // getCollection() also works for names that clash with shell methods
    db.getCollection(cName).find().forEach(doc => {
        var size = Object.bsonsize(doc);
        if(size > max) {
            max = size;
            id = doc._id;
            c = cName;
        }
    });
});

// Collection name, document id and size of the largest doc in bytes
print(c+", "+id+", "+max);

If you are using MongoDB 4.4 or later, you can use the aggregation operator $bsonSize to compute the size of each document on the server. This avoids transferring every document to the client and can substantially speed up the computation.
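
As a rough illustration of the $bsonSize approach, the following is a minimal sketch and not a tuned implementation. It assumes MongoDB 4.4 or later and reuses the “store” collection from the earlier example. The helper name largestDocPipeline is made up for this sketch. The pipeline computes the size of each document on the server, sorts by size in descending order and keeps only the first result.

```javascript
// Hypothetical helper: builds a pipeline that returns the _id and
// BSON size (in bytes) of the largest document in a collection.
function largestDocPipeline() {
    return [
        // $$ROOT refers to the whole document; _id is kept by default
        { $project: { size: { $bsonSize: "$$ROOT" } } },
        // largest documents first
        { $sort: { size: -1 } },
        // keep only the single largest document
        { $limit: 1 }
    ];
}

// Run the pipeline only when a shell "db" object is available.
// "store" is the example collection name; replace it with your own.
if (typeof db !== "undefined") {
    db.store.aggregate(largestDocPipeline()).forEach(doc => {
        // document id and size of the largest doc in bytes
        print("" + doc._id + ", " + doc.size);
    });
}
```

To search across all collections, the same pipeline could be run for each name returned by db.getCollectionNames(), keeping track of the largest size seen so far as in the script above.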
