How to Create and Use Indexes in MongoDB Database

MongoDB supports indexes to speed up queries for specific application needs. With suitable indexes, MongoDB doesn’t need to scan every document before returning results. This substantially speeds up queries, especially on large collections where ordering and filtering are needed. MongoDB automatically creates a unique index on the _id field when the collection is created. All other indexes must be created manually.

MongoDB indexes are created on a single field or on a combination of fields, and you can specify ascending or descending order for each indexed field. In the following example I will use a typical collection that stores the customer data of a CRM system. First let us see how adding a simple index on a single field can substantially improve query performance.

Creating and Using Single Field Indexes in MongoDB

Our application uses a collection named customer to store customer data. Let us first create the collection using MongoDB JavaScript commands:

// create a customer collection
db.createCollection("customer");

Let us now insert some test data into the collection. The following script uses a loop to create 100,000 customer records in the collection. Each record contains a customer name and an age. We simulate different ages using the modulus operator on the loop index. This ensures that customer ages are between 10 and 99.

// Create a large customer data set!
for (var i = 1; i <= 100000; i++) {
    // i % 90 is 0..89, so age is 10..99
    db.customer.insert({ name: "jack" + i, age: i % 90 + 10 });
}
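To double-check the age range that the expression i % 90 + 10 produces, here is a small sketch in plain JavaScript. It needs no database. The variable names and the `show` helper are my own, added only for illustration.

```javascript
// Find the smallest and largest age produced by i % 90 + 10,
// using the same loop bounds as the insert script above.
var minAge = Infinity;
var maxAge = -Infinity;
for (var i = 1; i <= 100000; i++) {
    var age = i % 90 + 10;
    if (age < minAge) minAge = age;
    if (age > maxAge) maxAge = age;
}

// print() exists in the mongo shell; fall back to console.log elsewhere.
var show = (typeof print === "function") ? print : console.log;
show("min age: " + minAge + ", max age: " + maxAge);
```

Since i % 90 is never larger than 89, the largest age in the test data is 99.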

Let us now run a query to get the list of customers in ascending order of their age:

// display all customers in ascending order of age. 
db.customer.find().sort( { age: 1 } )

On my machine, the above query took about 0.22 seconds. For a single query this may not seem like much, but when many such queries are run by different users, it can become a major bottleneck on the database.

Let us now create an ascending index on the age field, which we are using in the above sort query.

// We create an index on age to speed up sorted query on age
db.customer.createIndex( { age: 1 } )

Now let us run the sort query again.

// display all customers in ascending order of age. This uses our index.
db.customer.find().sort( { age: 1 } )

This time the query took about 0.002 seconds to run. The query is now more than 100 times faster! Hence it is very important to have proper indexes on your MongoDB collections, especially when you are building a system for a large number of application users with fairly large data. The gain is even greater when the documents in the collection are large.
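Timings vary from machine to machine, so it helps to confirm from the query plan that the index is actually used. The following is a minimal sketch, assuming a standalone server or replica set where explain() returns a queryPlanner.winningPlan tree (sharded clusters and other MongoDB-compatible services may shape the output differently). The helper names collectStages and sortUsesIndex are my own, not part of MongoDB.

```javascript
// Walk an explain() plan tree and collect the stage names it contains.
function collectStages(plan, found) {
    found = found || [];
    if (!plan) return found;
    if (plan.stage) found.push(plan.stage);
    if (plan.inputStage) collectStages(plan.inputStage, found);
    (plan.inputStages || []).forEach(function (child) {
        collectStages(child, found);
    });
    return found;
}

// True when the plan reads from an index (IXSCAN) and needs no
// in-memory SORT stage.
function sortUsesIndex(winningPlan) {
    var stages = collectStages(winningPlan);
    return stages.indexOf("IXSCAN") !== -1 && stages.indexOf("SORT") === -1;
}

// This part only runs inside the mongo shell, where the global db exists.
if (typeof db !== "undefined") {
    var result = db.customer.find().sort({ age: 1 }).explain("executionStats");
    print("index used for sort: " + sortUsesIndex(result.queryPlanner.winningPlan));
    print("execution time (ms): " + result.executionStats.executionTimeMillis);
}
```

Without the index the plan typically contains a COLLSCAN stage feeding a SORT stage. With the index on age it contains an IXSCAN stage and no SORT.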

Note that in the above example the order of the index doesn’t matter. This is because for single field indexes like the one above on age, MongoDB can traverse the index in either direction. However, the order does matter for compound indexes containing multiple fields.

Creating and Using Multi-Field Indexes in MongoDB

Let us now explore a more complex scenario involving compound indexes and multiple sort fields. We will start with an application scenario and see how compound indexes can improve queries substantially.

In our e-commerce application, we have a collection for storing all the retail stores of the company. For simplicity, assume that the collection contains the store name, store country and store city. If you consider this a global retailer, there will be thousands of stores in the collection. Let us create the collection and then add 50,000 stores with random data:

// create a store collection
db.createCollection("store");
// Create a large store data set! 
//For the test data, we randomly create 2 letter strings 
//for country names and randomly create 8 letter strings 
//for city names.
var charset = "abcdefghijklmnopqrstuvwxyz";
for (var i = 1; i <= 50000; i++) {
    var city = "";
    var country = "";
    for (var j = 0; j < 2; j++)
        country += charset[Math.floor(Math.random() * charset.length)];
    for (var k = 0; k < 8; k++)
        city += charset[Math.floor(Math.random() * charset.length)];
    db.store.insert({ name: "store" + i, country: country, city: city });
}

Our e-commerce application displays the list of stores sorted first by country name and then by city name, both in ascending alphabetical order. The query is given below:

db.store.find().sort( { country: 1, city:1 } )

This query takes about 0.18 seconds on my machine. To improve the query performance, I added the following compound index, which contains both country and city. The index order is ascending for both fields.

db.store.createIndex( { country: 1, city:1 } )

With this index in place, I ran the original query again:

db.store.find().sort( { country: 1, city:1 } )

This time the query took less than 0.003 seconds! A huge improvement. Now suppose the application requires the store listing to be sorted in descending order of country followed by descending order of city name. The new query is given below:

db.store.find().sort( { country: -1, city:-1 } )

Again the query took less than 0.003 seconds! This means our index also works when the sort order is reversed for both fields.

Now let us try running the query with ascending order of country followed by descending order of city name.

db.store.find().sort( { country: 1, city:-1 } )

This time query performance falls back to the unindexed behavior. Our compound index cannot serve this sort, because reading the index forwards or backwards never yields ascending countries with descending cities. We need to create another compound index, given below:

db.store.createIndex( { country: 1, city:-1 } )

With this index also in place, queries are fast for all four combinations of sort order on country name and city name.
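Why do two indexes cover four sort orders? Here is a sketch in plain JavaScript (no database needed), under the simplifying assumption that an index behaves like a list of entries kept sorted by its key pattern. The sample stores and the helper names compareBy and keys are made up for illustration.

```javascript
// Compare two entries by country, then city. Each direction is 1 (ascending)
// or -1 (descending), mirroring a key pattern such as { country: 1, city: -1 }.
function compareBy(countryDir, cityDir) {
    return function (a, b) {
        if (a.country !== b.country) return (a.country < b.country ? -1 : 1) * countryDir;
        if (a.city !== b.city) return (a.city < b.city ? -1 : 1) * cityDir;
        return 0;
    };
}

// Render a list of entries as one string so that orders are easy to compare.
function keys(entries) {
    return entries.map(function (e) { return e.country + "/" + e.city; }).join(" ");
}

var stores = [
    { country: "in", city: "pune" },
    { country: "us", city: "austin" },
    { country: "in", city: "delhi" },
    { country: "us", city: "boston" }
];

// Entries in the order an index on { country: 1, city: 1 } would hold them.
var indexOrder = stores.slice().sort(compareBy(1, 1));

var forward = keys(indexOrder);                              // read front to back
var backward = keys(indexOrder.slice().reverse());           // read back to front
var bothDesc = keys(stores.slice().sort(compareBy(-1, -1))); // sort { country: -1, city: -1 }
var mixed = keys(stores.slice().sort(compareBy(1, -1)));     // sort { country: 1, city: -1 }
```

Here bothDesc equals backward, so the descending sort is just the same index read in reverse. The mixed order matches neither direction, which is why it needs its own index. By the same argument, the { country: 1, city: -1 } index read backwards serves the { country: -1, city: 1 } sort.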

Now let us see what happens if you use only one of the fields of the compound index in your sort order:

db.store.find().sort( { country: 1 } )

It is fast! This query uses the compound index even though it sorts only on the country field. This is because in MongoDB, a compound index also supports sorts on its index prefixes (the first field, or the first few fields, of the compound index).

However, the following queries are still slow, since they sort on city, which is the second field of the index and NOT an index prefix.

db.store.find().sort( { city: 1 } )
db.store.find().sort( { city: -1 } )

For the above queries to run faster, you need to add the following index:

db.store.createIndex( { city:1 } )

So if we need all sort order combinations in our application, we will need the following indexes:

db.store.createIndex( { city:1 } )
db.store.createIndex( { country: 1, city:-1 } )
db.store.createIndex( { country: 1, city:1 } )
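The sort rules seen in this section can be condensed into a small check. This is only a sketch of the rules as described above, not MongoDB's actual query planner. It ignores query filters, which can change which indexes apply. The function name indexSupportsSort is my own.

```javascript
// Returns true when a sort can be read straight from an index:
// the sort fields must be a prefix of the index fields, and the directions
// must either all match the index or all be reversed.
function indexSupportsSort(indexKeys, sortKeys) {
    var indexFields = Object.keys(indexKeys);
    var sortFields = Object.keys(sortKeys);
    if (sortFields.length === 0 || sortFields.length > indexFields.length) return false;

    var allSame = true;
    var allReversed = true;
    for (var i = 0; i < sortFields.length; i++) {
        if (sortFields[i] !== indexFields[i]) return false; // not an index prefix
        if (sortKeys[sortFields[i]] === indexKeys[indexFields[i]]) {
            allReversed = false;
        } else {
            allSame = false;
        }
    }
    return allSame || allReversed;
}

// The three indexes created in this section.
var indexes = [
    { city: 1 },
    { country: 1, city: -1 },
    { country: 1, city: 1 }
];

// True when at least one of the indexes supports the given sort.
function anyIndexSupports(sortKeys) {
    return indexes.some(function (idx) { return indexSupportsSort(idx, sortKeys); });
}
```

For example, indexSupportsSort({ country: 1, city: 1 }, { country: -1, city: -1 }) returns true, while the same index with the sort { country: 1, city: -1 } or { city: 1 } returns false. With all three indexes, anyIndexSupports returns true for every sort used in this article.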

MongoDB supports the following types of indexes: single field, compound, multikey, geospatial, text and hashed. Indexes can have different properties as well. See the MongoDB documentation on indexes for more details. Also note that every index increases the storage needs of your MongoDB instance.

Please note that all the examples given above were tested on MongoDB 3.6. These scripts should also work if you are using the MongoDB 3.6 API on a CosmosDB database. In the earlier version of CosmosDB (using MongoDB API 3.2), all top level fields were automatically indexed. This is no longer the case in newer CosmosDB versions. Hence it is important to identify and create the indexes your application features need.
