For most organizations, the main reason for using a NoSQL database is the ability to deal with the storage and compute demands of storing and querying large amounts of data. MongoDB Sharding is the method by which MongoDB handles large amounts of data.
It can be seen as the process in which large datasets are split into smaller datasets that are stored across multiple MongoDB Instances. This is done because querying on large datasets could lead to high CPU utilization on the MongoDB Server.
The following image shows the structure of a MongoDB Database:
A sizable number of Collections make up each MongoDB Database. Each Collection is composed of numerous documents that contain data in the form of Key-Value pairs. A huge Collection is divided into several smaller Collections, known as Shards, through MongoDB Sharding. Large Collections can be divided into smaller units called “Shards” so that MongoDB can process queries while not overtaxing the server.
MongoDB Sharding can be implemented by creating a Cluster of MongoDB Instances. The following image shows how MongoDB Sharding works in a Cluster.
The three main components of Sharded Cluster are as follows:
Shard is the most basic unit of a Shared Cluster that is used to store a subset of the large dataset that has to be divided. Shards are designed in such a way that they are capable of providing high data availability and consistency.
2) Config Servers
Config Servers are supposed to store the metadata of the MongoDB Sharded Cluster. This metadata consists of information about what subset of data is stored in which Shard. This information can be used to direct user queries accordingly. Each Sharded Cluster is supposed to have exactly 3 Config Servers.
3) Query Routers
Query Routers can be seen as Mongo Instances that form an interface to the client applications. The Query Routers are responsible for forwarding user queries to the right Shard.
Benefits of MongoDB Sharding
MongoDB Sharding is important because of the following reasons:
Steps to Set up MongoDB Sharding
MongoDB Sharding can be set up by implementing the following steps:
Step 1: Creating a Directory for Config Server
The first step is performed in order to set up MongoDB Sharding would be to create a separate directory for Config Server. This can be done using the following command:
Step 2: Starting MongoDB Instance in Configuration Mode
One Server has to be set up as the Configuration Server. Suppose you have a Server named “ConfServer” which would be used as the Configuration Server, the following command can be executed to perform that operation:
mongod –configdb ConfServer: 27019
Step 3: Starting Mongos Instance
The Mongos Instance can be launched after the Configuration Server has been configured by running the following command and adding the name of your Configuration Server:
mongos –configdb ConfServer: 27019
Step 4: Connecting to Mongos Instance
A connection can be formed to the Mongos Instance by running the following command from the Mongo Shell:
mongo –host ConfServer –port 27017
Step 5: Adding Servers to Clusters
All Servers that have to be included in the Cluster can be added by the following command:
“SA” here has to be replaced with the name of your Server that has to be added to the Cluster. This command can be executed for all Servers that have to be added to the Cluster.
Step 6: Enabling Sharding for Database
Sharding for the necessary database must be enabled after the Sharded Cluster has been configured. The command below can be used to achieve this:
In the above command, “db_test” has to be replaced with the name of the database that you wish to Shard.
Limitations of MongoDB Sharding
The limitations of MongoDB Sharding are as follows: