How Do I Delete Orphaned Documents in MongoDB Sharded Clusters?

What Is an Orphaned Document?

In a sharded cluster, orphaned documents are documents on a shard that also exist in chunks owned by other shards. They are left behind by failed chunk migrations or by migration cleanup that was interrupted by an abnormal shutdown.

Migration Impact

During a cluster migration, DRS extracts full data from each shard. A normal document and its orphaned copy reside on different shards, so DRS migrates both. If the conflict policy of a DRS migration from MongoDB to DDS is Ignore, whichever copy reaches the destination first is stored and the later copy is skipped, which results in data inconsistency.
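The effect of the Ignore policy can be illustrated with a toy model. The sketch below is not how DRS is implemented; the function name `migrateWithIgnore` and the sample data are hypothetical. It only assumes that, under Ignore, the first document written for a given `_id` is kept and later documents with the same `_id` are skipped.

```javascript
// Toy model of an "Ignore" conflict policy (illustration only, not DRS code):
// the first document seen for an _id is stored, later duplicates are skipped.
function migrateWithIgnore(shards) {
  const destination = new Map();
  for (const shard of shards) {
    for (const doc of shard.docs) {
      if (!destination.has(doc._id)) {
        destination.set(doc._id, doc);
      }
    }
  }
  return destination;
}

// Hypothetical data: shard0 still holds a stale orphan of _id 1,
// while shard1 owns the chunk and holds the current version.
const shards = [
  { name: "shard0", docs: [{ _id: 1, qty: 5 }] }, // orphaned, stale
  { name: "shard1", docs: [{ _id: 1, qty: 9 }] }, // owned, current
];

const result = migrateWithIgnore(shards);
console.log(result.get(1)); // the stale orphan from shard0 is what gets stored
```

In this model the destination ends up with the stale value because the orphan happened to arrive first, which is why orphaned documents should be cleared before the migration starts.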

Procedure

  1. Contact technical support to obtain the cleanupOrphaned script for deleting orphaned documents, and decompress the script package.

  2. Modify the cleanupOrphaned.js script file and replace test with the name of the database whose orphaned documents are to be cleared.

  3. Run the following command to clear the orphaned documents of all collections in the specified database on the shard node:

    mongo --host ShardIP --port Primaryport --authenticationDatabase database -u username -p password cleanupOrphaned.js
    

    Note

    • ShardIP: indicates the IP address of the shard node.

    • Primaryport: indicates the service port of the primary node of the shard.

    • database: indicates the name of the authentication database, that is, the database in which the user was created.

    • username: indicates the username for logging in to the database.

    • password: indicates the password for logging in to the database.

    Note

    If you have multiple databases, repeat steps 2 and 3 to clean up orphaned documents in each database on each shard node.
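For orientation, the following is a minimal sketch of what a script of this kind might do. The script supplied by technical support may differ. The sketch assumes it is run in the mongo shell against the primary node of a shard, and that the MongoDB version returns `stoppedAtKey` from the `cleanupOrphaned` admin command so that the command can be repeated with `startingFromKey` until no range remains. The names `DB_NAME`, `cleanupNamespace`, and `cleanupDatabase` are hypothetical.

```javascript
// Sketch only: the script supplied by technical support may differ.
// Intended to run against the primary node of a shard, not against mongos.
var DB_NAME = "test"; // the placeholder that step 2 asks you to replace

// Repeats cleanupOrphaned on one namespace until no stoppedAtKey is returned.
function cleanupNamespace(adminDb, ns) {
  var nextKey = {};
  var passes = 0;
  while (nextKey != null) {
    var res = adminDb.runCommand({ cleanupOrphaned: ns, startingFromKey: nextKey });
    passes++;
    if (!res.ok) {
      // Record the failure and stop processing this namespace.
      return { ns: ns, ok: false, passes: passes, error: res.errmsg };
    }
    nextKey = res.stoppedAtKey; // absent once the last range has been processed
  }
  return { ns: ns, ok: true, passes: passes };
}

// Applies cleanupNamespace to every collection name reported by the database.
function cleanupDatabase(adminDb, targetDb, dbName) {
  return targetDb.getCollectionNames().map(function (coll) {
    return cleanupNamespace(adminDb, dbName + "." + coll);
  });
}

// Only executes inside the mongo shell, where the global db object exists.
if (typeof db !== "undefined") {
  var report = cleanupDatabase(db.getSiblingDB("admin"), db.getSiblingDB(DB_NAME), DB_NAME);
  printjson(report);
}
```

The cleanup logic is kept in functions that take the database handles as arguments, so the loop can be checked against stub objects before it is pointed at a real shard. Collections for which the command fails are reported in the result rather than aborting the whole run.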