Description
This Support KB Article provides details on the following error, which can be seen when using the fully managed MongoDB Atlas Sink connector on Confluent Cloud:
Failed to write mongodb documents with duplicate key. If the collection contains unique constraint, please ensure that the field corresponding to that constraint in the kafka records are unique.
Applies To
Fully managed MongoDB Atlas Sink connector
Confluent Cloud
Cause
If you encounter the above error, there is a duplicate key violation in the target collection. MongoDB enforces a unique constraint on the _id field by default and on any field covered by a unique index, including custom unique indexes.
If you attempt to insert or update a document with a value that already exists in a uniquely indexed field, the operation fails and triggers the above error.
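As an illustration, the following minimal sketch reproduces the same class of failure using pymongo against a hypothetical orders collection with a custom unique index on order_id. The database, collection, field names, and connection string are placeholders, not values from your environment:

```python
from pymongo import MongoClient
from pymongo.errors import DuplicateKeyError

# Placeholder connection string -- replace with your Atlas URI.
client = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")
coll = client["mydb"]["orders"]

# A custom unique index, in addition to the default unique index on _id.
coll.create_index("order_id", unique=True)

coll.insert_one({"_id": 1, "order_id": "A-100"})

try:
    # The second document repeats order_id "A-100", violating the unique index.
    coll.insert_one({"_id": 2, "order_id": "A-100"})
except DuplicateKeyError as exc:
    # The server reports the collection, index, and duplicated key value.
    print(exc.details)
```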
Resolution
To resolve this issue, you need to identify the duplicate record based on the MongoDB collection definition.
Based on the Document ID Strategy configured on the connector, a unique document ID, denoted by _id, is generated for each record.
If the connector processes a topic record whose key, or whose generated _id value, already exists in the MongoDB collection, you need to identify that record and ensure the connector does not process it again.
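To see which fields the uniqueness requirement applies to, and to confirm whether a suspected key or _id value is already present, you can inspect the collection directly. The sketch below uses pymongo; the database, collection, and candidate value are hypothetical placeholders:

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")
coll = client["mydb"]["orders"]

# List all indexes; entries with "unique": True are the constraints
# that can trigger the duplicate key error.
for name, spec in coll.index_information().items():
    print(name, spec)

# Check whether a candidate _id (e.g. derived from a Kafka record key)
# already exists in the collection.
candidate_id = "A-100"  # hypothetical value
print("Already present:", coll.find_one({"_id": candidate_id}) is not None)
```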
At present, the fully managed version of this connector does not display the exact record that has the duplicate key, for example: collection: $CollectionName index: $Index dup key: { _id: "$VALUE" }
Hence, you need to identify the duplicate record causing this error manually. Once you have identified it, do not attempt to insert the same record again.
Note: One way to identify a duplicate record manually is to check the last committed offsets on the topic partitions from the output of the consumer group DESCRIBE command, then inspect the next ~500 records and compare them with the MongoDB collection data.
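A minimal sketch of that comparison is shown below, assuming Python with the confluent-kafka and pymongo clients, that the record key is a plain string that maps directly to the document _id, and that the last committed offset has already been read from the consumer group DESCRIBE output. The topic, partition, credentials, and collection names are placeholders:

```python
from confluent_kafka import Consumer, TopicPartition
from pymongo import MongoClient

TOPIC = "orders"               # placeholder topic name
PARTITION = 0                  # placeholder partition
LAST_COMMITTED_OFFSET = 12345  # taken from the consumer group DESCRIBE output

# Use a throwaway group.id so this check does not disturb the connector's offsets.
consumer = Consumer({
    "bootstrap.servers": "<bootstrap-server>",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<api-key>",
    "sasl.password": "<api-secret>",
    "group.id": "duplicate-key-check",
    "enable.auto.commit": False,
})
consumer.assign([TopicPartition(TOPIC, PARTITION, LAST_COMMITTED_OFFSET)])

coll = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")["mydb"]["orders"]

checked = 0
while checked < 500:
    msg = consumer.poll(5.0)
    if msg is None:
        break  # no more records available on this partition
    if msg.error():
        continue
    checked += 1
    # Assumption: the record key, decoded as UTF-8, is used as the document _id.
    key = msg.key().decode("utf-8") if msg.key() else None
    if key is not None and coll.find_one({"_id": key}) is not None:
        print(f"Possible duplicate at offset {msg.offset()}: key={key}")

consumer.close()
```

Because the Document ID Strategy determines how _id is derived from each record, adjust the key-to-_id mapping in the sketch to match your connector configuration.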