Incident Date: Jan 14, 2025
Jan 14, 2025 12:00 AM RebelMouse started the scheduled MongoDB maintenance.
Jan 14, 2025 2:05 AM Indexes were created on the primary member of the MongoDB replica set.
Jan 14, 2025 3:14 AM RebelMouse received a report of an increased error rate in the Editorial Tools.
Jan 14, 2025 3:20 AM RebelMouse DevOps team started to investigate the issue.
Jan 14, 2025 3:50 AM The issue was identified as increased replication lag between the primary and one of the secondary members of the MongoDB replica set.
Jan 14, 2025 4:10 AM The lagging secondary was hidden from the replica set. This action restored functionality.
Jan 14, 2025 4:15 AM RebelMouse QA team confirmed that functionality was restored.
Jan 14, 2025 5:00 AM The excluded replica finished syncing with the primary and was then added back to the replica set.
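For reference, the mitigation at 4:10 AM can be sketched as a replica set reconfiguration. The report does not record the exact commands that were run, so the following is only a minimal illustration of the usual approach: mark the member as hidden with priority 0 and bump the configuration version. The function name `hide_member` and the host names are hypothetical.

```python
import copy


def hide_member(config: dict, host: str) -> dict:
    """Return a copy of a replica set config with `host` marked as hidden.

    `config` is assumed to have the shape of the "config" field returned
    by MongoDB's replSetGetConfig command.
    """
    new_config = copy.deepcopy(config)
    for member in new_config["members"]:
        if member["host"] == host:
            member["hidden"] = True
            member["priority"] = 0  # MongoDB requires priority 0 for hidden members
            break
    else:
        raise ValueError(f"no member with host {host!r} in replica set config")
    new_config["version"] += 1  # a reconfig must carry a higher version number
    return new_config


# Hypothetical config for illustration only; these are not the real hosts.
example_config = {
    "_id": "rs0",
    "version": 7,
    "members": [
        {"_id": 0, "host": "db-a.example:27017", "priority": 1},
        {"_id": 1, "host": "db-b.example:27017", "priority": 1},
        {"_id": 2, "host": "db-c.example:27017", "priority": 1},
    ],
}
hidden_config = hide_member(example_config, "db-c.example:27017")

# With pymongo and a connection to the primary, the new config could then be
# applied roughly like this (not executed here):
#   config = client.admin.command("replSetGetConfig")["config"]
#   client.admin.command("replSetReconfig", hide_member(config, host))
```

Hidden members keep replicating but do not receive reads from clients, which is consistent with the lagging secondary no longer affecting the Editorial Tools once it was hidden.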
During the incident, users experienced an increased error rate in the Editorial Tools.
The root cause of this incident was increased replication lag between the primary and one of the secondary members of the MongoDB replica set. By default, MongoDB builds indexes simultaneously across all data-bearing members of the replica set. However, during the maintenance window, a backup process was started on one of the secondary members, and this backup delayed the index build on that member. While the index build was in progress, replication on that secondary was temporarily paused, which contributed to the observed replica lag.
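To illustrate how this kind of lag can be detected, the sketch below computes how far each secondary is behind the primary. It takes a document shaped like the output of MongoDB's `replSetGetStatus` command, using its `members`, `name`, `stateStr`, and `optimeDate` fields. The function names, the 60-second threshold, and the sample hosts and timestamps are assumptions made for illustration; they are not values taken from the incident.

```python
from datetime import datetime, timedelta


def replica_lag(status: dict) -> dict:
    """Map each secondary's name to its lag behind the primary.

    Lag is the difference between the primary's last applied operation
    time (optimeDate) and the secondary's.
    """
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: primary["optimeDate"] - m["optimeDate"]
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


def lagging_members(status: dict, threshold: timedelta = timedelta(seconds=60)) -> list:
    """Return the names of secondaries whose lag exceeds `threshold`."""
    return [name for name, lag in replica_lag(status).items() if lag > threshold]


# Hypothetical status document for illustration only.
sample_status = {
    "set": "rs0",
    "members": [
        {"name": "db-a.example:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2025, 1, 14, 3, 50, 0)},
        {"name": "db-b.example:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2025, 1, 14, 3, 49, 58)},
        {"name": "db-c.example:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2025, 1, 14, 3, 20, 0)},
    ],
}
slow = lagging_members(sample_status)
```

In a live deployment the status document could be fetched with pymongo's `client.admin.command("replSetGetStatus")`. A check like this, run during maintenance windows, is one possible way to surface a lagging secondary before it affects users.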