MongoDB Maintenance
Scheduled Maintenance Report for RebelMouse
Postmortem

Partial Editorial Tools Unavailability

Incident Date: Jan 14, 2025

Chronology of the incident (EST)

Jan 14, 2025 12:00 AM RebelMouse started the scheduled MongoDB Maintenance.

Jan 14, 2025 2:05 AM Indexes were created on the primary member of the MongoDB replica set.

Jan 14, 2025 3:14 AM RebelMouse received a report of an increased error rate in the Editorial Tools.

Jan 14, 2025 3:20 AM The RebelMouse DevOps team began investigating the issue.

Jan 14, 2025 3:50 AM The issue was identified as increased replication lag between the primary and one of the secondary members of the MongoDB replica set.

Jan 14, 2025 4:10 AM The lagging secondary was hidden from the replica set, which restored functionality.

Jan 14, 2025 4:15 AM The RebelMouse QA team confirmed that functionality was restored.

Jan 14, 2025 5:00 AM The excluded secondary finished syncing with the primary and was then added back to the replica set.

The impact of the incident

During the incident, users experienced an increased error rate in the Editorial Tools.

The underlying cause

The root cause of this incident was increased replication lag between the primary and one of the secondary members of the MongoDB replica set. By default, MongoDB builds indexes simultaneously across all data-bearing members of the replica set. During the maintenance window, however, a backup process was started on one of the secondary members, and this backup delayed the index build operation on that member. While the index build was still in progress, replication was temporarily paused, which contributed to the observed replication lag.
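To illustrate how the lag described above can be measured, the sketch below computes per-secondary replication lag from a document shaped like the output of MongoDB's `replSetGetStatus` command (a `members` array whose entries carry `name`, `stateStr`, and `optimeDate`). The host names, timestamps, and the 60-second threshold are hypothetical and are not taken from the incident; this is not necessarily how RebelMouse monitors its cluster.

```python
from datetime import datetime, timedelta


def replication_lag(status: dict) -> dict:
    """Return {member name: lag in seconds} for every SECONDARY member.

    `status` is assumed to follow the shape of replSetGetStatus output:
    a "members" list whose entries carry "name", "stateStr", and
    "optimeDate" (the time of the last applied oplog entry).
    """
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


def lagging_members(status: dict, threshold_seconds: float = 60.0) -> list:
    """Names of secondaries whose lag exceeds the (hypothetical) threshold."""
    return sorted(
        name
        for name, lag in replication_lag(status).items()
        if lag > threshold_seconds
    )


# Hypothetical sample data, not measurements from the incident.
now = datetime(2025, 1, 1, 12, 0, 0)
sample_status = {
    "members": [
        {"name": "db-a:27017", "stateStr": "PRIMARY", "optimeDate": now},
        {"name": "db-b:27017", "stateStr": "SECONDARY",
         "optimeDate": now - timedelta(seconds=2)},
        {"name": "db-c:27017", "stateStr": "SECONDARY",
         "optimeDate": now - timedelta(minutes=45)},
    ]
}

print(lagging_members(sample_status))
```

Against a live cluster, a dictionary of this shape could be obtained with a driver call such as PyMongo's `client.admin.command("replSetGetStatus")`, and a member flagged here would be a candidate for the kind of intervention described in the chronology.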

Actions taken & Preventive Measures

  1. The MongoDB backup schedule has been integrated into the change calendar to automatically identify and prevent conflicts with scheduled maintenance activities.
  2. Index creation will now be scheduled on weekends to minimize the risk of service unavailability during peak usage hours.
  3. We will implement additional alerts to promptly detect and address similar issues in the future.
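
The calendar check in item 1 and the weekend rule in item 2 can be sketched as a simple interval-overlap test. This is a minimal illustration under stated assumptions: the `Window` type, the function names, and all window times below are hypothetical, and the actual change-calendar tooling is not described in this report.

```python
from datetime import datetime
from typing import NamedTuple


class Window(NamedTuple):
    """A named time window, treated as the half-open interval [start, end)."""
    name: str
    start: datetime
    end: datetime


def overlaps(a: Window, b: Window) -> bool:
    """Two half-open intervals overlap iff each starts before the other ends."""
    return a.start < b.end and b.start < a.end


def find_conflicts(maintenance: list, backups: list) -> list:
    """Return (maintenance name, backup name) pairs whose windows overlap."""
    return [
        (m.name, b.name)
        for m in maintenance
        for b in backups
        if overlaps(m, b)
    ]


def is_weekend(window: Window) -> bool:
    """True if the window starts on a Saturday (5) or Sunday (6)."""
    return window.start.weekday() >= 5


# Hypothetical schedule entries for illustration only.
index_build = Window("index-build",
                     datetime(2025, 2, 1, 0, 0), datetime(2025, 2, 1, 3, 0))
backups = [
    Window("nightly-backup",
           datetime(2025, 2, 1, 1, 0), datetime(2025, 2, 1, 2, 30)),
    Window("morning-backup",
           datetime(2025, 2, 1, 4, 0), datetime(2025, 2, 1, 5, 0)),
]

print(find_conflicts([index_build], backups))
print(is_weekend(index_build))
```

Treating windows as half-open intervals means a backup that ends exactly when maintenance begins is not reported as a conflict; a real calendar check might add a safety margin around each window.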
Posted Jan 14, 2025 - 09:24 EST

Completed
The scheduled maintenance has been completed.
Posted Jan 14, 2025 - 03:00 EST
In progress
Scheduled maintenance is currently in progress. We will provide updates as necessary.
Posted Jan 14, 2025 - 00:00 EST
Scheduled
During the maintenance, we will create new indexes in our MongoDB cluster. These indexes are designed to improve query response times and contribute to overall system stability. The update process is expected to be seamless, with no downtime anticipated.
Posted Jan 09, 2025 - 14:22 EST
This scheduled maintenance affected: Mongo Cluster.