fix: reset push monitor retries when UP heartbeat is received#7081
fix: reset push monitor retries when UP heartbeat is received#7081viceice wants to merge 1 commit intolouislam:masterfrom
Conversation
|
Hello and thanks for lending a paw to Uptime Kuma! 🐻👋 |
There was a problem hiding this comment.
Pull request overview
This PR aims to ensure push monitors don’t keep accumulating retry counts after a successful heartbeat, by persisting a retry reset when a valid heartbeat is detected during the push monitor’s scheduled check loop.
Changes:
- Persist retry reset for push monitors by writing
retries = 0back to the last successful heartbeat record when the heartbeat is within the allowed time window.
| // Update the previous heartbeat's retries to 0 to persist the reset | ||
| if (previousBeat.retries > 0) { | ||
| previousBeat.retries = 0; | ||
| await R.store(previousBeat); |
There was a problem hiding this comment.
This reset persists by updating previousBeat, but in this code path previousBeat is selected via ORDER BY time DESC (see earlier in start()), while heartbeat.time is defined as a plain DATETIME in the schema (no millisecond precision). On MariaDB/MySQL this can result in multiple heartbeats sharing the same time value within the same second, making ORDER BY time DESC non-deterministic and potentially causing the wrong row to be updated (so the retry counter may still not reliably reset). Consider selecting/updating the latest heartbeat deterministically (e.g., ORDER BY time DESC, id DESC or using the same MAX(id) approach as Monitor.getPreviousHeartbeat()) before persisting the retries reset.
| // Update the previous heartbeat's retries to 0 to persist the reset | |
| if (previousBeat.retries > 0) { | |
| previousBeat.retries = 0; | |
| await R.store(previousBeat); | |
| // Update the latest heartbeat's retries to 0 to persist the reset | |
| if (previousBeat.retries > 0) { | |
| const latestBeat = await R.findOne( | |
| "heartbeat", | |
| " monitor_id = ? ORDER BY time DESC, id DESC ", | |
| [ this.id ] | |
| ); | |
| if (latestBeat && latestBeat.retries > 0) { | |
| latestBeat.retries = 0; | |
| await R.store(latestBeat); | |
| } |
mahdirajaee
left a comment
There was a problem hiding this comment.
This addresses a real bug -- if a push monitor accumulates retries and then receives a healthy heartbeat, the retry counter should reset. Without this fix, the stale retry count persists in the database and can cause incorrect behavior on subsequent timeout evaluations.
The implementation looks correct. A couple of observations:
-
The conditional
if (previousBeat.retries > 0)is a good guard to avoid unnecessary writes to the database on every successful heartbeat. Only persisting when there is actually something to reset keeps the I/O minimal. -
One thing to verify: is there a potential race condition if two heartbeats arrive in rapid succession? The
previousBeatreference is loaded earlier in the flow, so if a concurrent heartbeat modifies it between the read and thisR.store(), you could get a stale write. In practice this is likely fine for push monitors since they are externally driven and unlikely to overlap, but worth thinking about. -
The in-memory
retries = 0reset was already present before this PR. The key addition here is persisting that reset to the database viapreviousBeat.retries = 0+R.store(), which is the actual fix for the issue.
Solid fix. Would be good to have a test case that verifies the retry count resets in the database after a successful push heartbeat, as the author mentioned in the PR description.
Summary
In this pull request, the following changes are made:
UP( orDOWNif upside down) and persists it.Please follow this checklist to avoid unnecessary back and forth (click to expand)
I understand that I am responsible for and able to explain every line of code I submit.
Screenshots for Visual Changes
not applicable