#14 — CPU alert on claw-gateway1 — 99.0% (threshold: 97%)

Timeline

⬡

WEBHOOK

2026-07-26 16:33 UTC

Alert received from AI Infra Monitor. Host: claw-gateway1, Severity: HIGH

◎

CONTEXT AGGREGATED

2026-07-26 16:33 UTC

Sources available: 3/3 — Runbook: ✓ | Past incidents: ✓ | Infra health: ✓

✦

Response Plan

2026-07-26 16:33 UTC

Severity

P1 – Gateway host saturated at 99% CPU with 95% memory; service latency/unavailability imminent.

SSH to claw-gateway1; run top -b -n 1 | head -20 and ps aux --sort=-%cpu | head -10 to identify top process.
Kill or throttle non-critical process; if critical, scale horizontally (add replica).
Check memory: if >95% and swapping, restart the affected service.
Monitor CPU drop for 2 min; if no improvement, prepare for reboot.
Reboot claw-gateway1 only if CPU remains >90% after 3 min.

CPU remains >85% after killing top process or reboot does not stabilize within 5 minutes.

△

STATUS CHANGE

2026-07-26 16:50 UTC

OPEN -> RESOLVED (closed after validation; public manual form disabled and host pressure was transient)

Update Status

Context Sources

RAG Runbook fetched ↗

Incident Logger fetched ↗

Details

ID #14

Severity HIGH

Source infra_monitor

Status RESOLVED

Opened 2026-07-26 16:33