LogPilot: Intent-aware and Scalable Alert Diagnosis

Takeaway

Alert diagnosis, not log parsing: given a firing alert (e.g., a PromQL alert rule), automatically find the root cause from logs
Intent-aware: reads the alert definition (PromQL query) to understand what the alert cares about, then scopes which logs/requests are relevant
Builds “spatiotemporal log chains” per request (trace-like reconstruction from logs), then clusters similar chains to find patterns
Clustering keeps LLM input compact: send representative samples instead of all logs
Results: 50% better root cause summaries, 55% better exact localization vs baselines
Fast and cheap: under 1 min per alert, $0.074 per diagnosis
For LAPP, the intent-aware scoping idea is interesting: if we know what the user cares about (the alert definition), we can narrow down which logs to analyze instead of boiling the ocean
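
A rough sketch of the scope, chain, cluster, and sample flow described in the points above, to make the idea concrete. This is not LogPilot's implementation. The log record fields (ts, service, request_id, template), the sample alert, and all function names are assumptions made for illustration. Exact-match grouping stands in for whatever similarity clustering the paper actually uses, and the regex only picks up simple label="value" equality matchers rather than parsing PromQL properly.

```python
import re
from collections import defaultdict

def equality_matchers(promql):
    """Return label="value" equality matchers found in a PromQL expression.
    Regex shortcut for illustration only; it ignores !=, =~ and !~ matchers."""
    return dict(re.findall(r'(\w+)\s*=\s*"([^"]*)"', promql))

def build_chains(logs):
    """Group log records by request ID and order each group by timestamp."""
    by_request = defaultdict(list)
    for rec in logs:
        by_request[rec["request_id"]].append(rec)
    return {
        rid: tuple((r["service"], r["template"])
                   for r in sorted(recs, key=lambda r: r["ts"]))
        for rid, recs in by_request.items()
    }

def scope_chains(chains, matchers):
    """Keep only chains that touch the service named in the alert, if any."""
    service = matchers.get("service")
    if service is None:
        return chains
    return {rid: c for rid, c in chains.items()
            if any(s == service for s, _ in c)}

def cluster_chains(chains):
    """Group request IDs whose chains are identical (assumed stand-in
    for real similarity clustering)."""
    clusters = defaultdict(list)
    for rid, chain in chains.items():
        clusters[chain].append(rid)
    return clusters

def representatives(clusters):
    """One sample request per cluster, largest clusters first."""
    ordered = sorted(clusters.items(), key=lambda kv: -len(kv[1]))
    return [(chain, len(rids), rids[0]) for chain, rids in ordered]

# Hypothetical alert expression and log records.
ALERT = 'sum(rate(http_requests_total{service="checkout",status="500"}[5m])) > 10'
LOGS = [
    {"ts": 1, "service": "gateway",  "request_id": "r1", "template": "recv <*>"},
    {"ts": 2, "service": "checkout", "request_id": "r1", "template": "db timeout after <*> ms"},
    {"ts": 1, "service": "gateway",  "request_id": "r2", "template": "recv <*>"},
    {"ts": 3, "service": "checkout", "request_id": "r2", "template": "db timeout after <*> ms"},
    {"ts": 2, "service": "gateway",  "request_id": "r3", "template": "recv <*>"},
    {"ts": 4, "service": "search",   "request_id": "r3", "template": "query ok"},
]

matchers = equality_matchers(ALERT)
scoped = scope_chains(build_chains(LOGS), matchers)
summary = representatives(cluster_chains(scoped))

# Only `summary` would go to the LLM, not the raw logs.
for chain, count, sample_rid in summary:
    print(count, sample_rid, chain)
```

Here three requests collapse to a single representative chain: r3 never touches the alerted service and is scoped out, while r1 and r2 share the same chain and fall into one cluster. That is the compaction effect the notes describe, at toy scale.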