LAPP Design Notes

Aggregated highlights from paper reviews. Will be reorganized after all papers are reviewed.

From 01-lilac

  • Control plane (LLM) / data plane (cache tree) split
  • Prefix tree cache with wildcard matching for fast log-to-template lookup
  • Template merging: if LCS similarity > 0.8, generalize differing tokens to <*>
  • Self-validation: LLM template must match its own source log or get rejected
  • Hierarchical candidate sampling + Jaccard similarity for ICL demonstration selection
  • Breakpoint resume for large datasets
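The merging rule can be sketched in a few lines. Note the similarity formula (2·LCS / (|a|+|b|)) and the position-wise merge are my assumptions, not necessarily LILAC's exact procedure:

```python
def lcs_len(a, b):
    # classic dynamic-programming longest common subsequence length
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def merge_templates(t1, t2, threshold=0.8):
    """Merge two templates if LCS similarity exceeds threshold, else None."""
    a, b = t1.split(), t2.split()
    sim = 2 * lcs_len(a, b) / (len(a) + len(b))  # assumed normalization
    if sim <= threshold:
        return None
    # position-wise merge; assumes equal token counts (a simplification)
    return " ".join(x if x == y else "<*>" for x, y in zip(a, b))
```

E.g. merging "User <*> logged in from <*>" with "User <*> logged in from console" generalizes the last token to `<*>`, while two structurally different templates fall below the threshold and stay separate.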

From 02-ibm-label-broadcasting

  • Drain clustering first, then LLM on representatives only, broadcast labels back
  • Three label types: Golden Signal (error/availability/latency/saturation/info), Fault Category (app/network/IO), NER (host/session/error code)
  • Small fine-tuned BERT (BERTOps) runs on CPU, no GPU needed
  • 3.2% edge case: template variables carry diagnostic cues lost in templatization
  • Report types:
    • Summary: rarest lines first
    • Temporal Trend: golden signal over time (when did it break?)
    • Causal Graph: Granger causality on cluster time series (how did the fault propagate?)
    • Diagnosis Report: fault-containing time windows only, searchable by entity
    • Workflow: Summary → Temporal → Causal → Diagnosis
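The cluster-then-broadcast idea in a stdlib-only sketch. The regex masking stands in for Drain, and `classify_stub` for the fine-tuned BERTOps model; both are placeholders:

```python
import re
from collections import defaultdict

def rough_template(line):
    # stand-in for Drain: mask numbers/hex ids to get a coarse template
    return re.sub(r"\b(0x[0-9a-f]+|\d+)\b", "<*>", line)

def classify_stub(template):
    # stand-in for the BERTOps classifier (hypothetical rule)
    return "error" if "fail" in template.lower() else "info"

def broadcast_labels(lines):
    clusters = defaultdict(list)
    for i, line in enumerate(lines):
        clusters[rough_template(line)].append(i)
    labels = [None] * len(lines)
    for template, idxs in clusters.items():
        label = classify_stub(template)  # one model call per cluster, not per line
        for i in idxs:
            labels[i] = label            # broadcast back to every member
    return labels
```

The point is the call count: classification cost scales with the number of clusters, not the number of log lines.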

From 03-loghub-2.0

  • Dataset
  • Future direction: a hybrid model combining semantic and global statistical approaches
  • Feasible, efficient approaches: Drain, IPLoM, LogCluster / LogSig / LFA, UniParser / LogPPT

From 04-sok-llm-log-parsing

  • Recommended standard metrics for log parsing: GA, PA, FTA, NED
  • Only two LLM parsers clearly lead: LogBatcher, LILAC
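GA as usually defined in the parsing literature: a message counts as correctly grouped only if its predicted cluster contains exactly the same set of messages as its ground-truth cluster. A sketch, assuming one cluster label per message:

```python
from collections import defaultdict

def grouping_accuracy(pred, truth):
    """GA: fraction of messages whose predicted cluster matches the
    ground-truth cluster as a set of message indices."""
    def membership(labels):
        groups = defaultdict(set)
        for i, label in enumerate(labels):
            groups[label].add(i)
        return [groups[label] for label in labels]  # one member-set per message
    correct = sum(1 for p, t in zip(membership(pred), membership(truth)) if p == t)
    return correct / len(pred)
```

PA would instead compare the parsed template string of each message against ground truth; FTA and NED need the template strings as well.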

From 05-l4-llm-training-log-diagnosis

  • Three log analysis patterns (all useful for LAPP Phase 2):
    • Cross-job: this run broke but last run was fine — diff them, new stuff is likely the cause
    • Spatial: most machines log the same thing, the odd one out is probably broken
    • Temporal: find which phase or iteration things went sideways
  • Domain-specific to LLM training, but the three patterns generalize well
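Two of the three patterns reduce to set operations over parsed templates. A sketch (the quorum threshold is an assumption, not from the paper):

```python
from collections import Counter

def cross_job_diff(good_run, bad_run):
    """Cross-job pattern: templates that appear only in the failing run
    are the prime suspects. Inputs are lists of parsed templates."""
    return sorted(set(bad_run) - set(good_run))

def spatial_odd_ones_out(logs_by_host, quorum=0.5):
    """Spatial pattern: flag hosts missing templates that most hosts emit."""
    counts = Counter(t for ts in logs_by_host.values() for t in set(ts))
    majority = {t for t, c in counts.items() if c / len(logs_by_host) > quorum}
    return {h: sorted(majority - set(ts))
            for h, ts in logs_by_host.items() if majority - set(ts)}
```

The temporal pattern would bucket the same template sets by phase/iteration and diff adjacent buckets, so it reuses the same machinery.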

From 06-wide-events-scuba

  • Scuba product UX is a future reference: pick log source, set filters, aggregate, render chart
  • Nice-to-have feature for LAPP, not core

From 07-observability-2.0

  • Nothing actionable for LAPP

From 08-drain3

  • Drain is an important algorithm for LAPP; we'll likely need to implement it from scratch

From 11-logparser-llm

  • Essentially a better, LLM-enhanced Drain
  • Prefix tree handles bulk, LLM only on new patterns (272 calls for 3.6M logs)
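The bulk/fallback split can be sketched as a template cache with an injected expensive parser. Deriving a regex from a `<*>` template is my simplification of the prefix-tree matching:

```python
import re

class TemplateCache:
    """Match new lines against known templates first; call the expensive
    parser only on a miss — the 272-calls-for-3.6M-logs pattern."""
    def __init__(self, parse_with_llm):
        self.parse_with_llm = parse_with_llm  # injected fallback, e.g. an LLM call
        self.patterns = []                    # (template, compiled regex) pairs
        self.misses = 0

    def _compile(self, template):
        # turn "req <*> done" into a regex where <*> matches one token
        body = re.escape(template).replace(re.escape("<*>"), r"\S+")
        return re.compile("^" + body + "$")

    def parse(self, line):
        for template, rx in self.patterns:
            if rx.match(line):
                return template               # cache hit: no model call
        self.misses += 1
        template = self.parse_with_llm(line)  # only for unseen patterns
        self.patterns.append((template, self._compile(template)))
        return template
```

Tracking `misses` gives the LLM call count for free, which is the number to watch when estimating cost.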

From 12-llmloganalyzer

  • Reference for future “chat with logs” feature
  • Key idea: cluster logs first to fit context window, then LLM summarizes/answers over clusters instead of raw logs

From 14-iknow-rag-chatbot

  • 5 user intent types for ops QA: symptom analysis (40.6%), multi-facet summary, terminology explanation, fact verification, operation guidance
  • 6 failure modes: incomplete query (32%), lacking knowledge (27%), out-of-scope (10%), invalid query (9%), retrieval issues, generation issues
  • Different intents need different query rewriting — important for LAPP “chat with logs” feature

From 26-logimprover and 27-sclogger

  • Future feature: if LAPP can access source code, auto-improve/inject logs to help profiling and exploration

From 29-logbatcher

  • AI-powered Drain, top parser alongside LILAC
  • TF-IDF + DBSCAN for clustering, beats embeddings — logs are structurally similar, token-level diff matters more
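A stdlib-only sketch of that pipeline (in practice you'd reach for sklearn's `TfidfVectorizer` and `DBSCAN`; the `eps`/`min_pts` values here are illustrative, and TF-IDF downweights shared tokens heavily, so `eps` needs to be fairly loose):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    # token-level TF-IDF as sparse dicts, stdlib only
    tokenized = [d.split() for d in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    return [{t: (c / len(toks)) * math.log(n / df[t]) for t, c in Counter(toks).items()}
            for toks in tokenized]

def cosine_dist(a, b):
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - (dot / (na * nb) if na and nb else 0.0)

def dbscan(vecs, eps=0.7, min_pts=2):
    # minimal DBSCAN over cosine distance; -1 marks noise
    n = len(vecs)
    labels = [-1] * n
    neighbors = [[j for j in range(n) if cosine_dist(vecs[i], vecs[j]) <= eps]
                 for i in range(n)]
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or len(neighbors[i]) < min_pts:
            continue  # already assigned, or not a core point
        stack, labels[i] = [i], cluster
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if len(neighbors[q]) >= min_pts:
                        stack.append(q)  # expand only through core points
        cluster += 1
    return labels
```

Structurally near-identical logs land in one cluster while a lone unrelated line stays noise, which matches the paper's point: token-level overlap, not embedding geometry, is what separates log templates.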