LogBatcher: Demonstration-Free Log Parsing with LLMs

Takeaway

  • AI-powered Drain: among the best LLM-based log parsers, alongside LILAC, but unsupervised and demonstration-free
  • Clusters logs with TF-IDF + DBSCAN, which works better than embeddings: logs are structurally similar by nature, so token-level differences matter more than semantic similarity
  • Group Accuracy (GA) 0.972 and Message-Level Accuracy (MLA) 0.895 on 16 datasets, outperforming LILAC
  • Batch-based: groups similar logs, has the LLM parse representative ones, then broadcasts the resulting template to the rest of the group
  • No labeled demos needed, no hyperparameter tuning
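
To make the cluster, parse, broadcast flow above concrete, here is a minimal sketch in pure Python. It is not LogBatcher's actual code. The function names, the whitespace tokenizer, the `eps`/`min_samples`/`batch_size` values, and the choice of "first few distinct logs" as representatives are all assumptions picked so the toy example works. `stub_llm` is a hypothetical stand-in for the LLM call and only masks token positions that differ within a batch.

```python
import math
from collections import Counter


def tfidf_vectors(logs):
    """L2-normalised TF-IDF vectors over whitespace tokens (smoothed IDF)."""
    docs = [Counter(log.split()) for log in logs]
    n = len(docs)
    df = Counter(tok for doc in docs for tok in doc)
    vectors = []
    for doc in docs:
        vec = {t: tf * (math.log((1 + n) / (1 + df[t])) + 1) for t, tf in doc.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine_distance(a, b):
    # Vectors are already unit length, so the dot product is the cosine.
    return 1.0 - sum(w * b.get(t, 0.0) for t, w in a.items())


def dbscan(vectors, eps, min_samples):
    """Plain DBSCAN on cosine distance; returns one label per vector, -1 = noise."""
    labels = [None] * len(vectors)

    def neighbors(i):
        return [j for j in range(len(vectors))
                if cosine_distance(vectors[i], vectors[j]) <= eps]

    cluster = -1
    for i in range(len(vectors)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_samples:
                queue.extend(nbrs)  # j is a core point, keep expanding
    return labels


def stub_llm(batch):
    """Hypothetical stand-in for the LLM: mask token positions that vary."""
    rows = [log.split() for log in batch]
    return " ".join(col[0] if len(set(col)) == 1 else "<*>" for col in zip(*rows))


def parse_in_batches(logs, query_llm, eps=0.7, min_samples=1, batch_size=3):
    """Cluster logs, parse a small batch per cluster, broadcast the template."""
    labels = dbscan(tfidf_vectors(logs), eps, min_samples)
    groups = {}
    for idx, (log, label) in enumerate(zip(logs, labels)):
        key = label if label != -1 else ("noise", idx)  # noise logs stay alone
        groups.setdefault(key, []).append(log)
    templates = {}
    for members in groups.values():
        batch = list(dict.fromkeys(members))[:batch_size]  # distinct representatives
        template = query_llm(batch)
        for log in members:
            templates[log] = template  # broadcast to every log in the cluster
    return templates


logs = [
    "Connected to 10.0.0.1 port 22",
    "Connected to 10.0.0.2 port 22",
    "Connected to 10.0.0.3 port 8080",
    "Disk usage at 91 percent",
    "Disk usage at 47 percent",
]
templates = parse_in_batches(logs, stub_llm)
```

In this toy run the three "Connected" lines fall into one cluster and the two "Disk usage" lines into another, so the stub is queried once per cluster rather than once per log. That is the cost saving the batch-based design is after. In practice `query_llm` would wrap a real model call, and the fixed `eps` here says nothing about how LogBatcher itself avoids hyperparameter tuning.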