KubeRCA

AI-powered Kubernetes incident analysis and Root Cause Analysis

Turn Kubernetes alerts into actionable root-cause analysis — in seconds, not hours.

If KubeRCA looks useful, please consider starring the main repository. It helps the project reach more operators and brings in more contributors.


See It In Action

A few screens from a running KubeRCA install. Dashboards correlate alerts to incidents, and every incident gets an LLM-generated RCA summary that lands in Slack and the UI together.

  • Incident Dashboard
  • Alert Dashboard
  • Slack: AI analysis in thread
  • Incident Detail: full RCA report
  • Alert Detail: per-alert AI analysis


Why KubeRCA

KubeRCA is an open-source tool that turns Kubernetes alerts into actionable incident context, AI-assisted analysis, and operator workflows.

It is built for the gap between "an alert fired" and "we understand what happened." Instead of manually gathering evidence across Kubernetes, observability tools, chat, and dashboards, KubeRCA connects alert intake, RCA generation, Slack delivery, and incident search into one operator-facing flow.

When To Use It

KubeRCA is a strong fit for teams that already use Alertmanager, want more consistent RCA, and need searchable incident history instead of one-off alert handling.

Best Fit

  • Kubernetes environments with Alertmanager-based alerting
  • Teams using Slack threads or dashboards during incident triage
  • Workloads where recurring incidents benefit from historical reuse
  • Organizations that want LLM-assisted triage without replacing their existing stack

Not Optimized For

  • Log-only workflows without structured alerts
  • Teams expecting fully autonomous remediation
  • Use as a generic APM replacement

How It Works

flowchart TD
  AM[Alertmanager]
  SL[Slack]
  LLM[LLM Provider]
  K8S[Kubernetes API]
  PR[Prometheus]
  TP[Tempo]

  subgraph KubeRCA
    FE[Frontend]
    BE[Backend]
    AG[Agent]
    DB[(PostgreSQL + pgvector)]
  end

  AM -->|Webhook| BE
  FE <-->|REST + SSE| BE
  BE -->|Analyze / Summarize / Chat| AG
  BE -->|Thread notifications| SL
  BE <-->|Incidents / alerts / embeddings| DB
  AG -->|Cluster context| K8S
  AG -->|Metrics| PR
  AG -.->|Trace context| TP
  AG -->|Inference| LLM

Operator Flow

  1. Alertmanager sends alerts to the Backend.
  2. Backend creates or updates incidents and stores alert history.
  3. Agent collects Kubernetes and observability context, then runs RCA with an LLM provider.
  4. Results are published to Slack and streamed to the dashboard.
  5. Operators can resolve incidents, manually resolve alerts, search similar incidents, leave feedback, and use in-app chat.

Read the full runtime walkthrough in the Architecture Details.
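
Steps 1 and 2 of the operator flow can be sketched roughly as follows. This is an illustration, not KubeRCA's Backend code: the `Incident` record, the `ingest_webhook` helper, and the choice to key incidents by Alertmanager's `groupKey` are all assumptions made for the example. Only the payload field names (`groupKey`, `alerts`, `fingerprint`, `status`) come from the standard Alertmanager webhook format.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """Hypothetical incident record; KubeRCA's real schema may differ."""
    key: str
    status: str = "firing"
    alerts: dict = field(default_factory=dict)  # fingerprint -> latest alert


def ingest_webhook(payload: dict, incidents: dict) -> Incident:
    """Create or update one incident from an Alertmanager webhook payload.

    Assumption for this sketch: one incident per Alertmanager groupKey.
    """
    key = payload["groupKey"]
    incident = incidents.setdefault(key, Incident(key=key))
    for alert in payload.get("alerts", []):
        # Alertmanager sends a stable fingerprint per alert, so a repeated
        # notification overwrites the earlier entry instead of duplicating it.
        incident.alerts[alert["fingerprint"]] = alert
    # Treat the incident as resolved only once every known alert is resolved.
    resolved = bool(incident.alerts) and all(
        a["status"] == "resolved" for a in incident.alerts.values()
    )
    incident.status = "resolved" if resolved else "firing"
    return incident


def sample_payload(status: str) -> dict:
    """Minimal payload using field names from the Alertmanager webhook format."""
    return {
        "version": "4",
        "groupKey": '{}:{alertname="PodCrashLooping"}',
        "status": status,
        "alerts": [{
            "status": status,
            "fingerprint": "a1b2c3d4",
            "labels": {"alertname": "PodCrashLooping", "namespace": "demo"},
            "annotations": {"summary": "Pod is crash looping"},
            "startsAt": "2024-01-01T00:00:00Z",
        }],
    }


if __name__ == "__main__":
    store = {}
    ingest_webhook(sample_payload("firing"), store)
    incident = ingest_webhook(sample_payload("resolved"), store)
    print(incident.status, len(incident.alerts))
```

Keying on the alert fingerprint keeps the stored history idempotent when Alertmanager re-sends a group, which it does on every repeat interval.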

Key Capabilities

Detection To RCA

  • Alert-driven incident intake through Alertmanager
  • Kubernetes and observability context collection
  • Multi-provider RCA with gemini, openai, and anthropic

Operator Workflows

  • Slack thread delivery for incident and RCA updates
  • Realtime dashboard sync with SSE
  • Manual resolve, feedback, webhook settings, and context-aware chat

Search And Deployment

  • Similar incident search with PostgreSQL + pgvector
  • Local auth and Google OIDC support
  • Helm-based deployment for Kubernetes environments
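
To make the similar-incident search concrete, here is a minimal in-memory sketch of ranking stored incident embeddings against a query embedding by cosine similarity. It is an assumption that KubeRCA uses cosine distance; pgvector also offers L2 and inner-product operators, and its `<=>` operator computes cosine distance inside PostgreSQL. The function names, incident IDs, and three-dimensional vectors below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def most_similar(query, stored, top_k=3):
    """Return up to top_k (incident_id, score) pairs, most similar first."""
    scored = [
        (cosine_similarity(query, vector), incident_id)
        for incident_id, vector in stored.items()
    ]
    scored.sort(reverse=True)
    return [(incident_id, score) for score, incident_id in scored[:top_k]]


# Toy embeddings; real ones would come from an embedding model.
stored_embeddings = {
    "INC-1": [1.0, 0.0, 0.0],
    "INC-2": [0.9, 0.1, 0.0],
    "INC-3": [0.0, 1.0, 0.0],
}

if __name__ == "__main__":
    for incident_id, score in most_similar([1.0, 0.05, 0.0], stored_embeddings, top_k=2):
        print(incident_id, round(score, 4))
```

In a database-backed setup the same ranking would normally be pushed into SQL with an `ORDER BY` on the distance operator, so that only the top matches leave PostgreSQL.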

Quick Evaluation

1. Install The Stack

helm upgrade --install kube-rca oci://public.ecr.aws/r5b7j2e4/kube-rca-ecr/charts/kube-rca \
  --namespace kube-rca --create-namespace \
  -f values.yaml

2. Connect Alertmanager

Point your Alertmanager receiver at:

http://kube-rca-backend.kube-rca.svc.cluster.local:8080/webhook/alertmanager

3. Walk Through The First Incident

  • Trigger or forward an alert
  • Verify analysis arrives in the dashboard
  • Enable Slack if you want threaded incident delivery
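
If no real alert is firing, one way to exercise the first step is to post a synthetic payload straight to the webhook URL from step 2. The sketch below assumes the Backend accepts the standard Alertmanager webhook body without extra authentication, which this document does not confirm. The alert name, labels, receiver name, and `externalURL` are made up for illustration.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Webhook URL from step 2. It resolves only inside the cluster, so run this
# from a pod or point it at a port-forwarded address instead.
WEBHOOK_URL = "http://kube-rca-backend.kube-rca.svc.cluster.local:8080/webhook/alertmanager"


def build_test_request(url: str = WEBHOOK_URL) -> urllib.request.Request:
    """Build a POST request carrying one synthetic firing alert."""
    started = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Hypothetical labels; use ones that match a workload in your cluster.
    labels = {"alertname": "KubeRCATestAlert", "severity": "warning", "namespace": "default"}
    annotations = {"summary": "Synthetic alert for a first walkthrough"}
    payload = {
        "version": "4",
        "groupKey": '{}:{alertname="KubeRCATestAlert"}',
        "status": "firing",
        "receiver": "kube-rca",  # assumed receiver name
        "groupLabels": {"alertname": "KubeRCATestAlert"},
        "commonLabels": labels,
        "commonAnnotations": annotations,
        "externalURL": "http://alertmanager.example",
        "alerts": [{
            "status": "firing",
            "labels": labels,
            "annotations": annotations,
            "startsAt": started,
            "endsAt": "0001-01-01T00:00:00Z",  # zero time: not yet resolved
            "generatorURL": "http://prometheus.example/graph",
            "fingerprint": "0123456789abcdef",
        }],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_test_alert(url: str = WEBHOOK_URL, timeout: float = 10.0) -> int:
    """Send the synthetic alert and return the HTTP status code."""
    with urllib.request.urlopen(build_test_request(url), timeout=timeout) as response:
        return response.status
```

Calling `send_test_alert()` from a pod in the cluster, or with a port-forwarded URL passed as `url`, should make an incident appear in the dashboard if the assumptions above hold.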

For installation details and step-by-step setup, use the documents below.

Documentation

Community

  • GitHub Discussions — questions, ideas, and proposals
  • Issues — bug reports and feature requests (use the templates)
  • Security — private vulnerability reporting

Contributing

Issues, pull requests, and design feedback are all welcome. Before opening a PR, please read:

⭐ Liked what you saw? A star on the main repository is the simplest way to help the project grow.

License

This project is licensed under the Apache License, Version 2.0. See LICENSE and NOTICE for details.

Made for Kubernetes operators who need faster incident context and RCA
