Service discovery, SLA monitoring, incident management — all driven from your Helm chart. Deploy in 1 hour, not 6 months.
Prometheus tells you what happened. ITOps tells you what to do about it.
Everything a platform team needs — integrated, not stitched together.
The K8s agent watches ConfigMaps labeled itops.io/config. Services appear automatically, with zero manual entry.
5-minute snapshots from every agent. Async aggregation with a 15-minute delay. Daily and monthly uptime % per service and group.
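As a rough illustration of how uptime % can be derived from periodic snapshots, here is a minimal Python sketch. It assumes each snapshot is a simple up/down sample and that uptime is the share of "up" samples per day; the `Snapshot` type and both function names are hypothetical and not part of the ITOps API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, datetime
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Snapshot:
    """One status sample for a service (hypothetical shape)."""
    taken_at: datetime
    is_up: bool


def uptime_percent(snapshots: Iterable[Snapshot]) -> Optional[float]:
    """Share of 'up' samples as a percentage, or None if there is no data."""
    samples = list(snapshots)
    if not samples:
        return None
    up_count = sum(1 for s in samples if s.is_up)
    return 100.0 * up_count / len(samples)


def daily_uptime(snapshots: Iterable[Snapshot]) -> Dict[date, Optional[float]]:
    """Group samples by calendar day and compute uptime % for each day."""
    by_day = defaultdict(list)
    for s in snapshots:
        by_day[s.taken_at.date()].append(s)
    return {day: uptime_percent(group) for day, group in sorted(by_day.items())}
```

With one sample every 5 minutes, a full day holds 288 samples, so a single missed sample costs roughly 0.35 percentage points of daily uptime under this counting scheme.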
Service goes DOWN → SLA incident created → INCIDENT ticket generated → dashboard updates. Under 30 seconds.
Universal webhook: any backup tool calls POST /api/v1/backup/report. Alerts if the latest backup is older than maxAgeDays.
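A backup job could report in with something like the Python sketch below. Only the POST method and the /api/v1/backup/report path come from this page; the payload field names (`service`, `finishedAt`), the example host, and both helper functions are assumptions for illustration. The second helper shows one plausible reading of the maxAgeDays check.

```python
import json
from datetime import datetime, timedelta
from urllib.request import Request


def build_backup_report(base_url: str, service: str, finished_at: datetime) -> Request:
    """Prepare (but do not send) a backup report request.

    The JSON field names are hypothetical; check the real API schema.
    """
    payload = {"service": service, "finishedAt": finished_at.isoformat()}
    return Request(
        url=base_url.rstrip("/") + "/api/v1/backup/report",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def backup_is_stale(last_backup: datetime, max_age_days: int, now: datetime) -> bool:
    """True when the latest backup is older than max_age_days (assumed rule)."""
    return now - last_backup > timedelta(days=max_age_days)
```

To actually send the report, pass the returned object to `urllib.request.urlopen`, adding whatever authentication your installation requires.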
Full ticket lifecycle with visual workflow builder. 17 step types, versioning, templates. SLA timers on every ticket.
Role-based access with field-level visibility control. LDAP, Azure AD, Okta, Google SSO. Audit trail on everything.
Three steps from zero to full ops management.
Install the ITOps agent on each K8s cluster via Helm.
helm repo add itops https://charts.mlops.hu
helm install itops-agent itops/itops-agent \
  --set node.id="org/prod/cluster1" \
  -n itops
Add an itops: block to your service's values.yaml. GitOps handles the rest.
itops:
  criticality: "critical"
  slaGroup: "payment-system"
  backup:
    expected: true
    maxAgeDays: 1
Dashboard shows real-time status. SLA measured. Incidents auto-generated. Tickets auto-created.
Agent sync (30s)
→ sla_snapshots
→ Aggregator (15min delay)
→ Uptime % calculated
→ Dashboard updated
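One way to read the 15-minute aggregator delay in the flow above is as a settle window: a snapshot is only aggregated once it is at least 15 minutes old, so late-arriving data from slower agents is not missed. The Python sketch below illustrates that interpretation; it is an assumption about the design, and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, List

# Delay taken from the flow above; treating it as a settle window is an assumption.
AGGREGATION_DELAY = timedelta(minutes=15)


def ready_for_aggregation(snapshot_times: Iterable[datetime], now: datetime) -> List[datetime]:
    """Return the snapshot timestamps old enough to be aggregated."""
    cutoff = now - AGGREGATION_DELAY
    return [t for t in snapshot_times if t <= cutoff]
```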
Multi-agent, multi-cluster, single pane of glass.
Self-hosted. Your data stays yours.
1 cluster, 5 services
5 clusters, unlimited services
Unlimited everything
Deploy in under 1 hour. No credit card required.