SLA Plugin

A core ITOps plugin that turns raw service-status events into SLA telemetry: per-group uptime aggregation, error-budget tracking, daily JSON+PDF reports, and a dashboard widget on demo.mlops.hu. Activated by license; runs in-process inside itops-core.

What it is

The SLA Plugin is a paid feature that lives inside the ITOps core process. Whenever a monitored service flips to DEGRADED or OUTAGE, the plugin records an SLA incident, rolls up 5-minute uptime snapshots, computes the remaining error budget for each SLA group, and once a day generates a JSON + PDF report. None of this happens without the plugin — the underlying service_status_events table is still populated by the agent, but no aggregation runs and the SLA dashboard is hidden.
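The uptime and error-budget arithmetic described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the Snapshot shape, the function names, and the rule that an empty period counts as 100% uptime are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One 5-minute uptime snapshot for an SLA group (hypothetical shape)."""
    up: bool  # True if the group was healthy for the whole window


def uptime_percent(snapshots: list[Snapshot]) -> float:
    """Share of snapshots in which the group was up, as a percentage."""
    if not snapshots:
        return 100.0  # assumption: no data counts as no downtime
    up = sum(1 for s in snapshots if s.up)
    return 100.0 * up / len(snapshots)


def remaining_error_budget_minutes(
    snapshots: list[Snapshot],
    target_percent: float,
    period_minutes: int,
    window_minutes: int = 5,
) -> float:
    """Allowed downtime for the period minus downtime observed so far."""
    allowed = period_minutes * (100.0 - target_percent) / 100.0
    used = window_minutes * sum(1 for s in snapshots if not s.up)
    return allowed - used
```

For a 99.9% target over a 30-day period (43,200 minutes) the allowed downtime is about 43.2 minutes, so three failed 5-minute snapshots would leave roughly 28.2 minutes of budget. A negative result means the budget is exhausted.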

What you get vs the free tier

Architecture

agent (status push) -> service_status_events (core)
                          |
                          v
   SLA Plugin (paid) -- aggregates into sla_incidents, computes uptime
                          |
                          v
                   GraphQL: slaDashboardStats, slaPeriodResults, slaTrendData
                          |
                          v
                   Daily report job: generates JSON + PDF
                          |
                          v (optional)
                   Push to SLA Portal (sla.mlops.hu) -- separate service

The plugin runs as an in-process module of itops-core (no separate deployment). The dashboard widget and GraphQL queries are exposed by the same API server; the daily report job runs on a cron-like ticker inside the plugin's lifecycle.

Plugin gate

The plugin is activated when (a) the license JWT lists sla in its plugin scopes and (b) ITOPS_PLUGINS_SLA_ENABLED is not set to false. If either condition fails, the SLA queries return plugin_disabled, the dashboard widget is hidden, and the daily report job does not run. Note: service_status_events are still recorded by the core regardless — only the SLA-specific aggregation and reporting are gated.
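The two-part gate can be expressed as a small predicate. The sketch below assumes the license JWT has already been verified and decoded into a mapping, and that the plugin scopes live under a "plugins" claim; that claim name, the function name, and the case-insensitive comparison of the flag are illustrative assumptions, not documented behaviour.

```python
import os
from typing import Mapping, Optional


def sla_plugin_active(
    license_claims: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Return True when both gate conditions from the text hold."""
    env = os.environ if env is None else env
    # Assumption: plugin scopes are stored in a "plugins" claim of the
    # decoded license JWT; the real claim name is not given here.
    scopes = license_claims.get("plugins", [])
    licensed = isinstance(scopes, (list, tuple, set)) and "sla" in scopes
    # Condition (b): only an explicit "false" disables the plugin, so an
    # unset variable leaves it enabled. Case-insensitivity is an assumption.
    flag = env.get("ITOPS_PLUGINS_SLA_ENABLED", "").strip().lower()
    return licensed and flag != "false"
```

Passing the environment in as a parameter keeps the check easy to test; with no argument the function falls back to the process environment.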

Plugin independence

The Ticketing Plugin does not depend on the SLA Plugin. Both consume the same upstream signal — the core service_status_events table — but they aggregate it independently. You can enable Ticketing without SLA, SLA without Ticketing, both, or neither. Auto-tickets fired on DOWN are a Ticketing-Plugin feature; SLA incident records are an SLA-Plugin feature; they share the source events, not each other.

GraphQL queries the plugin exposes

REST endpoints

Helm config (excerpt)

plugins:
  sla:
    enabled: true        # turn the plugin on (also requires license scope)

# Per-service ConfigMap declares which SLA group it belongs to:
data:
  it-ops.yaml: |
    path: payment/api
    slaGroup: "payment-system"

Services without a slaGroup are tracked at the service level only and are not rolled up into any group target.
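To illustrate the rollup rule, the sketch below splits a set of services into SLA groups and service-level-only entries. The input shape (a mapping from service path to its slaGroup, or None when unset) and the function name are assumptions for the example; the plugin's internal data model is not described here.

```python
from collections import defaultdict
from typing import Optional


def group_services(
    services: dict[str, Optional[str]],
) -> tuple[dict[str, list[str]], list[str]]:
    """Split services into SLA groups and service-level-only entries.

    `services` maps a service path to its slaGroup (None when unset).
    """
    groups: dict[str, list[str]] = defaultdict(list)
    ungrouped: list[str] = []
    for path, sla_group in sorted(services.items()):
        if sla_group:
            groups[sla_group].append(path)
        else:
            ungrouped.append(path)  # tracked per service, no group rollup
    return dict(groups), ungrouped
```

With payment/api and a second, hypothetical payment/db both declaring slaGroup "payment-system", the two paths land in one group, while a service with no slaGroup appears only in the ungrouped list.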

Daily report

Once per day (default 00:05 UTC) the plugin emits two artefacts to /var/log/itops/: a JSON report and a PDF report.

A webhook event sla.daily_report is fired on completion; subscribe from n8n, the SLA Portal, or any other consumer. The report job is the only piece of the plugin that writes to disk; everything else is in-memory + database.
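The daily schedule amounts to finding the next 00:05 UTC after the current time. The following is a minimal sketch of that calculation, assuming a timezone-aware input; the function name is hypothetical and the plugin's real ticker may work differently.

```python
from datetime import datetime, time, timedelta, timezone


def next_report_run(now: datetime, at: time = time(0, 5)) -> datetime:
    """Return the next occurrence of `at` (UTC) strictly after `now`.

    `now` should be timezone-aware; a naive value would be interpreted
    as local time by astimezone().
    """
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), at, tzinfo=timezone.utc)
    if candidate <= now_utc:
        candidate += timedelta(days=1)  # today's slot has already passed
    return candidate
```

A ticker could sleep until the returned timestamp, run the report job, and then call the function again for the following day.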

The SLA Plugin pushes its daily report into the SLA Portal, a separate, public-facing status page microservice (itops/sla-portal chart) that customers reach at e.g. status.yourcompany.com. The Portal is part of the same paid SLA offering — on its own it has nothing to display, since uptime numbers, incident lists and maintenance windows all originate in the Plugin. Wiring is via ITOPS_SLA_PORTAL_URL + API key.

The Plugin and the Portal are deployed separately, on different domains. The Plugin runs inside itops-core and powers the SLA dashboard on demo.mlops.hu; the Portal runs on sla.mlops.hu and serves a public status page that customers can reach without a login. The Plugin can be used on its own (internal dashboard only), whereas the Portal is only useful with the Plugin feeding it.

Read the full Portal documentation: SLA Portal →
