SLA Plugin
PAID PLUGIN: SLA
A core ITOps plugin that turns raw service-status events into SLA telemetry: per-group uptime aggregation, error-budget tracking, daily JSON + PDF reports, and a dashboard widget on demo.mlops.hu. Activated by license; runs in-process inside itops-core.
What it is
The SLA Plugin is a paid feature that lives inside the ITOps core process. Whenever a monitored service flips to DEGRADED or OUTAGE, the plugin records an SLA incident, rolls up 5-minute uptime snapshots, computes the remaining error budget for each SLA group, and once a day generates a JSON + PDF report. None of this happens without the plugin — the underlying service_status_events table is still populated by the agent, but no aggregation runs and the SLA dashboard is hidden.
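To make the error-budget idea concrete, here is a minimal sketch using the common SRE definition (the budget is the downtime a target allows over a period). The helper name, the 30-day period, and the 99.9% target are illustrative assumptions; the plugin's actual formula is not documented on this page.

```python
from dataclasses import dataclass

SNAPSHOT_MINUTES = 5  # snapshot interval named in the text above


@dataclass
class BudgetStatus:
    allowed_down_minutes: float
    used_down_minutes: float
    remaining_minutes: float


def error_budget(target_pct: float, period_days: int, down_snapshots: int) -> BudgetStatus:
    """Hypothetical helper: remaining error budget for one SLA group."""
    period_minutes = period_days * 24 * 60
    # Downtime the target tolerates over the whole period.
    allowed = period_minutes * (1 - target_pct / 100)
    # Assumption: every "down" snapshot counts as a full 5 minutes of downtime.
    used = down_snapshots * SNAPSHOT_MINUTES
    return BudgetStatus(allowed, used, allowed - used)


status = error_budget(target_pct=99.9, period_days=30, down_snapshots=3)
print(round(status.allowed_down_minutes, 1), status.used_down_minutes,
      round(status.remaining_minutes, 1))
```

Under these assumptions a 99.9% target over 30 days tolerates roughly 43 minutes of downtime, so three bad snapshots consume about a third of the budget.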
What you get vs the free tier
- Per-service-group uptime tracking: only with the SLA Plugin.
- Error budget burn rate: only with the SLA Plugin.
- Daily SLA report (JSON + PDF): only with the SLA Plugin.
- SLA dashboard widget on demo.mlops.hu: only with the SLA Plugin.
- Free tier: service-status push API + agent monitoring (status events only, NO SLA aggregation, NO report job, NO dashboard widget).
Architecture
agent (status push) -> service_status_events (core)
|
v
SLA Plugin (paid) -- aggregates into sla_incidents, computes uptime
|
v
GraphQL: slaDashboardStats, slaPeriodResults, slaTrendData
|
v
Daily report job: generates JSON + PDF
|
v (optional)
Push to SLA Portal (sla.mlops.hu) -- separate service
The plugin runs as an in-process module of itops-core (no separate deployment). The dashboard widget and GraphQL queries are exposed by the same API server; the daily report job runs on a cron-like ticker inside the plugin's lifecycle.
Plugin gate
The plugin is activated when (a) the license JWT lists sla in its plugin scopes and (b) ITOPS_PLUGINS_SLA_ENABLED is not set to false. If either condition fails, the SLA queries return plugin_disabled, the dashboard widget is hidden, and the daily report job does not run. Note: service_status_events are still recorded by the core regardless — only the SLA-specific aggregation and reporting are gated.
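The two gate conditions can be read as a single predicate. The sketch below is illustrative only: the function name is hypothetical, the scopes are assumed to have already been extracted from the verified license JWT, and case-insensitive matching of "false" is an assumption.

```python
from typing import Iterable, Mapping


def sla_plugin_active(license_scopes: Iterable[str], env: Mapping[str, str]) -> bool:
    """Hypothetical predicate mirroring the two gate conditions."""
    # Condition (a): the license lists the "sla" scope.
    has_scope = "sla" in set(license_scopes)
    # Condition (b): active unless the variable is explicitly "false".
    # An unset variable therefore leaves the plugin enabled.
    flag = env.get("ITOPS_PLUGINS_SLA_ENABLED", "").strip().lower()
    return has_scope and flag != "false"


print(sla_plugin_active(["sla", "ticketing"], {}))
print(sla_plugin_active(["sla"], {"ITOPS_PLUGINS_SLA_ENABLED": "false"}))
print(sla_plugin_active(["ticketing"], {}))
```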
Plugin independence
The Ticketing Plugin does not depend on the SLA Plugin. Both consume the same upstream signal — the core service_status_events table — but they aggregate it independently. You can enable Ticketing without SLA, SLA without Ticketing, both, or neither. Auto-tickets fired on DOWN are a Ticketing-Plugin feature; SLA incident records are an SLA-Plugin feature; they share the source events, not each other.
GraphQL queries the plugin exposes
- slaDefinitions, slaTargets, slaGroups: declared targets and group metadata.
- slaPeriodResults, slaTrendData, slaSnapshotTrend: aggregated uptime over time windows.
- slaDashboardStats, slaIncidents, slaAlerts: live dashboard data.
- slaExclusionWindows: planned downtime, excluded from uptime math.
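As a rough sketch of calling one of these queries, the snippet below builds (but does not send) a GraphQL POST request. The host, the /graphql path, and bearer-token auth are placeholders that this page does not specify. Because the field names of each query are not listed here, the example selects only the GraphQL meta-field __typename, which assumes the query returns an object type and takes no required arguments; see the API reference for the real schema.

```python
import json
import urllib.request

# Assumptions: the base URL, the /graphql path, and bearer-token auth are
# placeholders; none of them is specified on this page.
GRAPHQL_URL = "https://itops.example.com/graphql"


def build_sla_query(query_name: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for one SLA query."""
    # __typename is a GraphQL meta-field, used here because the real
    # field names are not listed on this page.
    body = {"query": f"query {{ {query_name} {{ __typename }} }}"}
    return urllib.request.Request(
        GRAPHQL_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


req = build_sla_query("slaDashboardStats", token="REPLACE_ME")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it; when the plugin gate is closed, expect the plugin_disabled response described above rather than data.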
REST endpoints
- POST /api/v1/health/report: external services push status (also used by the free tier).
- POST /api/v1/exclusion-window/start, POST /api/v1/exclusion-window/stop: mark a planned-downtime window so it does not eat into the error budget.
- POST /api/v1/sla/report/generate: manually trigger a daily report run (otherwise scheduled).
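One way to make sure a planned-downtime window is always closed is to wrap the start and stop calls in a context manager. This is a sketch under stated assumptions: the host is a placeholder, the "slaGroup" body field is a guess (the request schema is not given on this page), and the send callable is injected so the example runs without a network.

```python
import json
import urllib.request
from contextlib import contextmanager
from typing import Callable, Iterator

BASE_URL = "https://itops.example.com"  # placeholder host


def _post(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the given endpoint path."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


@contextmanager
def exclusion_window(
    sla_group: str, send: Callable[[urllib.request.Request], object]
) -> Iterator[None]:
    """Open a planned-downtime window and always close it afterwards."""
    # The "slaGroup" body field is a guess; check the API reference.
    payload = {"slaGroup": sla_group}
    send(_post("/api/v1/exclusion-window/start", payload))
    try:
        yield
    finally:
        # Runs even if the maintenance task raises.
        send(_post("/api/v1/exclusion-window/stop", payload))


sent = []  # collects requests instead of sending them
with exclusion_window("payment-system", send=sent.append):
    pass  # run the maintenance task here
print([r.full_url for r in sent])
```

In real use, send could be urllib.request.urlopen (plus whatever authentication the deployment requires).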
Helm config (excerpt)
plugins:
  sla:
    enabled: true # turn the plugin on (also requires license scope)
# Per-service ConfigMap declares which SLA group it belongs to:
data:
  it-ops.yaml: |
    path: payment/api
    slaGroup: "payment-system"
Services without a slaGroup are tracked at the service level only and are not rolled up into any group target.
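The roll-up rule can be illustrated with a small sketch. The input data, the function name, and the use of a plain mean are all assumptions made for illustration; the plugin may weight services differently or use another aggregate.

```python
from collections import defaultdict
from typing import Optional

# Hypothetical input: service path -> (slaGroup or None, uptime percentage)
services: dict[str, tuple[Optional[str], float]] = {
    "payment/api": ("payment-system", 99.95),
    "payment/worker": ("payment-system", 99.80),
    "internal/wiki": (None, 98.00),  # no slaGroup: service-level only
}


def group_uptime(svcs: dict[str, tuple[Optional[str], float]]) -> dict[str, float]:
    """Average uptime per group; ungrouped services are skipped."""
    buckets: dict[str, list[float]] = defaultdict(list)
    for _path, (group, uptime) in svcs.items():
        if group is not None:
            buckets[group].append(uptime)
    # Plain mean is an assumption; the plugin may weight or take the minimum.
    return {g: sum(v) / len(v) for g, v in buckets.items()}


rollup = group_uptime(services)
print(rollup)
```

Here internal/wiki keeps its own service-level figure but does not appear in any group result.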
Daily report
Once per day (default 00:05 UTC) the plugin emits two artefacts to /var/log/itops/:
- sla-report-YYYY-MM-DD.json: machine-readable summary (per-group uptime, incident list, error-budget remaining).
- sla-report-YYYY-MM-DD.pdf: printable customer-facing variant.
A webhook event sla.daily_report is fired on completion; subscribe from n8n, the SLA Portal, or any other consumer. The report job is the only piece of the plugin that writes to disk; everything else is in-memory + database.
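A consumer that reacts to sla.daily_report may want to locate the two artefacts for a given day. The sketch below derives their paths from the naming pattern above; the helper name is hypothetical, and whether the date in the file name is the generation day or the day being reported on is not stated on this page.

```python
from datetime import date
from pathlib import PurePosixPath

REPORT_DIR = PurePosixPath("/var/log/itops")  # directory named in the text above


def report_paths(day: date) -> tuple[PurePosixPath, PurePosixPath]:
    """Return the (JSON, PDF) report paths for one day."""
    # date.isoformat() yields YYYY-MM-DD, matching the file name pattern.
    stem = f"sla-report-{day.isoformat()}"
    return REPORT_DIR / f"{stem}.json", REPORT_DIR / f"{stem}.pdf"


json_path, pdf_path = report_paths(date(2025, 1, 31))
print(json_path)
print(pdf_path)
```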
Related: SLA Portal
The SLA Plugin pushes its daily report into the SLA Portal, a separate, public-facing status page microservice (itops/sla-portal chart) that customers reach at e.g. status.yourcompany.com. The Portal is part of the same paid SLA offering — on its own it has nothing to display, since uptime numbers, incident lists and maintenance windows all originate in the Plugin. Wiring is via ITOPS_SLA_PORTAL_URL + API key.
The Plugin and the Portal are deployed separately on different domains: the Plugin runs inside itops-core and powers the SLA dashboard on demo.mlops.hu; the Portal is a standalone microservice on sla.mlops.hu serving a public status page that customers can reach without a login. The Plugin can be used on its own (just the internal dashboard), but the Portal is only useful with the Plugin feeding it.
Read the full Portal documentation: SLA Portal →
See also
- SLA Portal — the standalone public status page service.
- Ticketing Plugin Overview — the other paid plugin (independent of SLA).
- Ticketing GitOps Setup — declarative ticket workflow + catalog config.
- API reference — full GraphQL schema.