# ITOps — GitOps Schema Walkthrough for AI Assistants

## One mental model

Each service owns a ConfigMap. The agent discovers ConfigMaps labelled
`itops.io/config: "true"` and reads the `it-ops.yaml` data key. You write YAML; Git is
the source of truth, and the ITOps UI is a read-only reflection of it.

## Minimum viable

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-itops
  labels:
    itops.io/config: "true"
data:
  it-ops.yaml: |
    path: "myorg/myplatform/prod/cluster1/my-app"
```

One line of `it-ops.yaml` is enough. The 5-level `path` is the service's global identifier.

## The complete field reference

Every field below except `path` is optional. Add a line per feature you want.

### Top-level identity + presentation

| Field | Type | Default | Effect |
|---|---|---|---|
| `path` | string | **required** | `org/platform/env/cluster/service` |
| `displayName` | string | titlecased last segment | UI label |
| `description` | string | chart description | service detail body |
| `type` | string | inferred from tags or "service" | icon + meta row |
| `tags` | string[] | `[]` | filter + storage/backup tab routing |
| `team` | string | "" | Team column + ownership block |
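
As a rough illustration of the table above, an `it-ops.yaml` body that sets every identity field might look like the sketch below. All values are hypothetical; `type: service` is the documented fallback, and the tag values are placeholders, since which tags trigger storage/backup routing is not specified here.

```yaml
# Sketch only: every value below is illustrative, not taken from a real deployment.
path: "myorg/myplatform/prod/cluster1/my-app"   # org/platform/env/cluster/service
displayName: "My App"                           # omit to fall back to the titlecased last segment
description: "Order intake API"                 # omit to fall back to the chart description
type: service                                   # omit to infer from tags, else "service"
tags: ["backend", "api"]                        # hypothetical tag values
team: "backend"                                 # hypothetical team name
```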

### SLA

| Field | Type | Default | Effect |
|---|---|---|---|
| `slaGroup` | string | "" | cross-cluster group membership |
| `criticality` | enum | "medium" (else group tier) | default SLA tier |
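
Since `slaGroup` is described as cross-cluster membership, one plausible use is two instances of a service on different clusters sharing a group name. The sketch below shows two separate `it-ops.yaml` bodies as two YAML documents; the group name `checkout` and the `cluster2` path are assumptions, and `medium` is the only `criticality` value this document names.

```yaml
# it-ops.yaml for the instance on cluster1 (values hypothetical)
path: "myorg/myplatform/prod/cluster1/my-app"
slaGroup: "checkout"
criticality: medium
---
# it-ops.yaml for the instance on cluster2, joining the same group
path: "myorg/myplatform/prod/cluster2/my-app"
slaGroup: "checkout"
```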

### Workload (K8s lookup)

```yaml
workload:
  type: deployment          # deployment | statefulset | daemonset
  name: my-release-my-app   # default: last path segment
```

### Backup monitoring

```yaml
backup:
  expected: true
  maxAgeDays: 1
  storageSize: "100Gi"
  schedule: "0 2 * * *"
  retention: "30d"
```

### HTTP health probe

```yaml
health:
  enabled: true
  port: 8080
  path: "/healthz"
  interval: "30s"
  timeout: "5s"
  # endpoint: "http://absolute-url:9000/path"  # override
```

The agent probes the target from inside the cluster; a 2xx response overrides the K8s-native status with OPERATIONAL.
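
A variant using the `endpoint` override could look like the sketch below. It assumes that an absolute `endpoint` replaces the `port` + `path` target, which the commented line above suggests but does not state outright; the URL is a hypothetical in-cluster address.

```yaml
health:
  enabled: true
  # Assumption: when set, endpoint is probed instead of port + path.
  endpoint: "http://my-app-admin.shop.svc.cluster.local:9000/status"  # hypothetical URL
  interval: "30s"
  timeout: "5s"
```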

### Dependencies (path-based refs)

```yaml
dependencies:
  requires:
    - path: "myorg/myplatform/prod/cluster1/postgresql"
      type: storage
      critical: true
  usedBy:
    - path: "myorg/myplatform/prod/cluster1/web-frontend"
      type: frontend
```

Each reference is ONE 5-level path — no separate `name` + `nodePath` pair.

### Links (buttons on service detail)

```yaml
links:
  runbook:       "https://..."
  dashboard:     "https://..."
  documentation: "https://..."
  repository:    "https://..."
  logs:          "https://..."
  alerts:        "https://..."
  api:           "https://..."
  custom:        "https://..."
```

### Contacts (merged into ownership)

```yaml
contacts:
  owner:      "alice@example.com"
  slack:      "#backend-oncall"
  contact:    "+36-30-1234567"
  escalation: "cto@example.com"
```

### Relations (free-form parent/children)

```yaml
relations:
  parent: "retail-stack-umbrella"
  children:
    - sub-component-a
    - sub-component-b
```

### Monitoring hints (informational)

```yaml
monitoring:
  enabled: true
  prometheusJob: "my-app"
  alerts:
    - name: HighErrorRate
      severity: warning
      expression: 'rate(http_errors[5m]) > 0.01'
```

The platform doesn't evaluate these rules; it only surfaces them in the service detail.

### Custom (free-form key/value escape hatch)

```yaml
custom:
  costCenter: "CC-42"
  dataClass:  "pii"
  complianceTier: "gdpr-restricted"
```

Rendered as a key/value table on the service detail. Zero validation — use for
organization-specific annotations without schema changes.
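
Putting several of the sections above together, a fuller ConfigMap might look like the following sketch. It only combines fields documented in this reference; the concrete values are placeholders, and the runbook URL in particular is hypothetical.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-itops
  labels:
    itops.io/config: "true"
data:
  it-ops.yaml: |
    path: "myorg/myplatform/prod/cluster1/my-app"
    team: "backend"                      # hypothetical team name
    workload:
      type: deployment
      name: my-release-my-app
    backup:
      expected: true
      maxAgeDays: 1
    health:
      enabled: true
      port: 8080
      path: "/healthz"
    dependencies:
      requires:
        - path: "myorg/myplatform/prod/cluster1/postgresql"
          type: storage
          critical: true
    links:
      runbook: "https://runbooks.example.com/my-app"   # hypothetical URL
    contacts:
      owner: "alice@example.com"
      slack: "#backend-oncall"
    custom:
      costCenter: "CC-42"
```

Starting from the one-line minimum and adding sections one at a time keeps each Git diff small and easy to review.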

## Helm template pattern

```yaml
# helmcharts/my-app/templates/itops-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "my-app.fullname" . }}-itops
  labels:
    itops.io/config: "true"
data:
  it-ops.yaml: |
    path: {{ required "itops.path is required" .Values.itops.path | quote }}
    {{- with .Values.itops.slaGroup }}
    slaGroup: {{ . | quote }}
    {{- end }}
    {{- with .Values.itops.criticality }}
    criticality: {{ . | quote }}
    {{- end }}
    {{- with .Values.itops.backup }}
    backup:
      expected: {{ .expected | default false }}
      maxAgeDays: {{ .maxAgeDays | default 1 }}
    {{- end }}
    {{- with .Values.itops.tags }}
    tags: {{ . | toJson }}
    {{- end }}
```
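
A `values.yaml` that feeds this template could be shaped as in the sketch below. The key names mirror what the template reads under `.Values.itops`; the values themselves (group name, tags) are hypothetical.

```yaml
# helmcharts/my-app/values.yaml (sketch; values are illustrative)
itops:
  path: "myorg/myplatform/prod/cluster1/my-app"
  slaGroup: "checkout"        # hypothetical group name
  criticality: "medium"
  backup:
    expected: true
    maxAgeDays: 1
  tags:                       # hypothetical tag values
    - backend
    - api
```

With these values the `data` section should render roughly as follows. Note that `toJson` emits the tags as a compact JSON array, which is also valid YAML flow syntax.

```yaml
data:
  it-ops.yaml: |
    path: "myorg/myplatform/prod/cluster1/my-app"
    slaGroup: "checkout"
    criticality: "medium"
    backup:
      expected: true
      maxAgeDays: 1
    tags: ["backend","api"]
```

Leaving out an optional key such as `slaGroup` drops its line entirely, because each `with` block renders nothing for an empty value.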

## Parse errors (loud over silent)

If the YAML can't be parsed, the agent pushes a visible
`configmap-parse-error-<ns>-<name>` service under the owning cluster with
status=`UNKNOWN` and the parse error in the message. You see the misconfig in the UI,
not a silent absence.
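
For illustration, the ConfigMap below is valid for Kubernetes but carries an `it-ops.yaml` body that cannot be parsed, because the quoted string is never closed. The namespace `shop` is an assumption; following the naming pattern above, the agent would be expected to surface it as `configmap-parse-error-shop-my-app-itops`.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-itops
  namespace: shop             # hypothetical namespace
  labels:
    itops.io/config: "true"
data:
  # The outer document is fine; the inner YAML has an unterminated double quote.
  it-ops.yaml: |
    path: "myorg/myplatform/prod/cluster1/my-app
```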

## Field → UI surface

| Field | Where it shows up |
|---|---|
| `path` | Operations Catalog tree + hierarchy breadcrumb |
| `displayName` | Node label, SLA Group cards, service detail title |
| `criticality` | Badge on service cards + service detail |
| `slaGroup` | SLA Overview → Group cards (clickable) |
| `type` | Icon in Operations tree + meta row |
| `team` | Operations Catalog Team column + service detail |
| `tags` | Filter chips + Storage tab routing |
| `workload.*` | K8s Details section |
| `backup.*` | Backup tab entry + alert threshold |
| `health.enabled` | "HTTP health probe active" badge |
| `dependencies.requires` | "Depends on" clickable chips |
| `dependencies.usedBy` | "Used by" column (Storage tab) + chips |
| `links.*` | Service detail header buttons |
| `contacts.*` | Ownership section |
| `relations.*` | Relations section on service detail |
| `monitoring.*` | Service detail meta row |
| `custom.*` | Custom metadata key/value table on service detail |

## Repository layout (recommendation)

```
infra-gitops-repo/
├── platform/
│   ├── itops-values.yaml            # core + UI + postgres
│   ├── itops-agent-values.yaml      # per-cluster agent
│   └── sla-portal-values.yaml       # standalone status page
├── apps/                            # each service owns its folder
│   ├── my-app/
│   │   ├── values.yaml              # itops: { path, slaGroup, ... }
│   │   └── templates/
│   │       ├── deployment.yaml
│   │       └── itops-configmap.yaml
│   └── ...
└── monitoring/                      # push cronjobs for non-K8s services
    ├── health-push-galera.yaml
    ├── storage-push-s3.yaml
    └── backup-reporter-pg.yaml
```

Three ArgoCD Applications typically manage this: `itops-platform`, `apps`
(ApplicationSet), `monitoring`.
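
As a minimal sketch, the `monitoring` Application could point at the plain manifests in the `monitoring/` folder. The repository URL, the `argocd` and `monitoring` namespaces, the `main` branch, and the sync policy are all assumptions to adapt to your setup.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: monitoring
  namespace: argocd                      # assumed Argo CD install namespace
spec:
  project: default
  source:
    repoURL: https://git.example.com/myorg/infra-gitops-repo.git  # hypothetical URL
    targetRevision: main                 # assumed branch
    path: monitoring                     # folder from the layout above
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring                # assumed target namespace
  syncPolicy:
    automated:
      prune: true                        # delete resources removed from Git
      selfHeal: true                     # revert manual drift
```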
