
Alerts & cases

How SolarFleet decides something's wrong, opens a case for it, and tracks it through to resolution. Covers the six derived detections, the severity model, and the SLA clock.

The shape of an alert

An alert is a signal that something on a site needs human attention. It comes from one of two places: the upstream inverter API raised an alarm, or one of our own derivation rules tripped on the underlying telemetry. Either way the alert lands in your Alerts tab and — if the rule is configured to escalate — opens a case automatically.

Every alert has a severity, a source (native vs derived), a start time, and a last-seen time. Alerts are not fire-and-forget: if the underlying condition resolves on its own, the alert auto-resolves, and any case it opened closes itself, provided no human has touched the case and it is still inside the auto-resolve window.
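
As a rough illustration, the four fields above could be modelled like this. It is a sketch only: the class names, field names, and the `resolved_at` field are assumptions for the example, not SolarFleet's actual schema, and the date is made up.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class Severity(Enum):
    INFO = "info"
    WARNING = "warning"
    CRITICAL = "critical"


class Source(Enum):
    NATIVE = "native"    # alarm raised by the upstream inverter API
    DERIVED = "derived"  # raised by one of the derivation rules


@dataclass
class Alert:
    """Hypothetical alert record; names are illustrative."""
    title: str
    severity: Severity
    source: Source
    started_at: datetime
    last_seen_at: datetime
    resolved_at: Optional[datetime] = None  # set when the condition clears

    @property
    def is_open(self) -> bool:
        # An alert stays open until the underlying condition has cleared.
        return self.resolved_at is None


# Illustrative data only.
alert = Alert(
    title="String C: 12/14 optimisers offline",
    severity=Severity.CRITICAL,
    source=Source.DERIVED,
    started_at=datetime(2024, 1, 1, 9, 42),
    last_seen_at=datetime(2024, 1, 1, 10, 18),
)
```

Keeping `last_seen_at` separate from `started_at` is what lets an alert be recognised as cleared once the condition stops being observed.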

[Screenshot: app.solarfleet.io / cases. An open critical alert, "String C: 12/14 optimisers offline" (Optimiser drop, derived, started 09:42), with auto-opened Case #1042 showing 3h 41m left on the SLA. The case thread reads: 09:42 system opened case from alert; 09:55 engineer acknowledged, assigned to on-call; 10:18 you commented, “String breaker tripped, visit booked.”]
A critical alert auto-opens a case; every system event, acknowledgement and comment lands in one chronological thread attached to the site. Illustrative data.

The six derived detections

Native alarms vary by inverter brand. We run six brand-agnostic detections on top, so a Solis fleet and a SolarEdge fleet get the same quality of fault coverage:

  1. Comms stale — the site stopped reporting. Two-tier: any gap of 12 hours or more fires, and a gap of more than 4 hours during daylight fires, so routine overnight lag from upstream APIs doesn't trigger a 3 AM email.
  2. Zero production — a site that should be generating (sun's up, weather supports it) is reporting zero.
  3. Low power — generation has dropped meaningfully against the weather-normalised expected curve for a sustained period.
  4. Inverter offline — one inverter in a multi-inverter site has dropped while the rest carry on.
  5. Optimiser drop — when per-optimiser data is available, sustained loss of channel reporting on a string.
  6. Partial offline — trend-based, multi-day. A fleet underperforming over consecutive days without a single catastrophic event. The slow-bleed faults that hide for weeks otherwise.
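
The two-tier comms-stale check in item 1 might look something like the sketch below. The 12-hour and 4-hour thresholds come from the list above; the function name, the parameters, and the choice to count only silence inside today's sunrise-to-sunset window are assumptions made for the example.

```python
from datetime import datetime, timedelta

HARD_GAP = timedelta(hours=12)      # any gap this long fires
DAYLIGHT_GAP = timedelta(hours=4)   # shorter gaps fire only in daylight


def comms_stale(last_report: datetime, now: datetime,
                sunrise: datetime, sunset: datetime) -> bool:
    """Hypothetical two-tier staleness check.

    `sunrise` and `sunset` are today's daylight window (an assumption;
    a real rule would need a proper solar model per site).
    """
    if now - last_report >= HARD_GAP:
        return True
    # Daylight tier: count only the silent time inside today's window,
    # so overnight lag contributes nothing.
    start = max(last_report, sunrise)
    end = min(now, sunset)
    daylight_silence = max(end - start, timedelta(0))
    return daylight_silence > DAYLIGHT_GAP
```

With sunrise at 07:00 and sunset at 17:00, a site silent from 22:00 to 03:00 does not fire, while a site silent from 09:00 to 13:30 does.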

Why trend-based partial-offline. A peak-hour tick that's 30% below model often isn't a fault; it's a cloud. Earlier versions of the rule fired on single ticks and the inbox filled with noise. Now it needs sustained underperformance across multiple days before opening a case.
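
To make the idea concrete, here is a minimal sketch of a multi-day trend check. It assumes each day has already been reduced to one ratio of actual to weather-normalised expected energy. The 0.8 threshold, the three-day window, and the function name are placeholders chosen for illustration, not the values the rule actually uses.

```python
from typing import Sequence


def partial_offline(daily_ratios: Sequence[float],
                    threshold: float = 0.8,
                    min_days: int = 3) -> bool:
    """Hypothetical trend rule: fire only if every one of the most
    recent `min_days` daily ratios is below `threshold`."""
    if len(daily_ratios) < min_days:
        return False  # not enough history to call it a trend
    return all(r < threshold for r in daily_ratios[-min_days:])


# One cloudy day in an otherwise healthy week does not fire.
one_bad_day = partial_offline([1.0, 0.7, 1.0, 0.95])
# Three consecutive weak days do.
slow_bleed = partial_offline([0.95, 0.75, 0.72, 0.70])
```

Working on daily aggregates rather than individual ticks is what filters out passing clouds: a single low reading barely moves a day's ratio, and a single low day cannot satisfy the consecutive-day condition.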

Severity model

Three levels, each with a colour and a UI behaviour:

  • Critical (red) — the site is materially under-generating or offline. Marks the site Critical in the portfolio view and the fleet map.
  • Warning (amber) — non-trivial but not catastrophic. Lands in the Watch lane.
  • Info (grey) — diagnostic context. Doesn't affect the site's status colour; useful in case history.

A site's portfolio status is set by its single highest-severity open alert. Severities don't add up: two warnings don't make a critical.
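
The "highest severity wins" rule can be sketched as a simple maximum. The enum values, the function name, and the use of `None` for a site with nothing affecting its status are assumptions for the example.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Severity(IntEnum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3


def site_status(open_alerts: Iterable[Severity]) -> Optional[Severity]:
    """Hypothetical roll-up: the site takes the highest severity among
    its open alerts. Info alerts never affect the status."""
    relevant = [s for s in open_alerts if s > Severity.INFO]
    return max(relevant, default=None)  # None: nothing affecting status


two_warnings = site_status([Severity.WARNING, Severity.WARNING])
mixed = site_status([Severity.INFO, Severity.WARNING, Severity.CRITICAL])
info_only = site_status([Severity.INFO])
```

Here `two_warnings` stays at `Severity.WARNING`, because the roll-up takes a maximum rather than a sum.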

From alert to case

Rules can be configured to auto-open a case on first fire (the default for critical-severity alerts), or sit as an observation only. Cases have:

  • An SLA clock — time-to-acknowledge and time-to-resolve. Both are per-org configurable; the defaults are 4h ack / 24h resolve for critical, looser for warning.
  • An assignee — auto-assigned via the on-call matrix, or unassigned for triage.
  • A thread — every system event, every comment, every visit attached to the case sits in one chronological feed. The case stays attached to the site, not the engineer, so handover never costs you history.
  • A resolution note — what happened, what was done, what to watch for next time.
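
As a worked example of the SLA clock, the sketch below computes both deadlines from the time a case opens. The 4h / 24h critical defaults come from the list above. The warning values are placeholders, since the defaults are only described as "looser", and the function and dictionary names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple

Policy = Dict[str, timedelta]

DEFAULT_SLA: Dict[str, Policy] = {
    "critical": {"ack": timedelta(hours=4), "resolve": timedelta(hours=24)},
    # Placeholder values: the real warning defaults are not stated here.
    "warning": {"ack": timedelta(hours=24), "resolve": timedelta(hours=72)},
}


def sla_deadlines(opened_at: datetime, severity: str,
                  overrides: Optional[Dict[str, Policy]] = None
                  ) -> Tuple[datetime, datetime]:
    """Return (ack_deadline, resolve_deadline); per-org overrides
    replace the default policy for a severity."""
    policy = {**DEFAULT_SLA, **(overrides or {})}[severity]
    return opened_at + policy["ack"], opened_at + policy["resolve"]


def time_left(deadline: datetime, now: datetime) -> timedelta:
    # Clamp at zero so a breached SLA doesn't show negative time.
    return max(deadline - now, timedelta(0))


opened = datetime(2024, 1, 1, 9, 42)  # illustrative timestamp
ack_by, resolve_by = sla_deadlines(opened, "critical")
remaining = time_left(ack_by, datetime(2024, 1, 1, 10, 1))
```

For a critical case opened at 09:42, the acknowledge deadline is 13:42 the same day, so at 10:01 there are 3h 41m left on the acknowledge clock.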

Auto-resolve

A case opened by an alert auto-closes if:

  • The underlying alert clears (production returns, comms come back).
  • No human has interacted with the case yet.
  • It's been less than 24 hours since opening.

Past that window, or as soon as a human writes a comment, assigns the case, or attaches a visit, the case stays open until a human resolves it. The point is to clear noise without losing the audit trail on real work.
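
Put together, the three conditions amount to a single boolean check. This is a sketch under assumptions: the function and parameter names are invented for the example, and `human_touched` stands for any comment, assignment, or attached visit.

```python
from datetime import datetime, timedelta

AUTO_CLOSE_WINDOW = timedelta(hours=24)


def should_auto_close(opened_at: datetime, now: datetime,
                      alert_cleared: bool, human_touched: bool) -> bool:
    """Hypothetical check: all three conditions must hold."""
    return (
        alert_cleared                             # underlying alert cleared
        and not human_touched                     # no human interaction yet
        and now - opened_at < AUTO_CLOSE_WINDOW   # still inside 24 hours
    )


opened = datetime(2024, 1, 1, 9, 42)  # illustrative timestamp
noise = should_auto_close(opened, opened + timedelta(hours=2), True, False)
real_work = should_auto_close(opened, opened + timedelta(hours=2), True, True)
too_late = should_auto_close(opened, opened + timedelta(hours=25), True, False)
```

Only `noise` evaluates to true. A case a human has touched, or one that is older than 24 hours, waits for a human to resolve it even though the alert has cleared.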
