APIANT

Diagnosis & Repair

Live-automation failures: pull the execution trace, identify the failure mode, propose or apply a fix.

When a live automation fails in production, Claude pulls the execution trace, cross-references the pattern library, identifies the failure mode, and proposes (or applies) a fix. The repair itself becomes a new signal: every fix extends the diagnostic vocabulary. See Diagnosing failed runs for the plugin walkthrough.

The flow

  1. Trigger — an alert fires, a user reports a problem, or an account-monitoring run surfaces an anomaly.
  2. Retrieve — pull the failing execution's full state: trigger data, action outputs, and per-step errors.
  3. Classify — match the failure against known patterns: auth expiration, rate limit, schema change, field drift, downstream outage.
  4. Propose — surface the diagnosis with a recommended fix (credential refresh, field remap, retry with backoff, skill patch).
  5. Apply — on approval, edit the automation in place, then optionally replay from the failing step using saved state.
  6. Record — the diagnosis and repair become training data for the next occurrence.
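
The Retrieve, Classify, and Propose steps can be sketched roughly as below. This is a minimal illustration, not the platform's implementation: the `StepRecord` shape, the `PATTERNS` table, the status codes and regexes, and the pairing of each failure mode with a fix are all assumptions made for the example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepRecord:
    """One step of a run. Hypothetical shape, not a real trace schema."""
    name: str
    status_code: Optional[int] = None
    error: Optional[str] = None


# Illustrative pattern library: (failure mode, status codes, message regex, fix).
# The mode-to-fix pairings are assumptions for this sketch.
PATTERNS = [
    ("auth expiration", {401}, r"token (has )?expired|invalid[_ ]grant", "credential refresh"),
    ("rate limit", {429}, r"rate limit|too many requests", "retry with backoff"),
    ("downstream outage", {502, 503, 504}, r"timed out|connection refused", "retry with backoff"),
    ("schema change", set(), r"schema (validation|mismatch)|unexpected response", "skill patch"),
    ("field drift", set(), r"(unknown|missing|unexpected) field", "field remap"),
]


def classify(step: StepRecord) -> tuple[str, str]:
    """Return (failure_mode, recommended_fix) for a failing step."""
    message = (step.error or "").lower()
    for mode, codes, regex, fix in PATTERNS:
        # First matching pattern wins, so order the table from most to least specific.
        if step.status_code in codes or re.search(regex, message):
            return mode, fix
    return "unclassified", "escalate for manual review"


def diagnose(trace: list[StepRecord]) -> Optional[dict]:
    """Find the first failing step in a trace and attach a proposed fix."""
    for index, step in enumerate(trace):
        if step.error is not None:
            mode, fix = classify(step)
            return {"step": step.name, "index": index, "mode": mode, "fix": fix}
    return None  # no failing step found
```

Returning the failing step's index alongside the diagnosis gives the Apply step a natural place to resume from. An "unclassified" result is the case where a new pattern would be added after the repair.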

Why this works

Every step's real inputs, outputs, and decisions are preserved after a run, so any failing execution can be fully reconstructed — there is no "we don't know what happened." The trace, the inputs, the outputs, and the change history are all there to read.
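
To illustrate why preserved per-step state matters, here is a small sketch of replaying a run from its failing step. It assumes a hypothetical model in which each step is a function of the earlier steps' outputs and the original run saved those outputs by step name. The names `replay_from`, `steps`, and `saved` are invented for the example.

```python
from typing import Any, Callable

# A step receives the outputs of earlier steps, keyed by step name (assumed model).
Step = Callable[[dict[str, Any]], Any]


def replay_from(
    steps: list[tuple[str, Step]],
    saved_outputs: dict[str, Any],
    start: int,
) -> dict[str, Any]:
    """Re-run steps[start:] using the recorded outputs of steps[:start]."""
    # Seed the context from the saved run so nothing upstream executes again.
    context = {name: saved_outputs[name] for name, _ in steps[:start]}
    for name, run in steps[start:]:
        context[name] = run(context)
    return context


# Example: the third step failed originally; replay only that step.
steps = [
    ("trigger", lambda ctx: {"email": "a@example.com"}),
    ("lookup", lambda ctx: {"id": 7, "email": ctx["trigger"]["email"]}),
    ("notify", lambda ctx: f"sent to contact {ctx['lookup']['id']}"),
]
saved = {
    "trigger": {"email": "a@example.com"},
    "lookup": {"id": 7, "email": "a@example.com"},
}
result = replay_from(steps, saved, start=2)
```

Because the earlier outputs come from the saved run rather than fresh calls, the replay sees the same data the failing step saw, and upstream side effects are not repeated.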


Last updated May 4, 2026