AWOSS-RUN: Runtime Policy, Approvals, And Action Control

AWOSS-RUN governs the moment between an agent's request and the real-world action that follows.

For tool calls, connector use, shell commands, outbound messages, data exports, sub-agent delegation, and other material changes, this family focuses on the checks before execution: what is allowed, what is denied, what must pause, who approves, and what receipt remains.

Good inventory and source review are not enough on their own. Many failures show up only when a live workflow asks to do something specific, so reviewers need a clear view of allowed, denied, paused, approved, interrupted, rolled-back, rate-limited, and recorded actions.

What This Family Covers

In scope:

  • High-impact action classes that agents can request or perform.
  • Tool calls, connector actions, shell or code execution, workflow invocations, external communications, data exports, access-control changes, and sub-agent delegation where applicable.
  • Human approval, runtime policy approval, denial by default, step-up authorization, budget limits, circuit breakers, allowlists, and equivalent runtime controls.
  • Whether the runtime can allow, deny, pause, request approval for, interrupt, roll back, contain, or record actions before and after execution.
  • Approval gates before high-impact production writes, broad shell execution, external communications, access-control changes, sensitive-data exports, and difficult-to-reverse changes.
  • Policy outcome records for allowed, denied, approval-required, approved, rejected, expired, canceled, interrupted, rolled-back, or rate-limited requests.
  • Emergency stop, session cancel, rollback, and containment procedures when agent activity leaves approved scope or matches known abuse patterns.
  • Pre-execution mediation for high-impact tool, connector, shell, workflow, external-service, and sub-agent actions.
  • Recurring or release-driven tests of approval gates, denied paths, allowlists, budget limits, circuit breakers, emergency stops, rollback procedures, and critical runtime policy decisions.
  • Stronger record handling for high-impact runtime decisions, such as tamper-evident, independently retained, or separation-controlled records.

Out of scope:

  • Deciding what resources belong inside the scoped system. That belongs mostly to AWOSS-SCP.
  • Defining whose authority the agent uses when acting. That belongs mostly to AWOSS-DEL.
  • Filesystem, repository, network, browser, sandbox, and endpoint boundaries. Those belong mostly to AWOSS-WSB.
  • Skill, tool, connector, plugin, or supplier provenance. That belongs mostly to AWOSS-SRC.
  • Prompt, memory, retrieval, and instruction-boundary controls. Those belong mostly to AWOSS-CTX.
  • Secret handling and sensitive-data policy by itself. That belongs mostly to AWOSS-SEC, though AWOSS-RUN can gate sensitive-data actions.
  • Complete log-retention, reconstruction, or tamper-evidence design. That belongs mostly to AWOSS-LOG, though this family needs runtime receipts.

Level Summary

Levels are cumulative. Level 2 builds on Level 1, and Level 3 builds on both.

| Level | Plain-language meaning | Why this level exists | Typical evidence |
| --- | --- | --- | --- |
| Level 1 | The organization knows which agent actions are high-impact, what controls should apply to them, and what the runtime can do before and after execution. | Runtime policy cannot be reviewed until high-impact actions, required controls, and runtime capabilities are named. | High-impact action taxonomy, runtime action policy, approval rule summary, mediation capability summary. |
| Level 2 | Production workflows use approval gates and record policy outcomes for high-impact actions, with stop, cancel, rollback, or containment options where applicable. | Production agent activity needs repeatable decisions and reviewable outcomes, not only policy intent. | Approval workflow configuration, sampled approval receipts, denied-action logs, budget policy, emergency-stop or rollback procedure. |
| Level 3 | High-impact actions are mediated before execution, runtime controls are tested, and high-impact decision records receive stronger protection or independent retention. | High-impact environments need assurance that runtime controls work before damage occurs and that critical records remain reviewable. | Pre-execution mediation configuration, recurring policy-path tests, blocked-action tests, rollback test, tamper-evident or independently retained decision records. |

Candidate Controls

AWOSS-RUN-L1-001: High-Impact Action Taxonomy (Level 1)

Requirement summary

Identify high-impact action classes that agents can request or perform, including tool calls, connector actions, shell or code execution, workflow invocations, external communications, data-export actions, access-control changes, and sub-agent delegation where applicable.

Why it exists

Runtime controls need a vocabulary for risk. A tool call that summarizes a public note, a connector action that changes a customer record, a shell command that deletes files, and an external message that commits the business to a course of action should not all be treated the same way.

Why this level

This belongs at Level 1 because action classes must be named before the system can apply approval gates, denial rules, budgets, circuit breakers, or runtime receipts.

Evidence examples

| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
| --- | --- | --- | --- | --- |
| High-impact action taxonomy | Organization or governance owner with runtime owner input | Before production use and after adding tools, connectors, workflows, or sub-agents | Action classes such as tool calls, connector actions, shell execution, external communications, data exports, access changes, and sub-agent delegation | Identifies action classes; does not prove they are detected or controlled correctly. |
| Tool and connector action inventory | Runtime platform owner | Before production use and after tool or connector changes | Available runtime actions, requested permissions, impact category, and owner | Supports action classification; does not prove source trust or least privilege by itself. |
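
One way to make a taxonomy reviewable is to keep it as data that the runtime and reviewers can both read. The sketch below is illustrative only: the `ActionClass` names mirror the classes listed in this control, while the inventory entries (`notes.summarize`, `crm.update_customer`, and so on) and the `classify` helper are hypothetical and not part of AWOSS-RUN.

```python
from enum import Enum


class ActionClass(Enum):
    """Action classes named in AWOSS-RUN-L1-001."""
    TOOL_CALL = "tool_call"
    CONNECTOR_ACTION = "connector_action"
    SHELL_EXECUTION = "shell_execution"
    WORKFLOW_INVOCATION = "workflow_invocation"
    EXTERNAL_COMMUNICATION = "external_communication"
    DATA_EXPORT = "data_export"
    ACCESS_CONTROL_CHANGE = "access_control_change"
    SUB_AGENT_DELEGATION = "sub_agent_delegation"


# Hypothetical inventory: action name -> (action class, is it high-impact?)
ACTION_INVENTORY = {
    "notes.summarize": (ActionClass.TOOL_CALL, False),
    "crm.update_customer": (ActionClass.CONNECTOR_ACTION, True),
    "shell.delete_files": (ActionClass.SHELL_EXECUTION, True),
    "email.send_external": (ActionClass.EXTERNAL_COMMUNICATION, True),
}


def classify(action_name):
    """Return (action_class, high_impact).

    Assumption: an action missing from the inventory is treated as
    high-impact with no class, so gaps surface instead of passing silently.
    """
    return ACTION_INVENTORY.get(action_name, (None, True))
```

Treating unknown actions as high-impact is a design choice made for this sketch. As the claim limits above note, a taxonomy like this names action classes; it does not prove that they are detected or controlled correctly.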

AWOSS-RUN-L1-002: Required Runtime Control Mapping (Level 1)

Requirement summary

Define which high-impact action classes require human approval, runtime policy approval, denial by default, step-up authorization, budget limits, circuit breakers, or a combination of these controls.

Why it exists

A list of high-impact actions is not enough. Reviewers need to know what should happen when each action is requested: allow, deny, request approval, require stronger authorization, enforce a budget, stop after a limit, or combine several controls.

Why this level

This belongs at Level 1 because it defines intended runtime behavior. Later levels expect production enforcement, outcome records, and tests.

Evidence examples

| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
| --- | --- | --- | --- | --- |
| Runtime action policy | Runtime platform owner with governance owner input | Before production use and after policy changes | Each high-impact action class and the expected control: approval, deny-by-default, step-up, budget, circuit breaker, allowlist, or combination | Shows intended policy; does not prove runtime enforcement. |
| Approval and control matrix | Organization or governance owner | Before production use and after action-class or approver changes | Which action classes require human approval, runtime approval, step-up authorization, or automatic denial | Supports governance review; does not prove approval gates are technically enforced. |
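
A control matrix can be expressed as a simple lookup from action class to required controls. The following is a minimal sketch, not a prescribed format: the specific pairings in `CONTROL_MATRIX` are invented for illustration, and falling back to deny-by-default for unmapped classes is an assumption of this example.

```python
from enum import Enum


class Control(Enum):
    """Control types named in AWOSS-RUN-L1-002."""
    HUMAN_APPROVAL = "human_approval"
    RUNTIME_APPROVAL = "runtime_policy_approval"
    DENY_BY_DEFAULT = "deny_by_default"
    STEP_UP = "step_up_authorization"
    BUDGET = "budget_limit"
    CIRCUIT_BREAKER = "circuit_breaker"


# Hypothetical mapping; a real matrix comes from the governance owner.
CONTROL_MATRIX = {
    "external_communication": {Control.HUMAN_APPROVAL},
    "data_export": {Control.HUMAN_APPROVAL, Control.BUDGET},
    "access_control_change": {Control.STEP_UP, Control.HUMAN_APPROVAL},
    "shell_execution": {Control.DENY_BY_DEFAULT},
    "tool_call": {Control.RUNTIME_APPROVAL, Control.CIRCUIT_BREAKER},
}


def required_controls(action_class):
    """Return the set of controls expected for an action class.

    Assumption: unmapped classes fall back to deny-by-default.
    """
    return CONTROL_MATRIX.get(action_class, {Control.DENY_BY_DEFAULT})
```

For example, `required_controls("data_export")` returns both the human-approval and budget controls, showing how one action class can combine several controls. The matrix records intended behavior only; enforcement is a Level 2 and Level 3 concern.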

AWOSS-RUN-L1-003: Runtime Mediation Capability Summary (Level 1)

Requirement summary

Identify whether the runtime can allow, deny, pause, request approval for, interrupt, roll back, or record tool and connector actions before and after execution.

Why it exists

Runtime policy depends on what the runtime can actually do. Some systems can mediate before execution, some only log after execution, and some can pause, interrupt, roll back, or contain a session. These differences affect the strength of any claim.

Why this level

This belongs at Level 1 because it documents the runtime's control surface. The requirement does not assume all capabilities exist; it requires that the available capabilities and gaps are visible.

Evidence examples

| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
| --- | --- | --- | --- | --- |
| Runtime mediation capability summary | Runtime platform owner | Before production use and after runtime changes | Whether the runtime can allow, deny, pause, request approval, interrupt, roll back, contain, and record actions before or after execution | Describes available capabilities; does not prove controls are used correctly. |
| Runtime architecture note | Runtime platform owner | Before review and after architecture changes | Where policy checks, approval prompts, pre-execution hooks, post-execution logs, and rollback or containment paths sit in the runtime flow | Supports architectural review; does not prove every path is mediated. |
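
Because this control asks for visible capabilities and gaps, not for every capability to exist, a capability summary can be as simple as a record of booleans plus a gap listing. The sketch below assumes a hypothetical runtime that mostly logs after execution; the `MediationCapabilities` type and `gaps` helper are illustrative names, not part of any real runtime API.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class MediationCapabilities:
    """One flag per capability named in the evidence examples above."""
    allow: bool
    deny: bool
    pause: bool
    request_approval: bool
    interrupt: bool
    roll_back: bool
    contain: bool
    record_before: bool
    record_after: bool


def gaps(caps):
    """List the capabilities the runtime does not provide, in field order."""
    return [f.name for f in fields(caps) if not getattr(caps, f.name)]


# Hypothetical runtime that can interrupt a session but otherwise only logs.
log_only_runtime = MediationCapabilities(
    allow=True, deny=False, pause=False, request_approval=False,
    interrupt=True, roll_back=False, contain=False,
    record_before=False, record_after=True,
)
```

Calling `gaps(log_only_runtime)` lists the missing capabilities, which makes it easier for a reviewer to see how strong any later runtime claim can be.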

AWOSS-RUN-L2-001: High-Impact Approval Gate (Level 2)

Requirement summary

Require an approval gate before an agent performs a high-impact action that writes to production systems, executes shell commands with broad filesystem or operational impact, sends external communications, changes access controls, exports sensitive data, or commits irreversible or difficult-to-reverse changes.

Why it exists

Some actions should not run merely because a prompt asked for them. Production writes, broad shell commands, external sends, access changes, sensitive-data exports, and hard-to-reverse operations need a deliberate decision before execution.

Why this level

This belongs at Level 2 because managed production use should have an enforced or configured approval gate for high-impact actions, not only a written policy saying approval is expected.

Evidence examples

| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
| --- | --- | --- | --- | --- |
| Approval workflow configuration | Runtime platform owner | Before production use and after approval-rule changes | Triggering action classes, approver roles, approval prompts, expiry, fallback, and denial behavior | Supports review of approval-gate configuration; does not prove every trigger is complete. |
| Sampled approval receipt | Evidence or audit owner with runtime owner input | During operation and review sampling | Requested action, action class, requester, approver, timestamp, scope, decision, conditions, and execution outcome | Supports review of selected approvals; does not prove the action was safe or legally sufficient. |
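
The approval gate and its receipt can be sketched as a small state record: a request starts pending, and a decision either approves, rejects, or arrives too late and expires. This is a minimal illustration under stated assumptions; the field names follow the receipt fields listed above, while `ApprovalRequest`, `decide`, `may_execute`, and the 15-minute expiry in the usage note are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ApprovalRequest:
    action: str
    action_class: str
    requester: str
    requested_at: datetime
    ttl: timedelta                  # how long the request stays decidable
    approver: Optional[str] = None
    decision: str = "pending"       # pending, approved, rejected, or expired


def decide(request, approver, approve, now):
    """Record an approver's decision, or mark the request expired if late."""
    if now > request.requested_at + request.ttl:
        request.decision = "expired"
    else:
        request.approver = approver
        request.decision = "approved" if approve else "rejected"
    return request


def may_execute(request):
    """The gate: only an explicitly approved request may proceed."""
    return request.decision == "approved"
```

In use, a request created with `ttl=timedelta(minutes=15)` and decided after 30 minutes ends as `expired`, and `may_execute` returns `False`. The gate defaults to not executing: pending, rejected, and expired requests are all blocked.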

AWOSS-RUN-L2-002: Runtime Policy Outcome Records (Level 2)

Requirement summary

Record policy outcomes for high-impact action requests, including allowed, denied, approval required, approved, rejected, expired, canceled, interrupted, rolled back, or rate-limited outcomes where applicable.

Why it exists

Reviewers need to reconstruct what happened when a high-impact action was requested. A record that only shows final execution may miss denials, rejected approvals, expired requests, cancellation, interruption, rollback, or budget enforcement.

Why this level

This belongs at Level 2 because production accountability requires repeatable records of high-impact runtime decisions and outcomes.

Evidence examples

| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
| --- | --- | --- | --- | --- |
| Denied-action and policy-trigger log | Runtime platform owner or evidence owner | During operation and review sampling | Action request, policy decision, reason category, approval state, rate limit or budget state, interruption or rollback outcome where applicable | Supports runtime decision review; does not prove logs are complete or tamper-resistant. |
| Policy outcome export | Evidence or audit owner | During periodic review or after incidents | Counts and samples of allowed, denied, approval-required, approved, rejected, expired, canceled, interrupted, rolled-back, and rate-limited outcomes | Supports trend and sampling review; does not prove every action was appropriate. |
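
To show why recording only final execution is insufficient, the sketch below records every outcome type named in this control and produces the kind of counts a policy outcome export might contain. The record shape and the `record` and `outcome_counts` helpers are assumptions made for illustration.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    """Outcome states listed in AWOSS-RUN-L2-002."""
    ALLOWED = "allowed"
    DENIED = "denied"
    APPROVAL_REQUIRED = "approval_required"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXPIRED = "expired"
    CANCELED = "canceled"
    INTERRUPTED = "interrupted"
    ROLLED_BACK = "rolled_back"
    RATE_LIMITED = "rate_limited"


def record(log, action, outcome, reason):
    """Append one policy outcome record (hypothetical minimal shape)."""
    log.append({"action": action, "outcome": outcome, "reason": reason})


def outcome_counts(log):
    """Count records per outcome, including outcomes that never occurred."""
    counts = Counter(entry["outcome"] for entry in log)
    return {o.value: counts.get(o, 0) for o in Outcome}
```

Reporting zero counts explicitly is deliberate in this sketch: a reviewer can tell the difference between "no denials occurred" and "denials are not recorded at all" only if the export names every outcome.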

AWOSS-RUN-L2-003: Stop, Cancel, Rollback, Or Containment Procedure (Level 2)

Requirement summary

Support emergency stop, session-cancel, rollback, or containment procedures for agent activity that deviates from approved scope, exceeds action budgets, violates allowlists, or matches known tool-abuse patterns.

Why it exists

Runtime control is not only about initial approval. If an agent starts doing something unexpected, exceeds the approved scope, hits a budget limit, uses an unapproved tool path, or matches a known abuse pattern, operators need a way to stop, cancel, roll back, or contain the activity.

Why this level

This is a Level 2 SHOULD because it is a practical production safety capability. Some environments may implement different stop or containment mechanisms, but the expected response path should be visible.

Evidence examples

| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
| --- | --- | --- | --- | --- |
| Emergency-stop or session-cancel procedure | Runtime platform owner with operations owner input | Before production use and after procedure changes | Trigger conditions, authorized operators, steps, expected effect, recovery path, and escalation | Shows intended response; does not prove the procedure works under load or across all tools. |
| Rollback or containment procedure | Runtime platform owner, workspace owner, or operations owner | Before high-impact use and after rollback changes | Rollback scope, containment steps, owner, approval path, limitations, and post-event review expectation | Supports reversibility review; does not prove every action can be reversed. |
| Budget or circuit-breaker policy | Runtime platform owner | Before production use and after limit changes | Budget limits, thresholds, circuit-breaker triggers, blocked actions, escalation, and reset conditions | Supports control review; does not prove limits cannot be bypassed. |
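
A budget limit combined with a circuit breaker can be sketched as a counter that trips once the limit is reached and then blocks everything until an operator resets it. The `ActionBudget` class below is a hypothetical, single-session illustration; real runtimes would also need persistence, per-class limits, and authorization on the reset path.

```python
class ActionBudget:
    """Per-session action budget with a simple circuit breaker."""

    def __init__(self, limit):
        self.limit = limit      # maximum allowed actions before tripping
        self.used = 0
        self.tripped = False    # True once the breaker is open

    def request(self):
        """Return the policy outcome for one more action request."""
        if self.tripped:
            return "denied"             # breaker open: block until reset
        if self.used >= self.limit:
            self.tripped = True         # budget exhausted: open the breaker
            return "rate_limited"
        self.used += 1
        return "allowed"

    def reset(self):
        """Operator reset; assumed to be an authorized, recorded step."""
        self.used = 0
        self.tripped = False
```

With a limit of 2, four consecutive requests yield `allowed`, `allowed`, `rate_limited`, and `denied`. Keeping the breaker open until an explicit reset reflects the reset conditions the evidence table asks for.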

AWOSS-RUN-L3-001: Pre-Execution Runtime Mediation (Level 3)

Requirement summary

Enforce runtime mediation before high-impact actions execute, including policy checks for tool, connector, shell, workflow, external-service, and sub-agent actions where applicable, rather than relying only on after-the-fact log review.

Why it exists

After-the-fact logging is not enough for high-impact actions. If the action writes to production, sends externally, changes credentials, exports sensitive data, or is hard to reverse, the control should run before execution where the runtime supports it.

Why this level

This belongs at Level 3 because pre-execution mediation is a stronger assurance expectation. It reduces reliance on later review and makes policy decisions part of the action path itself.

Evidence examples

| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
| --- | --- | --- | --- | --- |
| Pre-execution hook or mediation configuration | Runtime platform owner | Before high-impact production use and after mediation changes | Runtime checks applied before high-impact tool, connector, shell, workflow, external-service, or sub-agent actions execute | Supports mediation review; does not prove every possible execution path is covered. |
| Mediation path test record | Evidence or audit owner with runtime owner input | During validation or release-driven review | Test scenario, expected pre-execution check, policy decision, outcome, finding, and remediation if needed | Supports selected path validation; does not prove all bypasses are impossible. |
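
The difference between pre-execution mediation and after-the-fact logging is that the policy check sits inside the call path, so a denied action never runs. A minimal sketch follows, assuming a hypothetical policy that is a plain dictionary from action class to `"allow"` or `"deny"`; the `mediated` wrapper is illustrative and not a real runtime hook API.

```python
def mediated(policy, decision_log):
    """Build a guard that checks policy before any wrapped action runs.

    policy: dict mapping action class -> "allow" or "deny" (hypothetical).
    decision_log: list that receives (action_class, decision) tuples.
    """
    def guard(action_class, action):
        def guarded(*args, **kwargs):
            # Unlisted action classes are denied (assumption of this sketch).
            decision = policy.get(action_class, "deny")
            # The decision is recorded before execution, not after it.
            decision_log.append((action_class, decision))
            if decision != "allow":
                raise PermissionError(
                    f"{action_class} blocked before execution: {decision}"
                )
            return action(*args, **kwargs)
        return guarded
    return guard
```

Wrapping a function with `guard("shell_execution", fn)` under a policy that does not list `shell_execution` raises `PermissionError` without calling `fn`. As the claim limit above states, this covers only actions routed through the guard; it does not prove that no other execution path exists.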

AWOSS-RUN-L3-002: Runtime Control Test Cadence (Level 3)

Requirement summary

Test approval gates, denied-action paths, allowlists, budget limits, circuit breakers, emergency stops, rollback procedures, and critical policy decisions on a recurring or release-driven basis.

Why it exists

Runtime controls can drift as tools, connectors, prompts, workflows, and approval rules change. Testing should show that important policy paths still work after releases, configuration changes, and material scope expansion.

Why this level

This belongs at Level 3 because recurring or release-driven testing adds higher assurance for high-impact systems. It is stronger than relying on one-time configuration review or occasional manual inspection.

Evidence examples

| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
| --- | --- | --- | --- | --- |
| Runtime control test report | Evidence or audit owner with runtime owner input | Recurring, release-driven, and after material policy changes | Approval-gate tests, denied-action tests, allowlist tests, budget-limit tests, circuit-breaker tests, emergency-stop tests, rollback tests, results, and findings | Supports tested-path assurance; does not prove untested paths are safe. |
| Tool-call abuse or blocked-action test record | Evidence or audit owner | During validation, red-team, or release review | Abuse scenario, expected policy decision, actual result, log or receipt reference, and remediation status | Supports abuse-path review; does not prove absence of all tool misuse. |
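
Recurring policy-path tests can be kept as a table of scenarios, each with an expected decision, that is rerun on every release. The sketch below is one possible shape: the `decide` function stands in for whatever policy engine is under test, and the scenarios and record fields are hypothetical examples modeled on the test record columns above.

```python
def decide(request):
    """Hypothetical policy under test; a real suite would call the runtime."""
    if request["action_class"] == "shell_execution":
        return "denied"
    if request["action_class"] == "data_export":
        return "approval_required"
    return "allowed"


# (scenario name, request, expected policy decision)
SCENARIOS = [
    ("broad shell delete is blocked", {"action_class": "shell_execution"}, "denied"),
    ("export pauses for approval", {"action_class": "data_export"}, "approval_required"),
    ("low-impact tool call proceeds", {"action_class": "tool_call"}, "allowed"),
]


def run_policy_tests(policy, scenarios):
    """Run each scenario and return one test record per scenario."""
    results = []
    for name, request, expected in scenarios:
        actual = policy(request)
        results.append({
            "scenario": name,
            "expected": expected,
            "actual": actual,
            "passed": actual == expected,
        })
    return results
```

Running `run_policy_tests(decide, SCENARIOS)` after each release produces records that show drift: if a configuration change makes the policy allow everything, the first two scenarios fail. Passing results support only the tested paths.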

AWOSS-RUN-L3-003: Stronger High-Impact Decision Records (Level 3)

Requirement summary

Provide tamper-evident, independently retained, or separation-controlled records for high-impact runtime decisions.

Why it exists

High-impact runtime records may be needed for incident review, governance, assurance, or dispute resolution. If the same runtime that made a decision can silently alter or delete the record, the evidence may be weak for higher-assurance review.

Why this level

This is a Level 3 SHOULD because stronger record protection improves assurance for high-impact decisions but may require additional log, evidence, or audit infrastructure.

Evidence examples

| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
| --- | --- | --- | --- | --- |
| Tamper-evident or independently retained decision log | Evidence or audit owner | During operation and review periods | High-impact runtime decision metadata, retention location, integrity control or independent retention method, and access controls | Supports record-integrity review; does not prove decision correctness. |
| Separation-controlled evidence export | Evidence or audit owner | During periodic review, incident review, or release checkpoint | Export owner, runtime source, retention owner, access separation, time period, and sample of high-impact decisions | Supports independent review; does not prove all events were captured. |
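
One common technique for tamper evidence is a hash chain, where each record's hash covers the previous record's hash, so a silent edit breaks every later link. AWOSS-RUN does not prescribe this technique; the sketch below is one illustrative option using Python's standard `hashlib` and `json` modules, and the record shape is an assumption.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash, decision):
    """SHA-256 over the previous hash plus a canonical JSON body."""
    body = json.dumps(decision, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain, decision):
    """Append a decision record linked to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "prev": prev_hash,
        "decision": decision,
        "hash": _digest(prev_hash, decision),
    })


def verify(chain):
    """Return True only if every link and every record hash is intact."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if _digest(prev_hash, entry["decision"]) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects edits only if the latest hash is also retained somewhere the runtime cannot rewrite, which is where independent retention or separation control comes in. As the claim limit above notes, integrity of the record says nothing about whether the decision itself was correct.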

External Mapping Notes

The family-first crosswalk treats AWOSS-RUN as a candidate-control family shaped by runtime interception, authorization, action gates, step-up approval, budget limits, circuit breakers, allowlists, pre-execution hooks, action receipts, central policy decisions, and tool-invocation abuse scenarios.

Relevant source signals include:

  • EU AI Act signals around prohibited-practice risk reduction and transparency-trigger guardrails can inform restricted action classes and approval-gate tests, but do not establish legal compliance.
  • CSA AARM contributes the strongest direct signals around interception, authorization, step-up approval, and action receipts, but there is no AARM conformance without a real runtime implementation and assurance path.
  • OWASP AISVS contributes budget, circuit-breaker, action-gate, and MCP-check signals, but public AISVS v0.1 material does not prove AISVS conformance.
  • AIUC-1 contributes comparator signals around authorized scope restrictions, MCP allowlists, and pre-execution hooks, but there is no AIUC-1 certificate equivalence.
  • CSA AICM contributes orchestration-control context, but AARM remains the stronger reference for runtime action control.
  • Five Eyes agentic AI adoption guidance contributes least-privilege, just-in-time credential, central-policy, and human-approval signals.
  • MITRE ATLAS contributes tool-invocation, command-interpreter, and tool-mediated exfiltration abuse scenarios for testing and review.

Use this guide with the formal AWOSS-RUN candidate requirements. If the guide and the standard draft disagree, the standard draft controls.