AWOSS-RUN: Runtime Policy, Approvals, And Action Control
AWOSS-RUN sits at the moment between an agent's request and the real-world action that follows.
For tool calls, connector use, shell commands, outbound messages, data exports, sub-agent delegation, and other material changes, this family focuses on the checks before execution: what is allowed, what is denied, what must pause, who approves, and what receipt remains.
Good inventory and source review are not enough on their own. Many failures show up only when a live workflow asks to do something specific, so reviewers need a clear view of allowed, denied, paused, approved, interrupted, rolled-back, rate-limited, and recorded actions.
What This Family Covers
In scope:
- High-impact action classes that agents can request or perform.
- Tool calls, connector actions, shell or code execution, workflow invocations, external communications, data exports, access-control changes, and sub-agent delegation where applicable.
- Human approval, runtime policy approval, denial by default, step-up authorization, budget limits, circuit breakers, allowlists, and equivalent runtime controls.
- Whether the runtime can allow, deny, pause, request approval for, interrupt, roll back, contain, or record actions before and after execution.
- Approval gates before high-impact production writes, broad shell execution, external communications, access-control changes, sensitive-data exports, and difficult-to-reverse changes.
- Policy outcome records for allowed, denied, approval-required, approved, rejected, expired, canceled, interrupted, rolled-back, or rate-limited requests.
- Emergency stop, session cancel, rollback, and containment procedures when agent activity leaves approved scope or matches known abuse patterns.
- Pre-execution mediation for high-impact tool, connector, shell, workflow, external-service, and sub-agent actions.
- Recurring or release-driven tests of approval gates, denied paths, allowlists, budget limits, circuit breakers, emergency stops, rollback procedures, and critical runtime policy decisions.
- Stronger record handling for high-impact runtime decisions, such as tamper-evident, independently retained, or separation-controlled records.
Out of scope:
- Deciding what resources belong inside the scoped system. That belongs mostly to AWOSS-SCP.
- Defining whose authority the agent uses when acting. That belongs mostly to AWOSS-DEL.
- Filesystem, repository, network, browser, sandbox, and endpoint boundaries. Those belong mostly to AWOSS-WSB.
- Skill, tool, connector, plugin, or supplier provenance. That belongs mostly to AWOSS-SRC.
- Prompt, memory, retrieval, and instruction-boundary controls. Those belong mostly to AWOSS-CTX.
- Secret handling and sensitive-data policy by itself. That belongs mostly to AWOSS-SEC, though AWOSS-RUN can gate sensitive-data actions.
- Complete log-retention, reconstruction, or tamper-evidence design. That belongs mostly to AWOSS-LOG, though this family needs runtime receipts.
Level Summary
Levels are cumulative. Level 2 builds on Level 1, and Level 3 builds on both.
| Level | Plain-language meaning | Why this level exists | Typical evidence |
|---|---|---|---|
| Level 1 | The organization knows which agent actions are high-impact, what controls should apply to them, and what the runtime can do before and after execution. | Runtime policy cannot be reviewed until high-impact actions, required controls, and runtime capabilities are named. | High-impact action taxonomy, runtime action policy, approval rule summary, mediation capability summary. |
| Level 2 | Production workflows use approval gates and record policy outcomes for high-impact actions, with stop, cancel, rollback, or containment options where applicable. | Production agent activity needs repeatable decisions and reviewable outcomes, not only policy intent. | Approval workflow configuration, sampled approval receipts, denied-action logs, budget policy, emergency-stop or rollback procedure. |
| Level 3 | High-impact actions are mediated before execution, runtime controls are tested, and high-impact decision records receive stronger protection or independent retention. | High-impact environments need assurance that runtime controls work before damage occurs and that critical records remain reviewable. | Pre-execution mediation configuration, recurring policy-path tests, blocked-action tests, rollback test, tamper-evident or independently retained decision records. |
Candidate Controls
AWOSS-RUN-L1-001: High-Impact Action Taxonomy Level 1
Requirement summary
Identify high-impact action classes that agents can request or perform, including tool calls, connector actions, shell or code execution, workflow invocations, external communications, data-export actions, access-control changes, and sub-agent delegation where applicable.
Why it exists
Runtime controls need a vocabulary for risk. A tool call that summarizes a public note, a connector action that changes a customer record, a shell command that deletes files, and an external message that commits the business to a course of action should not all be treated the same way.
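As a rough illustration of what such a vocabulary could look like in code, the sketch below names the action classes from the requirement summary and tags a few example actions as high-impact or not. The identifiers, the `ACTION_INVENTORY` entries, and the choice to treat unknown actions as high-impact are all assumptions made for this example, not part of the candidate requirement.

```python
from enum import Enum
from typing import Optional, Tuple


class ActionClass(Enum):
    """Action classes from the requirement summary (identifiers are illustrative)."""
    TOOL_CALL = "tool_call"
    CONNECTOR_ACTION = "connector_action"
    SHELL_OR_CODE_EXECUTION = "shell_or_code_execution"
    WORKFLOW_INVOCATION = "workflow_invocation"
    EXTERNAL_COMMUNICATION = "external_communication"
    DATA_EXPORT = "data_export"
    ACCESS_CONTROL_CHANGE = "access_control_change"
    SUB_AGENT_DELEGATION = "sub_agent_delegation"


# Hypothetical inventory: action name -> (action class, is it high-impact?).
ACTION_INVENTORY = {
    "notes.summarize_public": (ActionClass.TOOL_CALL, False),
    "crm.update_customer": (ActionClass.CONNECTOR_ACTION, True),
    "shell.delete_files": (ActionClass.SHELL_OR_CODE_EXECUTION, True),
    "email.send_external": (ActionClass.EXTERNAL_COMMUNICATION, True),
}


def classify(action_name: str) -> Tuple[Optional[ActionClass], bool]:
    """Return (action class, high_impact).

    Assumption for this sketch: an action missing from the inventory has no
    known class and is treated as high-impact until someone classifies it.
    """
    return ACTION_INVENTORY.get(action_name, (None, True))
```

A taxonomy like this only names the classes. Whether the runtime detects and controls them correctly is a separate question, as the claim limits in the evidence table note.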
Why this level
This belongs at Level 1 because action classes must be named before the system can apply approval gates, denial rules, budgets, circuit breakers, or runtime receipts.
Evidence examples
| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
|---|---|---|---|---|
| High-impact action taxonomy | Organization or governance owner with runtime owner input | Before production use and after adding tools, connectors, workflows, or sub-agents | Action classes such as tool calls, connector actions, shell execution, external communications, data exports, access changes, and sub-agent delegation | Identifies action classes; does not prove they are detected or controlled correctly. |
| Tool and connector action inventory | Runtime platform owner | Before production use and after tool or connector changes | Available runtime actions, requested permissions, impact category, and owner | Supports action classification; does not prove source trust or least privilege by itself. |
AWOSS-RUN-L1-002: Required Runtime Control Mapping Level 1
Requirement summary
Define which high-impact action classes require human approval, runtime policy approval, denial by default, step-up authorization, budget limits, circuit breakers, or a combination of these controls.
Why it exists
A list of high-impact actions is not enough. Reviewers need to know what should happen when each action is requested: allow, deny, request approval, require stronger authorization, enforce a budget, stop after a limit, or combine several controls.
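One way to make that mapping reviewable is to hold it as data. The sketch below is a minimal example under stated assumptions: the `CONTROL_MATRIX` rows are invented for illustration, and the fallback to denial by default for unmapped classes is a design choice of this example, not something the requirement prescribes.

```python
from enum import Enum


class Control(Enum):
    """Control types named in the requirement summary."""
    HUMAN_APPROVAL = "human_approval"
    RUNTIME_POLICY_APPROVAL = "runtime_policy_approval"
    DENY_BY_DEFAULT = "deny_by_default"
    STEP_UP_AUTHORIZATION = "step_up_authorization"
    BUDGET_LIMIT = "budget_limit"
    CIRCUIT_BREAKER = "circuit_breaker"


# Hypothetical matrix: action class -> controls expected when it is requested.
CONTROL_MATRIX = {
    "external_communication": {Control.HUMAN_APPROVAL, Control.BUDGET_LIMIT},
    "access_control_change": {Control.HUMAN_APPROVAL, Control.STEP_UP_AUTHORIZATION},
    "data_export": {Control.RUNTIME_POLICY_APPROVAL, Control.BUDGET_LIMIT},
    "shell_or_code_execution": {Control.DENY_BY_DEFAULT},
}


def required_controls(action_class: str) -> frozenset:
    """Look up the expected controls; unmapped classes fall back to denial."""
    return frozenset(CONTROL_MATRIX.get(action_class, {Control.DENY_BY_DEFAULT}))


def unmapped_classes(taxonomy: list) -> list:
    """List taxonomy entries with no explicit row, for reviewer follow-up."""
    return [c for c in taxonomy if c not in CONTROL_MATRIX]
```

Calling `unmapped_classes` with the full taxonomy gives reviewers a quick list of action classes whose intended behavior has not yet been written down.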
Why this level
This belongs at Level 1 because it defines intended runtime behavior. Later levels expect production enforcement, outcome records, and tests.
Evidence examples
| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
|---|---|---|---|---|
| Runtime action policy | Runtime platform owner with governance owner input | Before production use and after policy changes | Each high-impact action class and the expected control: approval, deny-by-default, step-up, budget, circuit breaker, allowlist, or combination | Shows intended policy; does not prove runtime enforcement. |
| Approval and control matrix | Organization or governance owner | Before production use and after action-class or approver changes | Which action classes require human approval, runtime approval, step-up authorization, or automatic denial | Supports governance review; does not prove approval gates are technically enforced. |
AWOSS-RUN-L1-003: Runtime Mediation Capability Summary Level 1
Requirement summary
Identify whether the runtime can allow, deny, pause, request approval for, interrupt, roll back, or record tool and connector actions before and after execution.
Why it exists
Runtime policy depends on what the runtime can actually do. Some systems can mediate before execution, some only log after execution, and some can pause, interrupt, roll back, or contain a session. These differences affect the strength of any claim.
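A capability summary can be as simple as a record of booleans with the gaps made explicit. The sketch below assumes a hypothetical runtime that only logs after execution. The class name, field names, and the example values are illustrative and do not describe any particular product.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class MediationCapabilities:
    """What a runtime can do to an action, per the requirement summary."""
    allow: bool
    deny: bool
    pause: bool
    request_approval: bool
    interrupt: bool
    roll_back: bool
    record_before_execution: bool
    record_after_execution: bool

    def gaps(self) -> list:
        """Return the names of capabilities the runtime does not have."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical runtime that lets actions through and logs them afterwards.
log_only_runtime = MediationCapabilities(
    allow=True, deny=False, pause=False, request_approval=False,
    interrupt=False, roll_back=False,
    record_before_execution=False, record_after_execution=True,
)
```

For this example runtime, `log_only_runtime.gaps()` lists every capability except `allow` and `record_after_execution`, which is the kind of visible gap the requirement asks for.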
Why this level
This belongs at Level 1 because it documents the runtime's control surface. The requirement does not assume all capabilities exist; it requires that the available capabilities and gaps are visible.
Evidence examples
| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
|---|---|---|---|---|
| Runtime mediation capability summary | Runtime platform owner | Before production use and after runtime changes | Whether the runtime can allow, deny, pause, request approval, interrupt, roll back, contain, and record actions before or after execution | Describes available capabilities; does not prove controls are used correctly. |
| Runtime architecture note | Runtime platform owner | Before review and after architecture changes | Where policy checks, approval prompts, pre-execution hooks, post-execution logs, and rollback or containment paths sit in the runtime flow | Supports architectural review; does not prove every path is mediated. |
AWOSS-RUN-L2-001: High-Impact Approval Gate Level 2
Requirement summary
Require an approval gate before an agent performs a high-impact action that writes to production systems, executes shell commands with broad filesystem or operational impact, sends external communications, changes access controls, exports sensitive data, or commits irreversible or difficult-to-reverse changes.
Why it exists
Some actions should not run only because a prompt asked for them. Production writes, broad shell commands, external sends, access changes, sensitive data exports, and hard-to-reverse operations need a deliberate decision before execution.
Why this level
This belongs at Level 2 because managed production use should have an enforced or configured approval gate for high-impact actions, not only a written policy saying approval is expected.
Evidence examples
| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
|---|---|---|---|---|
| Approval workflow configuration | Runtime platform owner | Before production use and after approval-rule changes | Triggering action classes, approver roles, approval prompts, expiry, fallback, and denial behavior | Supports review of approval-gate configuration; does not prove every trigger is complete. |
| Sampled approval receipt | Evidence or audit owner with runtime owner input | During operation and review sampling | Requested action, action class, requester, approver, timestamp, scope, decision, conditions, and execution outcome | Supports review of selected approvals; does not prove the action was safe or legally sufficient. |
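
The receipt fields in the table above can be sketched as a small data structure with a gate that refuses to execute anything not explicitly approved. This is a minimal example, not a reference design: the function names, the 15-minute expiry, and the rule that a missing or late response counts as expired are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ApprovalReceipt:
    requested_action: str
    action_class: str
    requester: str
    approver: Optional[str]
    timestamp: datetime
    scope: str
    decision: str  # "approved", "rejected", or "expired"
    conditions: str
    execution_outcome: Optional[str] = None


def decide(requested_action, action_class, requester, scope, requested_at,
           approver=None, approved=None, decided_at=None,
           ttl=timedelta(minutes=15)):
    """Build a receipt. No response, or a response after the TTL, is expired."""
    now = decided_at or datetime.now(timezone.utc)
    if approved is None or now - requested_at > ttl:
        decision = "expired"
    else:
        decision = "approved" if approved else "rejected"
    return ApprovalReceipt(requested_action, action_class, requester, approver,
                           now, scope, decision, conditions="none")


def execute_if_approved(receipt, action):
    """Run the action only on an approved receipt; record the outcome either way."""
    if receipt.decision != "approved":
        receipt.execution_outcome = "not_executed"
        return None
    result = action()
    receipt.execution_outcome = "executed"
    return result
```

Treating expiry as a non-approval keeps a stale approval from being reused later, which is one reason the workflow configuration evidence lists expiry and fallback behavior.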
AWOSS-RUN-L2-002: Runtime Policy Outcome Records Level 2
Requirement summary
Record policy outcomes for high-impact action requests, including allowed, denied, approval required, approved, rejected, expired, canceled, interrupted, rolled back, or rate-limited outcomes where applicable.
Why it exists
Reviewers need to reconstruct what happened when a high-impact action was requested. A record that only shows final execution may miss denials, rejected approvals, expired requests, cancellation, interruption, rollback, or budget enforcement.
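To make the point concrete, the sketch below records every policy outcome for a request rather than only the final execution, then counts outcomes for review. The outcome names come from the requirement summary. The record layout, the sample records, and the `summarize` helper are hypothetical.

```python
from collections import Counter
from enum import Enum


class PolicyOutcome(Enum):
    """Outcomes listed in the requirement summary."""
    ALLOWED = "allowed"
    DENIED = "denied"
    APPROVAL_REQUIRED = "approval_required"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXPIRED = "expired"
    CANCELED = "canceled"
    INTERRUPTED = "interrupted"
    ROLLED_BACK = "rolled_back"
    RATE_LIMITED = "rate_limited"


def summarize(records):
    """Count outcomes, reporting zero for outcomes that never occurred."""
    counts = Counter(r["outcome"] for r in records)
    return {o.value: counts.get(o, 0) for o in PolicyOutcome}


# Hypothetical records: request r1 never executed, so an execution-only log
# would show nothing for it at all.
records = [
    {"request_id": "r1", "action": "crm.update_customer",
     "outcome": PolicyOutcome.APPROVAL_REQUIRED},
    {"request_id": "r1", "action": "crm.update_customer",
     "outcome": PolicyOutcome.REJECTED},
    {"request_id": "r2", "action": "shell.delete_files",
     "outcome": PolicyOutcome.DENIED},
]
```

Reporting explicit zeros lets a reviewer tell "no expired requests occurred" apart from "expired requests are not recorded", although the counts alone cannot prove the underlying log is complete.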
Why this level
This belongs at Level 2 because production accountability requires repeatable records of high-impact runtime decisions and outcomes.
Evidence examples
| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
|---|---|---|---|---|
| Denied-action and policy-trigger log | Runtime platform owner or evidence owner | During operation and review sampling | Action request, policy decision, reason category, approval state, rate limit or budget state, interruption or rollback outcome where applicable | Supports runtime decision review; does not prove logs are complete or tamper-resistant. |
| Policy outcome export | Evidence or audit owner | During periodic review or after incidents | Counts and samples of allowed, denied, approval-required, approved, rejected, expired, canceled, interrupted, rolled-back, and rate-limited outcomes | Supports trend and sampling review; does not prove every action was appropriate. |
AWOSS-RUN-L2-003: Stop, Cancel, Rollback, Or Containment Procedure Level 2
Requirement summary
Support emergency stop, session-cancel, rollback, or containment procedures for agent activity that deviates from approved scope, exceeds action budgets, violates allowlists, or matches known tool-abuse patterns.
Why it exists
Runtime control is not only about initial approval. If an agent starts doing something unexpected, exceeds the approved scope, hits a budget limit, uses an unapproved tool path, or matches a known abuse pattern, operators need a way to stop, cancel, roll back, or contain the activity.
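A budget check and circuit breaker can be sketched in a few lines. The example below is a minimal illustration: the `ActionBudget` class, its per-session scope, and the rule that a tripped breaker blocks everything until a named operator resets it are assumptions of this sketch, and real containment would also need to stop in-flight work.

```python
class ActionBudget:
    """Per-session action budget and allowlist with a simple circuit breaker."""

    def __init__(self, max_actions: int, allowlist: set):
        self.max_actions = max_actions
        self.allowlist = allowlist
        self.used = 0
        self.tripped = False
        self.trip_reason = None

    def check(self, tool: str) -> bool:
        """Return True if the action may proceed; otherwise trip the breaker."""
        if self.tripped:
            return False  # contained: nothing runs until an operator resets
        if tool not in self.allowlist:
            self._trip(f"allowlist violation: {tool}")
            return False
        if self.used >= self.max_actions:
            self._trip("action budget exceeded")
            return False
        self.used += 1
        return True

    def _trip(self, reason: str) -> None:
        self.tripped = True
        self.trip_reason = reason

    def reset(self, operator: str) -> None:
        """Reset needs a named operator; checking their authority is out of scope."""
        if not operator:
            raise ValueError("reset requires a named operator")
        self.used, self.tripped, self.trip_reason = 0, False, None
```

Once tripped, the breaker denies even allowlisted tools, so a session that has left its approved scope cannot keep acting through the paths that were still permitted.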
Why this level
This is a Level 2 SHOULD because it is a practical production safety capability. Some environments may implement different stop or containment mechanisms, but the expected response path should be visible.
Evidence examples
| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
|---|---|---|---|---|
| Emergency-stop or session-cancel procedure | Runtime platform owner with operations owner input | Before production use and after procedure changes | Trigger conditions, authorized operators, steps, expected effect, recovery path, and escalation | Shows intended response; does not prove the procedure works under load or across all tools. |
| Rollback or containment procedure | Runtime platform owner, workspace owner, or operations owner | Before high-impact use and after rollback changes | Rollback scope, containment steps, owner, approval path, limitations, and post-event review expectation | Supports reversibility review; does not prove every action can be reversed. |
| Budget or circuit-breaker policy | Runtime platform owner | Before production use and after limit changes | Budget limits, thresholds, circuit-breaker triggers, blocked actions, escalation, and reset conditions | Supports control review; does not prove limits cannot be bypassed. |
AWOSS-RUN-L3-001: Pre-Execution Runtime Mediation Level 3
Requirement summary
Enforce runtime mediation before high-impact actions execute, including policy checks for tool, connector, shell, workflow, external-service, and sub-agent actions where applicable, rather than relying only on after-the-fact log review.
Why it exists
After-the-fact logging is not enough for high-impact actions. If the action writes to production, sends something externally, changes credentials, exports sensitive data, or is hard to reverse, the control should run before execution where the runtime supports it.
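The difference between mediation and after-the-fact logging is where the check sits in the call path. The sketch below is a simplified illustration: the `mediate` wrapper, the `HIGH_IMPACT` labels, and the string decisions are hypothetical names chosen for this example.

```python
from typing import Callable

# Hypothetical labels for the high-impact classes this sketch mediates.
HIGH_IMPACT = {"production_write", "external_send", "credential_change",
               "sensitive_export"}


class ActionDenied(Exception):
    """Raised when the pre-execution check does not return "allow"."""


def mediate(action_class: str, policy: Callable[[str], str],
            execute: Callable[[], object], log: list):
    """Consult the policy before executing; never run a non-allowed action."""
    decision = policy(action_class) if action_class in HIGH_IMPACT else "allow"
    log.append({"action_class": action_class, "decision": decision,
                "phase": "pre"})
    if decision != "allow":
        raise ActionDenied(f"{action_class}: {decision}")
    result = execute()
    log.append({"action_class": action_class, "decision": "executed",
                "phase": "post"})
    return result
```

Because `execute` is only called after the policy returns "allow", a denied action leaves a pre-execution log entry but no side effect. A wrapper like this only covers the paths that are routed through it, so any path that calls the tool directly remains unmediated.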
Why this level
This belongs at Level 3 because pre-execution mediation is a stronger assurance expectation. It reduces reliance on later review and makes policy decisions part of the action path itself.
Evidence examples
| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
|---|---|---|---|---|
| Pre-execution hook or mediation configuration | Runtime platform owner | Before high-impact production use and after mediation changes | Runtime checks applied before high-impact tool, connector, shell, workflow, external-service, or sub-agent actions execute | Supports mediation review; does not prove every possible execution path is covered. |
| Mediation path test record | Evidence or audit owner with runtime owner input | During validation or release-driven review | Test scenario, expected pre-execution check, policy decision, outcome, finding, and remediation if needed | Supports selected path validation; does not prove all bypasses are impossible. |
AWOSS-RUN-L3-002: Runtime Control Test Cadence Level 3
Requirement summary
Test approval gates, denied-action paths, allowlists, budget limits, circuit breakers, emergency stops, rollback procedures, and critical policy decisions on a recurring or release-driven basis.
Why it exists
Runtime controls can drift as tools, connectors, prompts, workflows, and approval rules change. Testing should show that important policy paths still work after releases, configuration changes, and material scope expansion.
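One lightweight way to catch that drift is a table of policy-path cases that is rerun on each release. The sketch below is an assumption-laden example: `evaluate` stands in for the real runtime decision, and the case names, action-class labels, and expected decisions are invented for illustration.

```python
def evaluate(action_class: str, approved: bool = False) -> str:
    """Hypothetical policy under test; stands in for the real runtime decision."""
    if action_class == "shell_broad":
        return "denied"
    if action_class in {"production_write", "external_send"}:
        return "allowed" if approved else "approval_required"
    return "allowed"


# (scenario, action class, approved?, expected decision)
POLICY_PATH_CASES = [
    ("broad shell is denied", "shell_broad", False, "denied"),
    ("approved shell is still denied", "shell_broad", True, "denied"),
    ("unapproved production write pauses", "production_write", False,
     "approval_required"),
    ("approved production write runs", "production_write", True, "allowed"),
    ("low-impact read runs", "read_public", False, "allowed"),
]


def run_policy_path_tests(policy, cases):
    """Return a finding (scenario, expected, actual) for every failing case."""
    findings = []
    for scenario, action_class, approved, expected in cases:
        actual = policy(action_class, approved)
        if actual != expected:
            findings.append((scenario, expected, actual))
    return findings
```

An empty findings list supports the tested paths only. Denied and paused paths are included on purpose, since a regression that silently allows everything would pass a suite that checks only successful actions.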
Why this level
This belongs at Level 3 because recurring or release-driven testing adds higher assurance for high-impact systems. It is stronger than relying on one-time configuration review or occasional manual inspection.
Evidence examples
| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
|---|---|---|---|---|
| Runtime control test report | Evidence or audit owner with runtime owner input | Recurring, release-driven, and after material policy changes | Approval-gate tests, denied-action tests, allowlist tests, budget-limit tests, circuit-breaker tests, emergency-stop tests, rollback tests, results, and findings | Supports tested-path assurance; does not prove untested paths are safe. |
| Tool-call abuse or blocked-action test record | Evidence or audit owner | During validation, red-team, or release review | Abuse scenario, expected policy decision, actual result, log or receipt reference, and remediation status | Supports abuse-path review; does not prove absence of all tool misuse. |
AWOSS-RUN-L3-003: Stronger High-Impact Decision Records Level 3
Requirement summary
Provide tamper-evident, independently retained, or separation-controlled records for high-impact runtime decisions.
Why it exists
High-impact runtime records may be needed for incident review, governance, assurance, or dispute resolution. If the same runtime that made a decision can silently alter or delete the record, the evidence may be weak for higher-assurance review.
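A hash chain is one common way to make a decision log tamper-evident. The sketch below is a minimal illustration using Python's standard `hashlib` and `json` modules. The record fields and function names are hypothetical, and the scheme only helps if the latest hash is also retained somewhere the runtime cannot rewrite, since a writer with full control of the log could otherwise recompute the whole chain.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical record encoding."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, record: dict) -> None:
    """Append a decision record linked to the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev_hash,
                  "hash": _digest(prev_hash, record)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited, removed, or reordered entry fails."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Verification shows whether the stored records still match their hashes. It says nothing about whether the recorded decisions were correct, and it cannot detect entries dropped from the end of the log unless the latest hash is checked against an independently retained copy.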
Why this level
This is a Level 3 SHOULD because stronger record protection improves assurance for high-impact decisions but may require additional log, evidence, or audit infrastructure.
Evidence examples
| Evidence | Likely owner/provider | When collected | What it should show | Claim limit |
|---|---|---|---|---|
| Tamper-evident or independently retained decision log | Evidence or audit owner | During operation and review periods | High-impact runtime decision metadata, retention location, integrity control or independent retention method, and access controls | Supports record-integrity review; does not prove decision correctness. |
| Separation-controlled evidence export | Evidence or audit owner | During periodic review, incident review, or release checkpoint | Export owner, runtime source, retention owner, access separation, time period, and sample of high-impact decisions | Supports independent review; does not prove all events were captured. |
External Mapping Notes
The family-first crosswalk treats AWOSS-RUN as a candidate-control family shaped by runtime interception, authorization, action gates, step-up approval, budget limits, circuit breakers, allowlists, pre-execution hooks, action receipts, central policy decisions, and tool-invocation abuse scenarios.
Relevant source signals include:
- EU AI Act signals around prohibited-practice risk reduction and transparency-trigger guardrails can inform restricted action classes and approval-gate tests, but do not establish legal compliance.
- CSA AARM contributes the strongest direct signals around interception, authorization, step-up approval, and action receipts, but there is no AARM conformance without a real runtime implementation and assurance path.
- OWASP AISVS contributes budget, circuit-breaker, action-gate, and MCP-check signals, but public AISVS v0.1 material does not prove AISVS conformance.
- AIUC-1 contributes comparator signals around authorized scope restrictions, MCP allowlists, and pre-execution hooks, but there is no AIUC-1 certificate equivalence.
- CSA AICM contributes orchestration-control context, but AARM remains the stronger reference for runtime action control.
- Five Eyes agentic AI adoption guidance contributes least-privilege, just-in-time credential, central-policy, and human-approval signals.
- MITRE ATLAS contributes tool-invocation, command-interpreter, and tool-mediated exfiltration abuse scenarios for testing and review.
Formal Standard Link
Use this guide with the formal AWOSS-RUN candidate requirements. If the guide and the standard draft disagree, the standard draft controls.