AWOSS Documentation

AWOSS-RUN covers the moment between an agent's request and the real-world action that follows.

For tool calls, connector use, shell commands, outbound messages, data exports, sub-agent delegation, and other material changes, this family focuses on the checks before execution: what is allowed, what is denied, what must pause, who approves, and what receipt remains.

Good inventory and source review are not enough on their own. Many failures show up only when a live workflow asks to do something specific, so reviewers need a clear view of allowed, denied, paused, approved, interrupted, rolled-back, rate-limited, and recorded actions.

What This Family Covers

In scope:

High-impact action classes that agents can request or perform.
Tool calls, connector actions, shell or code execution, workflow invocations, external communications, data exports, access-control changes, and sub-agent delegation where applicable.
Human approval, runtime policy approval, denial by default, step-up authorization, budget limits, circuit breakers, allowlists, and equivalent runtime controls.
Whether the runtime can allow, deny, pause, request approval for, interrupt, roll back, contain, or record actions before and after execution.
Approval gates before high-impact production writes, broad shell execution, external communications, access-control changes, sensitive-data exports, and difficult-to-reverse changes.
Policy outcome records for allowed, denied, approval-required, approved, rejected, expired, canceled, interrupted, rolled-back, or rate-limited requests.
Emergency stop, session cancel, rollback, and containment procedures when agent activity leaves approved scope or matches known abuse patterns.
Pre-execution mediation for high-impact tool, connector, shell, workflow, external-service, and sub-agent actions.
Recurring or release-driven tests of approval gates, denied paths, allowlists, budget limits, circuit breakers, emergency stops, rollback procedures, and critical runtime policy decisions.
Stronger record handling for high-impact runtime decisions, such as tamper-evident, independently retained, or separation-controlled records.

Out of scope:

Deciding what resources belong inside the scoped system. That belongs mostly to AWOSS-SCP.
Defining whose authority the agent uses when acting. That belongs mostly to AWOSS-DEL.
Filesystem, repository, network, browser, sandbox, and endpoint boundaries. Those belong mostly to AWOSS-WSB.
Skill, tool, connector, plugin, or supplier provenance. That belongs mostly to AWOSS-SRC.
Prompt, memory, retrieval, and instruction-boundary controls. Those belong mostly to AWOSS-CTX.
Secret handling and sensitive-data policy by itself. That belongs mostly to AWOSS-SEC, though AWOSS-RUN can gate sensitive-data actions.
Complete log-retention, reconstruction, or tamper-evidence design. That belongs mostly to AWOSS-LOG, though this family needs runtime receipts.

Level Summary

Levels are cumulative. Level 2 builds on Level 1, and Level 3 builds on both.

Level	Plain-language meaning	Why this level exists	Typical evidence
Level 1	The organization knows which agent actions are high-impact, what controls should apply to them, and what the runtime can do before and after execution.	Runtime policy cannot be reviewed until high-impact actions, required controls, and runtime capabilities are named.	High-impact action taxonomy, runtime action policy, approval rule summary, mediation capability summary.
Level 2	Production workflows use approval gates and record policy outcomes for high-impact actions, with stop, cancel, rollback, or containment options where applicable.	Production agent activity needs repeatable decisions and reviewable outcomes, not only policy intent.	Approval workflow configuration, sampled approval receipts, denied-action logs, budget policy, emergency-stop or rollback procedure.
Level 3	High-impact actions are mediated before execution, runtime controls are tested, and high-impact decision records receive stronger protection or independent retention.	High-impact environments need assurance that runtime controls work before damage occurs and that critical records remain reviewable.	Pre-execution mediation configuration, recurring policy-path tests, blocked-action tests, rollback test, tamper-evident or independently retained decision records.

Candidate Controls

AWOSS-RUN-L1-001: High-Impact Action Taxonomy Level 1

Requirement summary

Identify high-impact action classes that agents can request or perform, including tool calls, connector actions, shell or code execution, workflow invocations, external communications, data-export actions, access-control changes, and sub-agent delegation where applicable.

Why it exists

Runtime controls need a vocabulary for risk. A tool call that summarizes a public note, a connector action that changes a customer record, a shell command that deletes files, and an external message that commits the business to a course of action should not all be treated the same way.
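
As a minimal sketch of what such a vocabulary could look like in code, the example below names the action classes from the requirement summary as an enumeration and tags a few inventory entries with an impact flag. The entry names (such as crm.update_customer) and the impact ratings are hypothetical illustrations, not part of AWOSS-RUN.

```python
from enum import Enum


class ActionClass(Enum):
    """Action classes named in the requirement summary."""
    TOOL_CALL = "tool_call"
    CONNECTOR_ACTION = "connector_action"
    SHELL_OR_CODE_EXECUTION = "shell_or_code_execution"
    WORKFLOW_INVOCATION = "workflow_invocation"
    EXTERNAL_COMMUNICATION = "external_communication"
    DATA_EXPORT = "data_export"
    ACCESS_CONTROL_CHANGE = "access_control_change"
    SUB_AGENT_DELEGATION = "sub_agent_delegation"


# Hypothetical inventory entries: (action name, action class, high-impact?).
# The names and ratings are assumptions made for illustration only.
INVENTORY = [
    ("notes.summarize_public", ActionClass.TOOL_CALL, False),
    ("crm.update_customer", ActionClass.CONNECTOR_ACTION, True),
    ("shell.delete_files", ActionClass.SHELL_OR_CODE_EXECUTION, True),
    ("email.send_external", ActionClass.EXTERNAL_COMMUNICATION, True),
]


def high_impact_actions(inventory):
    """Return the names of inventory entries marked high-impact."""
    return [name for name, _action_class, high in inventory if high]
```

A taxonomy like this only names and rates actions. Whether the runtime detects and controls them is a separate question, as the claim limits below state.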

Why this level

This belongs at Level 1 because action classes must be named before the system can apply approval gates, denial rules, budgets, circuit breakers, or runtime receipts.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
High-impact action taxonomy	Organization or governance owner with runtime owner input	Before production use and after adding tools, connectors, workflows, or sub-agents	Action classes such as tool calls, connector actions, shell execution, external communications, data exports, access changes, and sub-agent delegation	Identifies action classes; does not prove they are detected or controlled correctly.
Tool and connector action inventory	Runtime platform owner	Before production use and after tool or connector changes	Available runtime actions, requested permissions, impact category, and owner	Supports action classification; does not prove source trust or least privilege by itself.

AWOSS-RUN-L1-002: Required Runtime Control Mapping Level 1

Requirement summary

Define which high-impact action classes require human approval, runtime policy approval, denial by default, step-up authorization, budget limits, circuit breakers, or a combination of these controls.

Why it exists

A list of high-impact actions is not enough. Reviewers need to know what should happen when each action is requested: allow, deny, request approval, require stronger authorization, enforce a budget, stop after a limit, or combine several controls.
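
One way to make the intended behavior reviewable is to write the mapping down as data. The sketch below is illustrative only: the matrix contents are hypothetical, and the choice to treat unmapped action classes as deny-by-default is an assumption of this example rather than a stated requirement.

```python
from enum import Enum


class Control(Enum):
    """Control types named in the requirement summary."""
    HUMAN_APPROVAL = "human_approval"
    RUNTIME_POLICY_APPROVAL = "runtime_policy_approval"
    DENY_BY_DEFAULT = "deny_by_default"
    STEP_UP_AUTHORIZATION = "step_up_authorization"
    BUDGET_LIMIT = "budget_limit"
    CIRCUIT_BREAKER = "circuit_breaker"


# Hypothetical control matrix: action class -> required controls.
CONTROL_MATRIX = {
    "connector_action": {Control.RUNTIME_POLICY_APPROVAL, Control.BUDGET_LIMIT},
    "shell_or_code_execution": {Control.HUMAN_APPROVAL, Control.CIRCUIT_BREAKER},
    "access_control_change": {Control.HUMAN_APPROVAL, Control.STEP_UP_AUTHORIZATION},
}


def required_controls(action_class: str) -> set:
    """Look up required controls for an action class.

    Assumption: classes missing from the matrix fall back to deny-by-default.
    """
    return CONTROL_MATRIX.get(action_class, {Control.DENY_BY_DEFAULT})
```

A matrix in this form shows intended policy. It says nothing about whether the runtime enforces it.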

Why this level

This belongs at Level 1 because it defines intended runtime behavior. Later levels expect production enforcement, outcome records, and tests.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Runtime action policy	Runtime platform owner with governance owner input	Before production use and after policy changes	Each high-impact action class and the expected control: approval, deny-by-default, step-up, budget, circuit breaker, allowlist, or combination	Shows intended policy; does not prove runtime enforcement.
Approval and control matrix	Organization or governance owner	Before production use and after action-class or approver changes	Which action classes require human approval, runtime approval, step-up authorization, or automatic denial	Supports governance review; does not prove approval gates are technically enforced.

AWOSS-RUN-L1-003: Runtime Mediation Capability Summary Level 1

Requirement summary

Identify whether the runtime can allow, deny, pause, request approval for, interrupt, roll back, or record tool and connector actions before and after execution.

Why it exists

Runtime policy depends on what the runtime can actually do. Some systems can mediate before execution, some only log after execution, and some can pause, interrupt, roll back, or contain a session. These differences affect the strength of any claim.
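
A capability summary can be as simple as a set of flags plus a list of the gaps. The sketch below assumes a hypothetical runtime that can only allow actions and log them after execution. The flag names follow the capabilities discussed above, and nothing here describes a real product.

```python
from enum import Flag, auto


class Capability(Flag):
    """Runtime mediation capabilities, one bit per capability."""
    ALLOW = auto()
    DENY = auto()
    PAUSE = auto()
    REQUEST_APPROVAL = auto()
    INTERRUPT = auto()
    ROLL_BACK = auto()
    CONTAIN = auto()
    RECORD_BEFORE = auto()
    RECORD_AFTER = auto()


# Hypothetical runtime that only logs after execution.
LOG_ONLY_RUNTIME = Capability.ALLOW | Capability.RECORD_AFTER


def capability_gaps(available: Capability) -> list:
    """List the names of capabilities the runtime does not provide."""
    return [cap.name for cap in Capability if cap not in available]
```

Calling capability_gaps(LOG_ONLY_RUNTIME) makes the missing pre-execution controls explicit, which is the kind of visibility this requirement asks for.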

Why this level

This belongs at Level 1 because it documents the runtime's control surface. The requirement does not assume all capabilities exist; it requires that the available capabilities and gaps are visible.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Runtime mediation capability summary	Runtime platform owner	Before production use and after runtime changes	Whether the runtime can allow, deny, pause, request approval, interrupt, roll back, contain, and record actions before or after execution	Describes available capabilities; does not prove controls are used correctly.
Runtime architecture note	Runtime platform owner	Before review and after architecture changes	Where policy checks, approval prompts, pre-execution hooks, post-execution logs, and rollback or containment paths sit in the runtime flow	Supports architectural review; does not prove every path is mediated.

AWOSS-RUN-L2-001: High-Impact Approval Gate Level 2

Requirement summary

Require an approval gate before an agent performs a high-impact action that writes to production systems, executes shell commands with broad filesystem or operational impact, sends external communications, changes access controls, exports sensitive data, or commits irreversible or difficult-to-reverse changes.

Why it exists

Some actions should not run merely because a prompt asked for them. Production writes, broad shell commands, external sends, access changes, sensitive-data exports, and hard-to-reverse operations need a deliberate decision before execution.
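
The following sketch shows the shape of an approval gate under simple assumptions: a hypothetical set of high-impact class names, and an ask_approver callback that returns True, False, or None when no decision arrives before expiry. Real approval workflows would add approver roles, scopes, and receipts.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical labels for the high-impact classes listed in the requirement.
HIGH_IMPACT = {
    "production_write", "broad_shell", "external_send",
    "access_change", "sensitive_export", "irreversible_change",
}


@dataclass
class Request:
    action_class: str
    description: str


def run_with_gate(request: Request,
                  execute: Callable[[], str],
                  ask_approver: Callable[[Request], Optional[bool]]) -> str:
    """Run `execute` only if the request is low-impact or explicitly approved."""
    if request.action_class not in HIGH_IMPACT:
        return execute()
    decision = ask_approver(request)  # True, False, or None (no answer before expiry)
    if decision is True:
        return execute()
    # Anything other than an explicit approval leaves the action unexecuted.
    return "rejected" if decision is False else "expired"
```

The design choice worth noting is that a missing or expired decision never falls through to execution.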

Why this level

This belongs at Level 2 because managed production use should have an enforced or configured approval gate for high-impact actions, not only a written policy saying approval is expected.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Approval workflow configuration	Runtime platform owner	Before production use and after approval-rule changes	Triggering action classes, approver roles, approval prompts, expiry, fallback, and denial behavior	Supports review of approval-gate configuration; does not prove every trigger is complete.
Sampled approval receipt	Evidence or audit owner with runtime owner input	During operation and review sampling	Requested action, action class, requester, approver, timestamp, scope, decision, conditions, and execution outcome	Supports review of selected approvals; does not prove the action was safe or legally sufficient.

AWOSS-RUN-L2-002: Runtime Policy Outcome Records Level 2

Requirement summary

Record policy outcomes for high-impact action requests, including allowed, denied, approval required, approved, rejected, expired, canceled, interrupted, rolled back, or rate-limited outcomes where applicable.

Why it exists

Reviewers need to reconstruct what happened when a high-impact action was requested. A record that only shows final execution may miss denials, rejected approvals, expired requests, cancellation, interruption, rollback, or budget enforcement.
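
A record structure that covers every outcome, not only successful execution, might look like the sketch below. The field names (request_id, reason, recorded_at) are assumptions chosen for illustration; the outcome values mirror the list in the requirement summary.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Outcome(Enum):
    ALLOWED = "allowed"
    DENIED = "denied"
    APPROVAL_REQUIRED = "approval_required"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXPIRED = "expired"
    CANCELED = "canceled"
    INTERRUPTED = "interrupted"
    ROLLED_BACK = "rolled_back"
    RATE_LIMITED = "rate_limited"


@dataclass(frozen=True)
class OutcomeRecord:
    """One policy outcome for one high-impact action request."""
    request_id: str
    action_class: str
    outcome: Outcome
    reason: str
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


def outcome_counts(records) -> Counter:
    """Count records per outcome, e.g. for a periodic policy outcome export."""
    return Counter(record.outcome.value for record in records)
```

Because a single request can pass through several states (approval required, then approved, then rolled back), one request_id may legitimately appear in more than one record.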

Why this level

This belongs at Level 2 because production accountability requires repeatable records of high-impact runtime decisions and outcomes.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Denied-action and policy-trigger log	Runtime platform owner or evidence owner	During operation and review sampling	Action request, policy decision, reason category, approval state, rate limit or budget state, interruption or rollback outcome where applicable	Supports runtime decision review; does not prove logs are complete or tamper-resistant.
Policy outcome export	Evidence or audit owner	During periodic review or after incidents	Counts and samples of allowed, denied, approval-required, approved, rejected, expired, canceled, interrupted, rolled-back, and rate-limited outcomes	Supports trend and sampling review; does not prove every action was appropriate.

AWOSS-RUN-L2-003: Stop, Cancel, Rollback, Or Containment Procedure Level 2

Requirement summary

Support emergency stop, session-cancel, rollback, or containment procedures for agent activity that deviates from approved scope, exceeds action budgets, violates allowlists, or matches known tool-abuse patterns.

Why it exists

Runtime control is not only about initial approval. If an agent starts doing something unexpected, exceeds the approved scope, hits a budget limit, uses an unapproved tool path, or matches a known abuse pattern, operators need a way to stop, cancel, roll back, or contain the activity.
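
As one illustration of a budget limit paired with a circuit breaker, the sketch below counts actions in a session and refuses further actions once the budget is spent, until an operator resets it. The class name and the reset behavior are assumptions of this example, and a real containment path would also cover cancel and rollback.

```python
class ActionBudget:
    """Trips a circuit breaker once a session has spent its action budget."""

    def __init__(self, max_actions: int):
        self.max_actions = max_actions
        self.used = 0
        self.tripped = False

    def try_consume(self) -> bool:
        """Return True if the action may proceed; otherwise trip the breaker."""
        if self.tripped or self.used >= self.max_actions:
            self.tripped = True
            return False
        self.used += 1
        return True

    def reset(self) -> None:
        """Operator-initiated reset, intended to follow a review."""
        self.used = 0
        self.tripped = False
```

Once tripped, the breaker stays open even if the caller retries, so resuming requires a deliberate operator step rather than a retry loop.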

Why this level

This is a Level 2 SHOULD because it is a practical production safety capability. Some environments may implement different stop or containment mechanisms, but the expected response path should be visible.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Emergency-stop or session-cancel procedure	Runtime platform owner with operations owner input	Before production use and after procedure changes	Trigger conditions, authorized operators, steps, expected effect, recovery path, and escalation	Shows intended response; does not prove the procedure works under load or across all tools.
Rollback or containment procedure	Runtime platform owner, workspace owner, or operations owner	Before high-impact use and after rollback changes	Rollback scope, containment steps, owner, approval path, limitations, and post-event review expectation	Supports reversibility review; does not prove every action can be reversed.
Budget or circuit-breaker policy	Runtime platform owner	Before production use and after limit changes	Budget limits, thresholds, circuit-breaker triggers, blocked actions, escalation, and reset conditions	Supports control review; does not prove limits cannot be bypassed.

AWOSS-RUN-L3-001: Pre-Execution Runtime Mediation Level 3

Requirement summary

Enforce runtime mediation before high-impact actions execute, including policy checks for tool, connector, shell, workflow, external-service, and sub-agent actions where applicable, rather than relying only on after-the-fact log review.

Why it exists

After-the-fact logging is not enough for high-impact actions. If the action writes to production, sends externally, changes credentials, exports sensitive data, or is hard to reverse, the control should run before execution where the runtime supports it.
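
The difference between mediation and log review can be sketched with a wrapper that runs a policy check before the tool function is called. Everything named here (mediated, PolicyDenied, export_policy, the allowlisted destination) is hypothetical and stands in for whatever hook mechanism a given runtime provides.

```python
import functools


class PolicyDenied(Exception):
    """Raised when the policy check blocks an action before it runs."""


def mediated(action_class, policy_check):
    """Wrap a tool function so `policy_check` runs before every call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not policy_check(action_class, args, kwargs):
                raise PolicyDenied(f"{action_class} blocked before execution")
            return func(*args, **kwargs)
        return wrapper
    return decorator


# Hypothetical policy: exports may only go to an allowlisted destination,
# and the destination must be passed by keyword so the check can see it.
ALLOWED_DESTINATIONS = {"internal-archive"}


def export_policy(action_class, args, kwargs):
    return kwargs.get("destination") in ALLOWED_DESTINATIONS


@mediated("data_export", export_policy)
def export_report(report_id, destination):
    return f"exported {report_id} to {destination}"
```

In this sketch a denied call never reaches export_report, and a call the policy cannot inspect (for example, a destination passed positionally) is denied rather than allowed.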

Why this level

This belongs at Level 3 because pre-execution mediation is a stronger assurance expectation. It reduces reliance on later review and makes policy decisions part of the action path itself.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Pre-execution hook or mediation configuration	Runtime platform owner	Before high-impact production use and after mediation changes	Runtime checks applied before high-impact tool, connector, shell, workflow, external-service, or sub-agent actions execute	Supports mediation review; does not prove every possible execution path is covered.
Mediation path test record	Evidence or audit owner with runtime owner input	During validation or release-driven review	Test scenario, expected pre-execution check, policy decision, outcome, finding, and remediation if needed	Supports selected path validation; does not prove all bypasses are impossible.

AWOSS-RUN-L3-002: Runtime Control Test Cadence Level 3

Requirement summary

Test approval gates, denied-action paths, allowlists, budget limits, circuit breakers, emergency stops, rollback procedures, and critical policy decisions on a recurring or release-driven basis.

Why it exists

Runtime controls can drift as tools, connectors, prompts, workflows, and approval rules change. Testing should show that important policy paths still work after releases, configuration changes, and material scope expansion.
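
A release-driven check can be as small as a test suite that exercises the allowed, denied, and approval-required paths. The sketch below uses Python's standard unittest module against a hypothetical decide function; the tool names and policy are assumptions, and a real suite would call the deployed policy rather than an inline stub.

```python
import unittest

# Hypothetical policy under test.
ALLOWLIST = {"search_docs", "read_ticket"}
APPROVAL_REQUIRED = {"send_external_email"}


def decide(tool: str, approved: bool = False) -> str:
    """Return the policy decision for a requested tool."""
    if tool in APPROVAL_REQUIRED:
        return "approved" if approved else "approval_required"
    return "allowed" if tool in ALLOWLIST else "denied"


class PolicyPathTests(unittest.TestCase):
    def test_allowlisted_tool_is_allowed(self):
        self.assertEqual(decide("search_docs"), "allowed")

    def test_unlisted_tool_is_denied(self):
        self.assertEqual(decide("delete_database"), "denied")

    def test_gate_pauses_without_approval(self):
        self.assertEqual(decide("send_external_email"), "approval_required")

    def test_gate_passes_with_approval(self):
        self.assertEqual(decide("send_external_email", approved=True), "approved")


def run_policy_tests() -> bool:
    """Run the suite and report success, e.g. from a release pipeline step."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PolicyPathTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Passing tests support assurance only for the paths they exercise, which matches the claim limits in the evidence table below.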

Why this level

This belongs at Level 3 because recurring or release-driven testing adds higher assurance for high-impact systems. It is stronger than relying on one-time configuration review or occasional manual inspection.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Runtime control test report	Evidence or audit owner with runtime owner input	Recurring, release-driven, and after material policy changes	Approval-gate tests, denied-action tests, allowlist tests, budget-limit tests, circuit-breaker tests, emergency-stop tests, rollback tests, results, and findings	Supports tested-path assurance; does not prove untested paths are safe.
Tool-call abuse or blocked-action test record	Evidence or audit owner	During validation, red-team, or release review	Abuse scenario, expected policy decision, actual result, log or receipt reference, and remediation status	Supports abuse-path review; does not prove absence of all tool misuse.

AWOSS-RUN-L3-003: Stronger High-Impact Decision Records Level 3

Requirement summary

Provide tamper-evident, independently retained, or separation-controlled records for high-impact runtime decisions.

Why it exists

High-impact runtime records may be needed for incident review, governance, assurance, or dispute resolution. If the same runtime that made a decision can silently alter or delete the record, the evidence may be weak for higher-assurance review.
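
One common way to make records tamper-evident is a hash chain, where each entry's hash covers the decision and the previous entry's hash. The sketch below is a minimal illustration of that technique using SHA-256 from the standard library; it is not a prescribed AWOSS-RUN mechanism, and the record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # Placeholder "previous hash" for the first entry.


def _digest(prev_hash: str, decision: dict) -> str:
    """Hash the decision together with the previous entry's hash."""
    payload = json.dumps(decision, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(chain: list, decision: dict) -> None:
    """Append a decision, linking it to the current head of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"decision": decision, "prev": prev_hash,
                  "hash": _digest(prev_hash, decision)})


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited or removed entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        expected = _digest(prev_hash, entry["decision"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects silent edits only if the latest hash is also retained somewhere the runtime cannot rewrite, which is where independent retention or separation control comes in.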

Why this level

This is a Level 3 SHOULD because stronger record protection improves assurance for high-impact decisions but may require additional log, evidence, or audit infrastructure.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Tamper-evident or independently retained decision log	Evidence or audit owner	During operation and review periods	High-impact runtime decision metadata, retention location, integrity control or independent retention method, and access controls	Supports record-integrity review; does not prove decision correctness.
Separation-controlled evidence export	Evidence or audit owner	During periodic review, incident review, or release checkpoint	Export owner, runtime source, retention owner, access separation, time period, and sample of high-impact decisions	Supports independent review; does not prove all events were captured.

External Mapping Notes

The family-first crosswalk treats AWOSS-RUN as a candidate-control family shaped by runtime interception, authorization, action gates, step-up approval, budget limits, circuit breakers, allowlists, pre-execution hooks, action receipts, central policy decisions, and tool-invocation abuse scenarios.

Relevant source signals include:

EU AI Act signals around prohibited-practice risk reduction and transparency-trigger guardrails can inform restricted action classes and approval-gate tests, but do not establish legal compliance.
CSA AARM contributes the strongest direct signals around interception, authorization, step-up approval, and action receipts, but AARM conformance cannot be claimed without a real runtime implementation and assurance path.
OWASP AISVS contributes budget, circuit-breaker, action-gate, and MCP-check signals, but public AISVS v0.1 material does not prove AISVS conformance.
AIUC-1 contributes comparator signals around authorized scope restrictions, MCP allowlists, and pre-execution hooks, but these signals do not establish AIUC-1 certificate equivalence.
CSA AICM contributes orchestration-control context, but AARM remains the stronger reference for runtime action control.
Five Eyes agentic AI adoption guidance contributes least-privilege, just-in-time credential, central-policy, and human-approval signals.
MITRE ATLAS contributes tool-invocation, command-interpreter, and tool-mediated exfiltration abuse scenarios for testing and review.

Formal Standard Link

Use this guide with the formal AWOSS-RUN candidate requirements. If the guide and the standard draft disagree, the standard draft controls.