Introduction
Why Agentic Work Needs Its Own Workspace Security Profile
AI agents turn ordinary business workflows into cross-boundary systems. This deep dive explains why agentic work needs a scoped workspace security profile around the real workflow: reach, authority, action gates, source trust, context, evidence, and claim boundaries.
The Agent Is No Longer Just Chat
AI agent security is no longer only a question about what happens inside a chat box.
The practical risk starts to change when an agent can work inside the same environment where people do their jobs. A business agent may read project documents, search a shared drive, open a repository, run a shell command, inspect a spreadsheet, draft a customer response, query a ticketing system, use a browser session, call a SaaS connector, update local files, or hand work to another tool. None of those actions has to look dramatic on its own. The security problem comes from the fact that they can happen together, inside one workflow, under some mix of human delegation, runtime policy, workspace permissions, context, memory, and approval rules.
That is a different review problem from asking whether a model produced an acceptable answer. A model review can tell you something about the model or application. It does not, by itself, explain which files the surrounding workspace allowed the agent to read, which commands it could run, which connectors it could call, which identity or account it acted through, which context sources were allowed to steer it, or which evidence remains after the work is done.
Consider a simple internal workflow. A team asks an agent to help prepare a release note. The agent reads planning docs, checks a repository, runs tests, summarizes changed behavior, drafts a ticket update, and prepares a message that a person must approve before sending. The visible output may just be a draft paragraph. The underlying workspace activity is broader: documents were retrieved, source files were inspected, code execution may have happened, a communication draft was prepared, and a human approval boundary mattered.
For a security or risk reviewer, the core question is not only "was the final text good?" It is also:
- What exactly was the agent allowed to reach?
- Whose authority did it use?
- Which actions could create business impact?
- What had to pause for approval?
- Which sources and context could influence the work?
- What logs, receipts, or evidence can reconstruct the action path?
- Which gaps or unsupported paths block stronger claims?
Those questions are about the workspace around the agent. They sit between model safety, endpoint security, identity, application controls, source trust, data handling, logging, validation, and governance. Treating any one of those surfaces as the whole answer tends to lose the shape of the actual workflow.
This is the reason a workspace security profile is useful. A profile can name one bounded agentic workflow, draw its resource boundary, identify owners and authority paths, describe runtime controls, list the tools and sources that can steer the agent, define what evidence should exist, and record what the evidence does not support. The profile does not need to pretend that the whole organization has been solved. It starts with one reviewable boundary.
That is the problem awoss, the Agentic Workspace Security Standard, is exploring as a working draft: how to make agentic work in business workspaces reviewable as a scoped system, with evidence and claim limits that are as important as the controls themselves.
Why Existing Buckets Are Not Enough By Themselves
The existing buckets are useful. They are just not the whole review unit.
AI governance policies, model reviews, application security checks, endpoint controls, identity systems, logging platforms, legal reviews, and management systems each answer part of the problem. A team should not throw them away because agents have arrived. The more practical issue is that agentic work joins those domains inside one workflow.
An agent can inherit a user's workspace permissions, call a connector, read retrieved context, use a tool selected by a manifest, execute a local command, draft an external message, and leave only partial logs behind. That path is not only a model-risk question. It is not only an application-security question. It is not only an endpoint, identity, privacy, or legal question either. It is a boundary question: what system did the organization actually allow to act, and what evidence can show whether that action path was scoped, approved, denied, logged, redacted, reviewed, or excluded?
Existing guidance gives strong anchors for that review. OWASP material helps teams reason about AI application requirements, agentic action, MCP and tool risks, skill-layer source trust, monitoring, human oversight, and testing. CSA AARM is directly relevant when the question is runtime interception, policy evaluation, action decisions, and receipts before action execution. NIST's AI RMF and AI Agent Standards Initiative give risk-management, interoperability, identity, authorization, and ecosystem direction. ISO/IEC 42001 gives organizations an AI management-system frame. MITRE ATLAS helps threat-model adversarial behavior against AI-enabled systems. AIUC-1 is a useful market-facing comparator for AI agent scoping and evidence-model thinking. Government and legal frameworks, including the EU AI Act, can impose real obligations that a workspace security profile must not pretend to satisfy by itself.
Those sources are complements, not competitors. The gap appears when a reviewer tries to stitch them together for one ordinary agentic workspace.
A governance policy may say who is responsible for AI use, but it may not show which local files, repositories, SaaS connectors, shell commands, or memory stores the agent could reach during a task. A model or application review may assess prompts, outputs, evaluations, or deployment architecture, but still miss delegated authority inside a user's real workspace. Endpoint controls and identity controls can restrict accounts, devices, processes, and sessions, but they may not explain which retrieved documents or installed skills were trusted enough to steer the agent. Logs and transcripts may show what the user saw or what the model answered, while leaving unclear whether a high-impact action was allowed by policy, paused for approval, denied, rolled back, or routed through an unsupported path.
That is why the phrase "by themselves" matters. Existing buckets can be strong inside their own scope and still leave the agentic-workspace boundary hard to review. A team can have an AI policy, an approved model, endpoint management, single sign-on, ticket logs, and legal guidance, yet still struggle to answer a simple operational question: for this agent-assisted workflow, what could the agent observe, invoke, change, send, retain, and prove?
awoss should complement the existing landscape by making that scoped boundary reviewable. The candidate profile should ask for the connective evidence that falls between the buckets: a named workflow, in-scope and out-of-scope resources, authority and approval paths, source and context inventories, runtime policy, action receipts, sensitive-data handling, validation findings, exceptions, and claim limits.
That does not make awoss a replacement for OWASP, CSA, NIST, ISO, MITRE, AIUC-1, government guidance, legal obligations, privacy duties, security programs, or management systems. It makes awoss a possible way to organize the agentic workspace question before stronger claims are made. If the evidence does not show runtime receipts, the profile should say so. If legal classification is unresolved, the profile should say so. If a source, context path, connector, or local tool is outside the review boundary, the profile should say so. If a management system or certification program is needed for a separate claim, the profile should point to that dependency rather than absorb it.
The practical goal is modest and useful: keep each existing bucket doing its job, then add a reviewable workspace profile around the agentic workflow that crosses them.
The Missing Unit: A Scoped Agentic Workspace System
The missing review unit is not the model. It is not the prompt. It is not one tool, one connector, one skill, one laptop, one SaaS application, or one runtime product.
The useful unit is the scoped agentic workspace system.
That phrase is intentionally narrow and wide at the same time. It is wider than one agent component, because a real workflow can depend on the runtime, workspace permissions, local files, repositories, SaaS connectors, shell access, identity, approval paths, retrieval sources, memory behavior, logs, and governance decisions. It is narrower than "all AI use in the company," because no first review should pretend to cover every chatbot, experiment, vendor feature, workflow, team, and data source at once.
A scoped agentic workspace system starts with one named boundary. For example, an internal release-assistant workflow might include one approved runtime, one repository group, one documentation folder, one ticketing project, one communication draft path, a specific human approval step, and a defined set of logs and evidence records. The same organization may run many other AI tools. Those tools are not automatically in scope just because they exist nearby.
That boundary is what makes the rest of the review possible.
The scope record should answer basic questions before anyone argues about levels, controls, maturity, or claims:
- What workflow or workspace profile is being reviewed?
- Which runtimes, tools, skills, connectors, repositories, files, SaaS systems, shells, data categories, context sources, retrieval sources, memory stores, identities, and approval paths are in scope?
- Which adjacent systems, roles, resources, data sources, context sources, and action classes are explicitly out of scope?
- Who owns the workspace or endpoint boundary?
- Who owns the runtime and tool configuration?
- Who owns source, connector, or skill review?
- Who owns governance decisions, exceptions, and claim language?
- Who can produce the evidence packet when a reviewer asks what happened?
The exclusions matter as much as the inclusions. If the profile covers draft ticket updates but not production deployment, say so. If it covers one repository group but not the whole engineering organization, say so. If it covers a managed connector but not a user's personal browser session, say so. If it covers generated summaries but not customer-record exports, say so. A clear exclusion is not a weakness by itself. A hidden exclusion becomes a claim problem.
Ownership has to be explicit because agentic workspace security crosses team boundaries. The endpoint or workspace owner may control files, local app permissions, sandboxing, and device policy. The runtime owner may control tools, approval gates, action classes, and policy decisions. The source owner may control skills, scripts, connectors, prompts, and updates. The governance owner may decide whether a workflow is approved, what exceptions are accepted, and what wording is allowed. The evidence owner may be the only person who can assemble inventories, receipts, validation notes, exception records, and redacted review packets without leaking sensitive content.
That owner model prevents a common failure: every team assumes another team accepted the risk. The runtime team assumes the workspace owner limited file access. The workspace owner assumes the governance owner accepted the workflow. The governance owner assumes evidence exists. The evidence owner discovers that logs are partial, approvals are informal, and no one recorded which resources were excluded. A scoped profile makes those handoffs visible early.
In the awoss working draft, this is the job of AWOSS-SCP and AWOSS-GOV. AWOSS-SCP is the scope, inventory, and ownership family: it names the system, lists what is in scope, records explicit exclusions, and assigns responsibility for the main parts of the boundary. AWOSS-GOV is the governance and claim family: it records who accepts the boundary, who handles exceptions, who triggers reassessment, and who keeps statements from saying more than the evidence supports.
For an early readiness review, the first artifact does not need to be large. It can be a short profile record with:
- system name and review date
- intended workflow or use case
- in-scope runtimes, tools, connectors, repositories, files, applications, data categories, context sources, identities, and approval paths
- explicit out-of-scope systems, data sources, roles, and action classes
- owner matrix for workspace, runtime, source, governance, and evidence
- evidence owner list for inventories, policies, receipts, validation notes, exceptions, and claim-limit records
- known assumptions, open gaps, and unsupported paths
- allowed internal wording and prohibited stronger claims
That record does not prove the system is safe. It gives the review something honest to point at. It says: this is the system we mean, this is what we know about it, this is who owns each part, this is what we did not review, and this is the boundary beyond which stronger claims stop.
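As a minimal sketch, the short profile record described above could be held in a small structure that a reviewer can check for empty fields and unassigned owners. The field names, the `REQUIRED_OWNER_ROLES` tuple, and the `missing_fields` helper are illustrative assumptions, not an awoss schema.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ProfileRecord:
    # Hypothetical field names that mirror the record list; not an awoss schema.
    system_name: str = ""
    review_date: str = ""
    intended_workflow: str = ""
    in_scope: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)
    owners: dict[str, str] = field(default_factory=dict)
    evidence_owners: dict[str, str] = field(default_factory=dict)
    assumptions_and_gaps: list[str] = field(default_factory=list)
    allowed_wording: list[str] = field(default_factory=list)
    prohibited_claims: list[str] = field(default_factory=list)


# Owner roles taken from the owner matrix item in the record list.
REQUIRED_OWNER_ROLES = ("workspace", "runtime", "source", "governance", "evidence")


def missing_fields(record: ProfileRecord) -> list[str]:
    """Return names of empty fields plus any owner role with no named owner."""
    # An empty value is treated as "not yet recorded", not as "nothing to record".
    missing = [f.name for f in fields(record) if not getattr(record, f.name)]
    missing += [
        f"owners.{role}"
        for role in REQUIRED_OWNER_ROLES
        if not record.owners.get(role)
    ]
    return missing
```

A check like this only shows that the record has been filled in. It says nothing about whether the entries are accurate or complete.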
The scoped system is the place where the earlier questions become answerable. What could the agent reach? Look at the inventory and exclusions. Whose authority did it use? Look at the identity and approval paths. Which sources and context could steer it? Look at the source and context records. What evidence can reconstruct the action path? Ask the evidence owner for the packet. Which gaps block stronger claims? Read the exception and claim-limit records.
Without that unit, agent security discussions drift back into abstractions. With it, a team can start with one bounded workflow, identify what is known, make the unknowns visible, and decide whether the next step is remediation, validation, governance review, or a narrower claim.
Five Failure Modes A Workspace Profile Has To Make Visible
A scoped agentic workspace system is useful because it turns vague concern into reviewable failure modes.
That does not mean the profile fixes every failure by existing. It means the profile should make the failure hard to hide. If the team cannot say what the agent can reach, whose authority it uses, which policy governs high-impact actions, which sources can steer behavior, or whether evidence packets expose new sensitive content, the right answer is not a stronger claim. The right answer is a visible gap.
Five failure modes are especially common in agentic workspace reviews.
1. Unknown Reach
Unknown reach is the failure mode where nobody can clearly say what the agent can access.
The visible workflow may be small: summarize a document, update a ticket, run a test, draft a message. The available reach may be much larger: local folders, synced drives, repositories, browser sessions, shell commands, SaaS connectors, network locations, package managers, screenshots, logs, and generated files. If the inventory is missing, the review is forced to guess.
That guess is dangerous because agentic work can combine access paths. A runtime may look narrow until it inherits a user's working directory. A connector may look read-only until a downstream action can modify records. A shell command may look local until it reads environment variables, traverses a mount, opens a browser profile, or calls a network endpoint. A browser session may look like a convenience until it reaches authenticated applications outside the intended scope.
The workspace profile should make the reachable boundary visible. At minimum, it should point to a reachable-resource inventory, execution-capability inventory, connector or application scope, network or egress note, and known exclusion list. It should also say where the boundary is only described rather than technically enforced.
The relevant awoss anchors are AWOSS-SCP and AWOSS-WSB: scope, inventory, ownership, workspace resources, execution capabilities, boundary controls, and explicit exclusions.
The claim limit is simple: an inventory supports reachability review. It does not prove the boundary is complete, enforced, escape-proof, least privilege, or safe for every workflow.
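One way to picture a reachable-resource inventory is as a lookup that classifies a path as excluded, in scope, or unknown, and that separates enforced boundaries from described ones. The sketch below assumes POSIX-style paths and invents the `ReachEntry` type, the `classify_path` function, and the result labels for illustration.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class ReachEntry:
    root: str       # path prefix the agent is expected to reach
    enforced: bool  # True if technically enforced, False if only described


def classify_path(path: str, inventory: list[ReachEntry], exclusions: list[str]) -> str:
    """Classify a path against the inventory; exclusions win over inclusions."""
    # Pure path comparison only: this does not resolve symlinks or ".." segments,
    # so it illustrates the record, not an enforcement mechanism.
    target = PurePosixPath(path)
    for root in exclusions:
        if target.is_relative_to(root):
            return "excluded"
    for entry in inventory:
        if target.is_relative_to(entry.root):
            return "in-scope-enforced" if entry.enforced else "in-scope-described-only"
    return "unknown-reach"
```

The "unknown-reach" result is the interesting one for a reviewer: it marks a path that the inventory neither includes nor excludes.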
2. Ambiguous Authority
Ambiguous authority is the failure mode where the agent's actions cannot be clearly tied to a user, role, workflow, service account, shared account, system identity, agent identity, or approval path.
This matters because the same visible action can mean different things depending on authority. A draft created under a user's session is different from a production change made through a service account. A ticket update made by a named workflow is different from a connector action that appears as a shared integration user. An external message prepared for human approval is different from a message sent automatically through delegated authority.
When authority is ambiguous, accountability becomes ambiguous too. The workspace owner may think the runtime owner limited the agent. The runtime owner may think identity policy limited the account. The business owner may think a human was approving high-impact actions. The evidence owner may later find records that show an action occurred but not who delegated it, whose authority it used, whether the approver was separate, or whether the authority was broader than the reviewed purpose.
The profile should require a plain identity and authority model. It should distinguish ordinary user delegation, administrative authority, production authority, sensitive-data authority, external-communication authority, service or workflow authority, and any shared-account path. It should also identify delegator and approver roles, especially where the same person or system can request, approve, and execute a high-impact action.
The relevant awoss anchors are AWOSS-DEL and AWOSS-RUN: delegated authority, identity model, high-authority action classes, approver roles, runtime control mapping, and approval receipts.
The claim limit: an authority model explains who or what appears to act. It does not prove least privilege, independence, legal responsibility, separation-of-duties sufficiency, or correct runtime enforcement unless supporting policy, identity, and receipt evidence exists.
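The request, approve, and execute concern can be illustrated with a small check over action records. This is a sketch under stated assumptions: the `ActionRecord` fields, the flag names, and the idea that identities are comparable strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionRecord:
    action_class: str        # hypothetical label, e.g. "send-external-message"
    requester: str           # identity that asked for the action
    approver: Optional[str]  # identity that approved it, if any was recorded
    executor: str            # identity the action actually ran under


def authority_flags(rec: ActionRecord, high_impact: set[str]) -> list[str]:
    """Flag high-impact actions with a missing or non-separate approver."""
    flags: list[str] = []
    if rec.action_class not in high_impact:
        return flags
    if rec.approver is None:
        flags.append("no-approver-recorded")
    elif rec.approver in (rec.requester, rec.executor):
        flags.append("approver-not-separate")
    return flags
```

A clean result here means only that the recorded identities differ. It does not show that the approver was independent in practice.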
3. Action Without Reviewable Policy
Action without reviewable policy is the failure mode where the agent can do something important, but the allow, deny, pause, approval, stop, rollback, or record path is missing or hard to reconstruct.
This is where agentic work becomes operational risk. The important question is not only whether a policy document says "humans approve risky actions." The question is what actually happens when the agent asks to run a broad shell command, write to production, send an external message, export sensitive data, change access, spend money, delegate to another agent, or invoke a connector with side effects.
If the policy is not reviewable, the team may not know which actions are high impact, which actions are denied by default, which actions require approval, what happens when approval expires, whether a user can bypass the gate, how an action can be stopped, whether rollback was possible, or whether the denied path left a receipt. A transcript may show the final text. It may not show the runtime decision path.
The profile should make the action policy inspectable. That means naming high-impact action classes, mapping those classes to required controls, describing the runtime mediation capabilities, and preserving receipts for policy outcomes. For production or high-impact workflows, it should also connect the policy to validation evidence: approval-gate tests, denied-action tests, rollback or emergency-stop exercises, finding records, and retest triggers.
The relevant awoss anchors are AWOSS-RUN, AWOSS-LOG, and AWOSS-VAL: runtime policy, approval gates, policy-outcome records, high-impact action receipts, reconstruction, tests, findings, and retest evidence.
The claim limit: reviewable policy and receipts can support selected runtime control evidence. They do not prove every action path is mediated, every approval is appropriate, every rollback succeeds, every downstream side effect is reversible, or every abuse case has been tested.
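A reviewable action policy can be pictured as a mapping from action classes to outcomes, with a receipt written for every decision, including denials. The sketch below assumes a deny-by-default rule for unclassified actions; the action class names, `POLICY_VERSION`, and receipt fields are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    ALLOW = "allow"
    DENY = "deny"
    PAUSE_FOR_APPROVAL = "pause_for_approval"


# Hypothetical action classes and rules, for illustration only.
POLICY = {
    "read-in-scope-file": Outcome.ALLOW,
    "send-external-message": Outcome.PAUSE_FOR_APPROVAL,
    "write-production": Outcome.DENY,
}
POLICY_VERSION = "example-1"


def decide(action_class: str, receipts: list[dict]) -> Outcome:
    """Return the policy outcome and append a receipt for it."""
    # Assumption: anything the policy never classified is denied by default.
    outcome = POLICY.get(action_class, Outcome.DENY)
    receipts.append({
        "action_class": action_class,
        "outcome": outcome.value,
        "policy_version": POLICY_VERSION,
        "classified": action_class in POLICY,
    })
    return outcome
```

Recording the policy version and whether the class was known lets a later reviewer tell a deliberate denial apart from a default one.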
4. Hidden Sources And Context
Hidden sources and context is the failure mode where the agent is steered by inputs the review does not see or rank.
Agents are not influenced only by the latest user prompt. They may use system instructions, project rules, skill instructions, prompt packs, tool descriptions, retrieved documents, search results, connector responses, memory, conversation history, handoff notes, generated summaries, source code, package metadata, and tool outputs. Some of those sources should be able to set policy. Some should only provide data. Some should be lower trust. Some should be excluded from high-impact workflows.
When this is unclear, a retrieved document can quietly act like an instruction. A stale memory can carry an old decision into a new workflow. A tool output can tell the agent to ignore a boundary. A connector update can add a new behavior. A prompt pack or skill can drift from the reviewed version. A handoff can move private context into a place where another agent treats it as trusted.
The profile should make source and context trust visible. It should include an action-unit inventory, source register, origin or version record, permission and capability declarations, context-source inventory, instruction precedence rules, prohibited context storage locations, memory-write policy where applicable, and clean-context or lower-trust override tests for high-impact workflows.
The relevant awoss anchors are AWOSS-SRC and AWOSS-CTX: source trust, permissions, provenance, drift, instruction precedence, memory and retrieval boundaries, lower-trust override controls, and context-boundary validation.
The claim limit: source and context records help reviewers understand what can steer the agent. They do not prove every hidden supplier input is visible, every prompt-injection path is blocked, every dependency is safe, every memory record is clean, or every context conflict is handled correctly.
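Instruction precedence can be sketched as a trust tier per source kind, with a threshold below which content is treated as data only. The tier values, source kind names, and threshold below are assumptions chosen for illustration.

```python
# Hypothetical trust tiers; a real profile would define its own source kinds.
TRUST_TIERS = {
    "system-instruction": 3,
    "project-rule": 2,
    "user-prompt": 2,
    "retrieved-document": 1,
    "tool-output": 1,
    "memory": 1,
}
MIN_TIER_TO_INSTRUCT = 2  # assumption: only tier 2 and above may set instructions


def may_instruct(source_kind: str) -> bool:
    """Unknown source kinds get tier 0, so they can never instruct."""
    return TRUST_TIERS.get(source_kind, 0) >= MIN_TIER_TO_INSTRUCT


def split_context(items: list[tuple[str, str]]) -> tuple[list[str], list[str]]:
    """Separate (source_kind, text) pairs into instructions and data-only text."""
    instructions: list[str] = []
    data_only: list[str] = []
    for source_kind, text in items:
        (instructions if may_instruct(source_kind) else data_only).append(text)
    return instructions, data_only
```

This only records which sources are permitted to steer the agent. It does not detect or block an injected instruction inside lower-trust text.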
5. Evidence Becomes The Next Data Leak
Evidence leakage is the failure mode where the review packet becomes a new place where sensitive material is copied, retained, or shared.
This risk is easy to miss because evidence sounds responsible. A team wants to show prompts, traces, screenshots, transcripts, tool outputs, connector responses, source files, approval messages, DLP events, logs, and generated summaries. Those artifacts can be useful. They can also contain secrets, session material, customer records, private source code, personal data, confidential business details, hidden instructions, full document text, or unnecessary payloads that the reviewer does not need.
The profile should prefer redaction-safe evidence. In many cases, the useful artifact is not the raw content. It is a stable event ID, timestamp, action class, actor or system identity, policy outcome, approval state, resource scope, source version, sensitive-data handling outcome, validation finding, hash, protected reference, or short derived summary. Raw payloads should stay in protected evidence stores only when they are truly needed by an authorized reviewer.
This matters for claim language too. If the evidence packet cannot be shared without copying sensitive content, the profile should say so. If the review depends on protected references instead of raw artifacts, the profile should say so. If logs intentionally omit payloads to avoid creating a second leak, the profile should say so. A safe omission with a documented derived receipt is usually better than a complete-looking packet that spreads private material.
The relevant awoss anchors are AWOSS-SEC, AWOSS-LOG, and AWOSS-GOV: sensitive-location inventory, prohibited storage, sensitive-export classification, redaction policy, sensitive-safe receipts, protected evidence references, retention rules, exception records, and claim limits.
The claim limit: redaction-safe evidence supports review without unnecessary payload exposure. It does not prove the underlying data handling is legally sufficient, all secrets were detected, all historical logs are clean, all reviewers have appropriate access, or every sensitive path is controlled.
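A redaction-safe receipt can keep the metadata a reviewer needs while replacing the raw payload with a hash and a size. The following is a sketch only; the `make_receipt` function and its field names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_receipt(event_id: str, action_class: str, actor: str,
                 policy_outcome: str, payload: bytes) -> dict:
    """Build a receipt that references the payload without copying it."""
    # Caveat: a plain hash of a short or guessable payload can be brute-forced,
    # so some teams may prefer a keyed hash or a protected reference instead.
    return {
        "event_id": event_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action_class": action_class,
        "actor": actor,
        "policy_outcome": policy_outcome,
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
        "payload_bytes": len(payload),
    }


if __name__ == "__main__":
    receipt = make_receipt(
        "evt-1", "send-external-message", "workflow:release-assistant",
        "pause_for_approval", b"draft text that stays in the protected store",
    )
    print(json.dumps(receipt, indent=2))
```

The raw payload stays wherever it is already protected. The receipt lets an authorized reviewer confirm later that a stored artifact matches what the agent handled.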
Why These Five Matter
These failure modes are connected. Unknown reach makes policy incomplete. Ambiguous authority makes logs hard to interpret. Hidden context can trigger actions that the policy never classified. Evidence leakage can turn a careful review into a new exposure. A scoped profile gives the team one place to name those gaps without pretending they are already fixed.
That is the practical value of the workspace security profile: not a public badge, not a blanket assurance statement, and not a replacement for other controls. It is a disciplined way to ask, for one bounded agentic workflow: what can it reach, whose authority does it use, what happens before important actions, what can steer it, what evidence exists, and what does that evidence not support?
What A Workspace Security Profile Looks Like
After the failure modes are visible, the next question is practical: what does a useful workspace security profile actually look like?
It should not start as a public badge or a broad statement about every agent in the organization. It should start as a small review packet for one scoped agentic workflow. The packet should let a reviewer see what was included, what was excluded, what evidence exists, what evidence is missing, what findings remain open, and what claims the packet does not support.
In other words, the first useful profile is closer to an evidence brief than a marketing asset. It is a way to make a bounded workflow inspectable.
Start With One Scoped Workflow
A workspace security profile starts by naming the review unit.
That review unit should be narrow enough that the team can draw a real boundary. A useful example might be: an internal agentic workflow that reads a specific project folder, uses a defined repository, runs tests in a controlled environment, drafts a ticket, and prepares a message for human approval before sending. The profile does not need to cover every AI tool, every employee, or every future workflow in the company.
The first page should answer simple questions:
- What workflow is being reviewed?
- Which runtime, workspace, repository set, connector set, memory source, or tool boundary is in scope?
- Which files, applications, communication channels, network paths, and external services are out of scope?
- Who owns the workspace boundary, runtime policy, source and context records, evidence packet, governance decisions, and claim approval?
- Which review period, release, event, or configuration snapshot does the packet cover?
The profile should also declare an internal posture. That might be a narrow Level 1-style review, selected Level 2-style slices, selected Level 3-style slices, or simply "not ready." Those words should stay internal and careful at this stage. They are a planning shorthand for review depth, not a public assurance level.
Build A Redaction-Safe Evidence Packet
The core artifact is a redaction-safe evidence packet.
The packet should preserve the structure needed for review without copying raw sensitive content by default. Reviewers usually do not need full prompts, complete customer records, secrets, raw transcripts, browser session material, source documents, screenshots, or connector payloads in the first packet. They need stable references, categories, policy decisions, owners, timestamps, receipt IDs, hashes where useful, source versions, sensitive-data handling outcomes, findings, retest status, and protected-reference paths.
One simple packet shape could look like this:
- profile.yaml: packet ID, scoped workflow, review period, target internal posture, owners, version, and explicit non-claims.
- scope.md: in-scope and out-of-scope resources, review assumptions, exclusions, inherited controls, and known boundary gaps.
- action-units.json: agent runtimes, skills, tools, connectors, workflows, source versions, permissions, declared capabilities, and source-trust state.
- context-graph.json: context sources, retrieval stores, memory sources, instruction precedence, trust tiers, and clean-context assumptions.
- runtime-policy.md: high-impact action classes, allow and deny rules, approval gates, interrupt and rollback expectations, unsupported actions, and policy-change history.
- receipts.jsonl: redaction-safe event receipts for relevant requests, decisions, approvals, denials, actions, sensitive-data handling, and evidence exports.
- validation-report.md: tests, sampled evidence checks, reconstruction exercises, findings, retests, residual risks, and limitations.
- exceptions.md: unsupported paths, missing evidence, accepted internal-review exceptions, owners, expiry dates, remediation paths, and claim impacts.
- claim-language.md: allowed internal wording, prohibited public wording, and approval gates before reuse outside the internal review.
The exact file names are less important than the discipline. A packet should make the chain of review visible: boundary, authority, action policy, source trust, context trust, sensitive-data handling, receipts, validation, exception handling, and claim limits.
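If a team does adopt the example file names, a small completeness check is easy to sketch. The `missing_packet_files` helper is hypothetical, and it takes a plain list of names so it makes no assumption about where the packet is stored.

```python
# File names from the example packet shape; a team may choose different ones.
EXPECTED_FILES = (
    "profile.yaml",
    "scope.md",
    "action-units.json",
    "context-graph.json",
    "runtime-policy.md",
    "receipts.jsonl",
    "validation-report.md",
    "exceptions.md",
    "claim-language.md",
)


def missing_packet_files(present_names) -> list[str]:
    """Return expected packet files that are absent, in the expected order."""
    present = set(present_names)
    return [name for name in EXPECTED_FILES if name not in present]
```

For a packet kept in a local directory, the input could come from `os.listdir`. A file being present shows only that the artifact exists, not that its contents support any claim.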
Keep Unsupported Paths Visible
Unsupported paths should be first-class review objects.
An unsupported path is not automatically a disaster. It may be a connector that does not emit enough receipts, a shell mode that cannot be safely mediated yet, a browser action that cannot be reconstructed, a memory store with unclear retention, a source dependency that cannot be pinned, a sensitive-data flow that requires protected review, or a production change path that needs a separate approval model.
The problem is not that an unsupported path exists. The problem is when it is hidden.
A useful profile should name unsupported paths in scope.md, runtime-policy.md, validation-report.md, or exceptions.md. For each one, the packet should record what is unsupported, which AWOSS-* family is affected, who owns the gap, what evidence exists, what evidence is missing, whether the path is blocked, limited, or accepted for internal review only, and what claim impact follows.
This changes the conversation. Instead of asking whether the workflow is "secure," the team can ask a better question: which parts are supported by evidence, which parts are unsupported, and which parts block stronger wording?
Treat Findings As Useful Outputs
Findings are not a sign that the profile failed. They are one of the main reasons to build the profile.
A finding should be concrete enough to improve the scoped workflow. It should identify the affected family, the affected artifact, severity, evidence reference, owner, remediation path, retest method, retest date, residual risk, and claim impact. A finding that says "logging incomplete" is less useful than one that says: "High-impact approval receipts do not preserve policy version or approver role for this workflow, which limits reconstruction and blocks any internal wording stronger than sampled review input until retested."
Good findings also separate control gaps from evidence gaps. A workflow may have a real approval gate but weak receipts. Another workflow may have good logs but no enforced denial path. A third may have source records but no context-drift test. The profile should distinguish those cases so remediation does not become generic.
This is where AWOSS-VAL matters. Validation should not only test whether a happy path worked. It should test denied actions, boundary attempts, approval-gate behavior, source or context drift, redaction behavior, reconstruction quality, and whether prior findings were actually retested.
Convert Material Gaps Into Exceptions
Some gaps will not be fixed before the packet is reviewed. The profile should not bury them in prose.
Material gaps should become exception records with owners, rationale, expiry or review dates, remediation paths, reassessment triggers, and claim impact. An exception can say that a limitation is accepted for internal review only. It cannot turn missing evidence into proof.
For example, if a runtime cannot emit authoritative receipts for a specific connector, the packet can record that as a missing-receipt exception. It can link to whatever metadata exists, name the runtime owner, describe the temporary control, set a remediation target, and say which claim is limited. It should not imply complete runtime mediation.
If a sensitive-data class cannot be safely copied into the review packet, the packet can use a protected evidence reference and a reviewer access path. It should not copy the raw sensitive content just to make the packet look complete.
If a source package cannot be pinned or externally verified, the packet can record the local review state, version notes, allowed runtime scope, and drift monitoring plan. It should not treat unverifiable source trust as high confidence or strong evidence.
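The missing-receipt case above can be sketched as an exception record with a review date. This is an assumption-laden illustration: the class, the field names, and the date are placeholders, not values taken from the working draft.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ExceptionRecord:
    # Hypothetical field names based on the list in the text.
    title: str
    owner: str
    rationale: str
    review_by: date
    remediation_path: str
    reassessment_triggers: tuple
    claim_impact: str
    accepted_for: str = "internal review only"


def is_due_for_review(record, today):
    """True once the review date has been reached or passed."""
    return today >= record.review_by


missing_receipts = ExceptionRecord(
    title="Missing-receipt exception for one connector",
    owner="runtime owner",
    rationale="Runtime cannot yet emit authoritative receipts for this connector",
    review_by=date(2030, 1, 1),  # placeholder date for illustration
    remediation_path="Add receipt emission, then retest",
    reassessment_triggers=("connector update", "scope change"),
    claim_impact="Packet must not imply complete runtime mediation",
)

print(is_due_for_review(missing_receipts, date(2029, 12, 31)))  # prints False
print(is_due_for_review(missing_receipts, date(2030, 1, 1)))    # prints True
```

The record carries the claim impact alongside the acceptance, so the exception cannot quietly read as proof.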
Put Claim Limits In The Packet Itself
The packet should end with claim language rather than leaving it for later.
An internal profile can support careful wording such as: this packet maps a bounded workflow to selected AWOSS-* candidate families; it supports review of selected controls; it identifies unsupported paths, findings, exceptions, and claim limits; it is ready for internal review; or it is ready for internal review with material exceptions.
It should also state what does not follow. The packet does not prove awoss certification, conformance, legal compliance, production safety, complete agent security, complete logging, complete runtime mediation, complete source-trust coverage, complete sensitive-data protection, external-framework equivalence, endorsement, partner backing, or sponsor support.
This is not just legal caution. It is part of the security model. Agentic workspace reviews can become misleading when evidence, exceptions, and claims are separated. The evidence packet should say what it supports, what it does not support, and which unresolved gaps would need to change before stronger wording could be considered.
What The Profile Gives The Team
A good workspace security profile gives the team a reviewable object.
It lets the platform owner see runtime limits. It lets the workspace owner see reachability. It lets the source owner see action-unit and dependency state. It lets the security owner see sensitive-data handling and redaction behavior. It lets the evidence owner see whether receipts and reconstruction are useful. It lets the governance owner see open exceptions, reassessment triggers, and claim boundaries.
Most importantly, it keeps the scope honest. A small packet with visible limitations is more useful than a broad security statement that no one can reconstruct. The profile does not need to prove everything. It needs to make the reviewed workflow, evidence, findings, exceptions, and claim limits clear enough that the next decision is better informed than the last one.
The Ten AWOSS Families In One Pass
The profile-packet model becomes easier to use when the ten AWOSS-* families are treated as ten review questions.
They are not meant to replace the working draft. They are a map for reading it. Each family points at one part of the agentic workspace problem: scope, authority, boundaries, runtime decisions, source trust, context, sensitive data, receipts, validation, and governance.
The point is not to memorize family codes. The point is to keep the review from collapsing into one vague question like "is the agent secure?" A useful workspace profile asks narrower questions and collects narrower evidence.
AWOSS-SCP: What Exactly Is Inside The Review Boundary?
AWOSS-SCP is the scope family. It asks what system is being reviewed before anyone discusses evidence or claims.
For a workspace profile, this means naming the workflow, runtime, tools, connectors, repositories, files, applications, context sources, memory sources, human roles, approvers, administrators, evidence owners, and exclusions. It also means saying what a reviewer might wrongly assume is included but is actually outside the packet.
Useful artifacts here are simple: a scope record, owner matrix, intended-use note, in-scope and out-of-scope list, basic component map, and boundary-change notes where the profile is used more than once.
That gives the review a target. It still does not prove that access is enforced, actions are authorized, logs are complete, or controls are effective.
AWOSS-DEL: Whose Authority Is The Agent Using?
AWOSS-DEL is the authority family. It asks who gave the agent permission to act, what identity the agent appears to use, and who approves high-impact actions.
This matters because agent work can look like user activity, workflow activity, service-account activity, shared-account activity, or system activity. A reviewer needs to know which one applies. A draft message prepared for approval is different from a message sent automatically. A test command in a sandbox is different from a production change made through a privileged identity.
For evidence, look for an identity and authority model, high-authority action-class list, delegator and approver role matrix, service-account or workflow-identity record, and sampled high-impact approval receipts.
Those records explain how authority is supposed to work. By themselves, they do not prove least privilege, separation of duties, or runtime enforcement.
AWOSS-WSB: What Rooms Does The Agent Have Keys To?
AWOSS-WSB is the workspace and execution boundary family. It asks what the agent can reach and what keeps it from crossing into places or actions that were not approved.
The "rooms" can be local folders, repositories, browser sessions, SaaS apps, shared drives, shells, scripts, package managers, network locations, connectors, hosted services, and generated files. The same workflow can read in one room, write in another, execute in a third, and transmit from a fourth. The boundary has to make those differences visible.
Reviewers will usually want a reachable-resource inventory, execution-capability inventory, sandbox or environment profile, repository or connector scope, network or egress note, known-exclusion list, denied-action records, and boundary validation report for higher-impact use.
That material supports reachability and containment review. It should not be read as proof that every bypass path was found or that the workspace is escape-proof.
AWOSS-RUN: What Must Pause, Ask, Stop, Or Roll Back Before Impact?
AWOSS-RUN is the runtime policy family. It asks what happens at the moment an agent tries to act.
Inventories and source reviews are not enough if the runtime can still perform high-impact actions without a reviewable decision. The profile should show which action classes are high impact, which are denied by default, which require approval, which can be interrupted, which can be rolled back, and which are unsupported.
Evidence usually comes from a high-impact action taxonomy, runtime action policy, approval-gate configuration, denied-action log, policy-outcome receipt, emergency-stop or session-cancel procedure, rollback note, and tests for critical policy paths.
That can support review of selected action-control paths. It does not mean every action is mediated, every approval is appropriate, or every downstream effect is reversible.
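A runtime action policy of this kind can be sketched as a lookup that denies anything it does not recognize. The action-class names and outcome labels below are hypothetical, and the sketch only models the decision record, not real enforcement inside a runtime.

```python
from enum import Enum


class Outcome(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"
    UNSUPPORTED = "unsupported"


# Hypothetical action classes for one bounded workflow.
POLICY = {
    "read_project_docs": Outcome.ALLOW,
    "send_external_message": Outcome.REQUIRE_APPROVAL,
    "production_change": Outcome.DENY,
    "unmediated_shell": Outcome.UNSUPPORTED,
}


def decide(action_class):
    """Return a reviewable decision record; unlisted action classes are denied by default."""
    outcome = POLICY.get(action_class, Outcome.DENY)
    return {
        "action_class": action_class,
        "outcome": outcome.value,
        "listed_in_policy": action_class in POLICY,
    }


print(decide("send_external_message")["outcome"])  # prints require_approval
print(decide("delete_repository"))
# prints {'action_class': 'delete_repository', 'outcome': 'deny', 'listed_in_policy': False}
```

Recording whether the action class was listed at all helps a reviewer tell an explicit denial from a default one.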
AWOSS-SRC: Which Reusable Tools And Connectors Can Act?
AWOSS-SRC is the source-trust family. It asks where the reusable pieces came from, what they can do, which version is approved, and what happens when they change.
Agentic work often depends on reusable action units: skills, tools, connectors, plugins, prompt packs, scripts, packages, model components, integrations, and supplier-provided components. Those units can request permissions, make external calls, change behavior through updates, or expand a dependency chain. A profile should not treat them as invisible plumbing.
Good review material includes an action-unit inventory, source register, publisher or maintainer record, permission declaration, dependency manifest, approved version or commit record, update review, drift comparison, and rollback or retirement path for unsafe or unsupported components.
Those records help with origin, permission, version, and change review. They do not prove every component is safe, every dependency is trustworthy, or any registry or marketplace has endorsed the component.
AWOSS-CTX: Which Context Can Steer The Agent?
AWOSS-CTX is the context and instruction boundary family. It asks what information can influence agent behavior and which sources are allowed to set policy.
Agents do not only respond to the latest user message. They can combine system instructions, project rules, skill instructions, retrieved documents, memory, conversation history, handoff notes, tool outputs, search results, and external content. Some sources should steer policy. Some should only provide data. Some should be treated as lower trust. Some should be excluded from high-impact workflows.
Evidence can include a context-source inventory, instruction precedence rule, memory-write policy, prohibited-storage list, retrieval or corpus-change record, sanitized handoff example, clean-context configuration, and tests for prompt injection, retrieval poisoning, tool-output poisoning, or stale memory in high-impact workflows.
That makes the steering surface visible. It does not prove every indirect instruction attack is blocked or every memory and retrieval interaction is safe.
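The four treatments described above (steer policy, data only, lower trust, excluded) can be written down as a context-source inventory. The role assigned to each source below is an assumption made for illustration; a real profile would set these per workflow.

```python
from enum import Enum


class SourceRole(Enum):
    POLICY = "may set policy"
    DATA_ONLY = "data only"
    LOWER_TRUST = "lower trust"
    EXCLUDED = "excluded from high-impact workflows"


# Hypothetical inventory; the role given to each source is an example, not a rule.
CONTEXT_SOURCES = {
    "system_instructions": SourceRole.POLICY,
    "project_rules": SourceRole.POLICY,
    "retrieved_documents": SourceRole.DATA_ONLY,
    "tool_outputs": SourceRole.DATA_ONLY,
    "search_results": SourceRole.LOWER_TRUST,
    "external_content": SourceRole.EXCLUDED,
}


def may_steer_policy(source):
    """Only sources explicitly marked POLICY may set policy; unknown sources may not."""
    return CONTEXT_SOURCES.get(source) is SourceRole.POLICY


def allowed_in_high_impact(source):
    """Unknown and excluded sources are kept out of high-impact workflows."""
    role = CONTEXT_SOURCES.get(source)
    return role is not None and role is not SourceRole.EXCLUDED


print(may_steer_policy("retrieved_documents"))     # prints False
print(allowed_in_high_impact("external_content"))  # prints False
```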
AWOSS-SEC: How Are Secrets And Sensitive Data Kept Out Of Spillover?
AWOSS-SEC is the secrets and sensitive-data family. It asks where sensitive material can appear, where it must not be copied, and how evidence can remain useful without becoming a new leak.
Sensitive material can include credentials, tokens, private keys, browser sessions, confidential files, customer records, regulated data, private source code, screenshots, transcripts, summaries, and generated evidence. Agentic work can move that material through prompts, tool calls, memory, logs, summaries, notes, review packets, and external messages.
Review material might include a sensitive-location inventory, sensitive-data class register, prohibited-storage policy, redaction or masking rule, protected evidence reference, sensitive-export approval or denial receipt, credential teardown record, entitlement note, and redaction or denied-exfiltration test.
That supports review of selected handling paths. It is not a legal conclusion, complete data-protection proof, complete secret-detection result, or historical log-cleanup guarantee.
AWOSS-LOG: Can The Action Path Be Reconstructed Without Raw Payload Leakage?
AWOSS-LOG is the logs, receipts, and traceability family. It asks whether an authorized reviewer can reconstruct the important parts of agentic work without receiving unnecessary sensitive content.
A transcript alone is rarely enough. A useful record may need the user or workflow request, runtime identifier, tool or connector used, source version, policy outcome, approval state, action class, resource scope, sensitive-data handling outcome, downstream request ID, validation link, and stable redaction-safe reference.
A strong packet usually includes a log source inventory, high-impact receipt schema, sampled runtime and approval receipts, policy-decision records, correlation map, retention policy, redacted review packet, missing-field remediation record, and reconstruction test result.
That supports reconstruction of selected events. It does not show that every relevant event was captured, every log is tamper-resistant, or every raw payload can be safely shared.
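The record fields named in this section can be turned into a simple missing-field check for sampled receipts. The key names below are hypothetical renderings of those fields, and the sample values are placeholders; note that the receipt holds references, not raw payloads.

```python
# Hypothetical key names derived from the fields listed in this section.
REQUIRED_RECEIPT_FIELDS = (
    "request_ref",
    "runtime_id",
    "tool_or_connector",
    "source_version",
    "policy_outcome",
    "approval_state",
    "action_class",
    "resource_scope",
    "sensitive_data_outcome",
    "downstream_request_id",
    "validation_link",
    "redaction_safe_ref",
)


def missing_receipt_fields(receipt):
    """Return required fields that are absent or empty, in schema order."""
    return [name for name in REQUIRED_RECEIPT_FIELDS if not receipt.get(name)]


# Placeholder receipt with two gaps left in on purpose.
sample_receipt = {
    "request_ref": "req-001",
    "runtime_id": "runtime-a",
    "tool_or_connector": "ticketing-connector",
    "source_version": "",
    "policy_outcome": "require_approval",
    "approval_state": "approved",
    "action_class": "send_external_message",
    "resource_scope": "project-folder",
    "sensitive_data_outcome": "redacted",
    "downstream_request_id": "ds-123",
    "validation_link": None,
    "redaction_safe_ref": "ref-001",
}

print(missing_receipt_fields(sample_receipt))  # prints ['source_version', 'validation_link']
```

The output of a check like this is a natural input to a missing-field remediation record.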
AWOSS-VAL: Have Controls And Denied Paths Actually Been Tested?
AWOSS-VAL is the validation family. It asks whether the important controls were actually checked, what failed, what remains untested, and what was retested after changes.
Agent controls can look reasonable on paper and still fail in a real workflow. The profile should not treat a policy note, one screenshot, or one successful happy path as complete validation. It should show which controls were checked by documentation, configuration inspection, sampled evidence, manual test, automated test, monitoring review, or not checked at all.
Useful validation material includes a coverage matrix, review artifact, assumptions record, untested-control list, pre-production test plan, fixture results, denied-action test, approval-gate test, sensitive-data test, finding tracker, retest record, drift review, and tabletop or adversarial exercise for high-impact use.
These records are review input. They are not an audit, certification, assessor conclusion, or guarantee that prompt injection, data leakage, unsafe tool use, source drift, logging gaps, or governance failure are absent.
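A coverage matrix can be as small as a mapping from each control to the method used to check it. The sketch below assumes hypothetical control names and uses the check methods listed in this section as the allowed values.

```python
# Check methods taken from the list in this section.
CHECK_METHODS = (
    "documentation",
    "configuration inspection",
    "sampled evidence",
    "manual test",
    "automated test",
    "monitoring review",
    "not checked",
)


def validate_matrix(matrix):
    """Raise if a control uses a method outside the agreed list."""
    unknown = {c: m for c, m in matrix.items() if m not in CHECK_METHODS}
    if unknown:
        raise ValueError(f"Unknown check methods: {unknown}")


def untested_controls(matrix):
    """Controls that were not checked at all."""
    return sorted(c for c, m in matrix.items() if m == "not checked")


def paper_only_controls(matrix):
    """Controls checked by documentation alone, which is weaker than a test."""
    return sorted(c for c, m in matrix.items() if m == "documentation")


# Hypothetical controls for one bounded workflow.
coverage = {
    "approval gate for external messages": "manual test",
    "denied production change": "not checked",
    "redaction of evidence packet": "documentation",
    "source drift comparison": "sampled evidence",
}

validate_matrix(coverage)
print(untested_controls(coverage))    # prints ['denied production change']
print(paper_only_controls(coverage))  # prints ['redaction of evidence packet']
```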
AWOSS-GOV: Who Owns Exceptions, Changes, And Claims?
AWOSS-GOV is the governance family. It asks who owns the risk decisions around agentic work and who prevents the organization from saying more than the evidence supports.
This family connects the technical record to accountable decisions. Someone has to approve the boundary. Someone has to accept or reject exceptions. Someone has to decide when the profile must be reviewed again. Someone has to control public wording, partner wording, procurement wording, and external mapping statements.
Governance evidence can include an owner record, boundary or target-posture signoff, exception register, assumption log, risk-acceptance record, remediation plan, review trigger policy, supplier or provider change review, claim-limit record, and reassessment note.
Those records make decisions visible. They do not create legal compliance, external assurance, public certification, partner approval, or a complete governance program by themselves.
How The Families Work Together
The families are useful because they force the review to stay connected.
AWOSS-SCP names the boundary. AWOSS-DEL explains authority. AWOSS-WSB describes reachable workspace and execution paths. AWOSS-RUN handles the moment of action. AWOSS-SRC tracks the reusable pieces that can act. AWOSS-CTX tracks what can steer the agent. AWOSS-SEC keeps sensitive material from spilling into unsafe places. AWOSS-LOG keeps receipts. AWOSS-VAL checks whether the controls and denied paths work. AWOSS-GOV keeps exceptions, changes, and claims accountable.
That is the mental model. A workspace security profile does not need to turn every family into a long checklist on day one. It should use the families to ask better questions for one bounded workflow, collect evidence that is safe to review, record what is unsupported, and keep claim language inside what the packet actually supports.
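As a reading aid, the ten families can be held as a map from family code to review question, and a packet's artifacts can be checked against it. The questions are taken from the section headings above; the mapping of file names to families in the example is an assumption for illustration.

```python
# Family codes and questions follow the section headings in this deep dive.
FAMILY_QUESTIONS = {
    "AWOSS-SCP": "What exactly is inside the review boundary?",
    "AWOSS-DEL": "Whose authority is the agent using?",
    "AWOSS-WSB": "What rooms does the agent have keys to?",
    "AWOSS-RUN": "What must pause, ask, stop, or roll back before impact?",
    "AWOSS-SRC": "Which reusable tools and connectors can act?",
    "AWOSS-CTX": "Which context can steer the agent?",
    "AWOSS-SEC": "How are secrets and sensitive data kept out of spillover?",
    "AWOSS-LOG": "Can the action path be reconstructed without raw payload leakage?",
    "AWOSS-VAL": "Have controls and denied paths actually been tested?",
    "AWOSS-GOV": "Who owns exceptions, changes, and claims?",
}


def families_without_evidence(evidence_index):
    """List families that have no artifact recorded yet, in family order."""
    return [code for code in FAMILY_QUESTIONS if not evidence_index.get(code)]


# Hypothetical packet index; which file answers which family is an assumption.
packet_index = {
    "AWOSS-SCP": ["scope.md"],
    "AWOSS-RUN": ["runtime-policy.md"],
    "AWOSS-VAL": ["validation-report.md"],
    "AWOSS-GOV": ["exceptions.md"],
}

for code in families_without_evidence(packet_index):
    print(code, "->", FAMILY_QUESTIONS[code])
```

An empty family is not a failure on day one. It is a question the packet has not yet tried to answer.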
Levels Without Audit Language
The working draft uses Level 1, Level 2, and Level 3 as a way to describe increasing review depth. That language is useful, but it can also be misread.
For this deep dive, the safer framing is not "this workflow is Level 2" or "this tool passes Level 3." The safer framing is: a team can use Level 1-style, Level 2-style, and Level 3-style slices to decide how much structure an internal readiness review needs for one bounded workflow.
That distinction matters. A level label is easy to turn into a claim. A level slice is a planning tool. It helps a team decide what evidence to collect, which gaps matter, which exceptions block stronger wording, and when the workflow should be treated as not ready.
Level 1-Style Review: Can We Name The System?
A Level 1-style review is the foundation pass.
The team should be able to name the scoped workflow, owners, authority model, reachable resources, reusable action units, context sources, sensitive-data handling expectations, logging expectations, validation gaps, and claim limits for one bounded system.
This is not trivial. Many agentic workflows fail this first pass because the agent's reach is implicit, authority is inherited from a user session, tools and connectors are not inventoried, memory and retrieval are not separated from policy, logs are mostly transcripts, and evidence packets have no clear redaction rule.
A practical Level 1-style packet should answer:
- What is the scoped workflow?
- Which runtime, workspace, connector set, repository set, memory source, and context source are in scope?
- Which resources, identities, tools, and action classes are explicitly out of scope?
- Who owns workspace boundaries, runtime policy, source records, context records, evidence, validation, exceptions, and claim approval?
- Which actions are high impact enough to need review before expansion?
- Where should secrets and sensitive data not be copied?
- Which logs or receipts exist, and which important paths are not recorded?
- Which controls were not tested?
- What wording is blocked until more evidence exists?
The useful output is not a badge. It is a scope record, owner matrix, authority note, resource inventory, source and context inventory, sensitive-data rule, basic receipt expectation, validation-gap list, and claim-limit statement.
The claim limit: a Level 1-style review supports internal understanding of a bounded workflow. It does not prove readiness for production use, complete control effectiveness, public conformance, certification, audit success, legal compliance, or complete agent security.
Level 2-Style Slices: Can Production Use Stay Repeatable?
A Level 2-style slice is about managed production behavior.
This does not mean the whole organization has achieved a public Level 2. It means selected parts of one scoped workflow have repeatable records and controls that are strong enough to review over time. The team can show that the workflow is not only described once, but operated with maintained inventories, approval paths, policy outcomes, source drift checks, context controls, validation findings, and managed exceptions.
For example, a production workflow might use Level 2-style slices for runtime policy and logging while still treating source integrity or context isolation as open gaps. Another workflow might have good source review and approval receipts but still need stronger sensitive-data export tests. The slice language keeps the packet honest: it says which areas have managed evidence and which areas do not.
A practical Level 2-style packet should show:
- repeatable inventories for reachable resources, action units, context sources, and sensitive-data locations
- approval rules for high-impact actions
- sampled approval, denial, and policy-outcome receipts
- source introduction or update review for high-impact tools and connectors
- context or memory change controls where durable context affects decisions
- redaction-safe evidence handling and retention expectations
- validation findings, owners, remediation paths, and retest records
- exception tracking with claim impact and review dates
- periodic or trigger-based review by named owners
The useful output is a set of selected managed slices. A packet might say that it has Level 2-style runtime-policy evidence for one action class, Level 2-style logging evidence for sampled approval paths, or Level 2-style source-drift records for a connector set. That is narrower and safer than saying the system "is Level 2."
The claim limit: Level 2-style slices support review of selected managed controls. They do not prove full production safety, full coverage across every family, audit readiness, public conformance, legal compliance, or complete runtime mediation.
Level 3-Style Slices: What Needs Stronger Challenge?
A Level 3-style slice is for high-impact use where mistakes, abuse, or gaps could materially affect production systems, sensitive data, external commitments, access control, regulated workflows, business continuity, or trust in evidence.
Again, this should be used carefully. The phrase should not become a public claim. It is an internal way to say: this part of the workflow needs stronger evidence, stronger separation, stronger record protection, recurring testing, or a more formal reassessment path.
Level 3-style slices may include:
- historical scope records so reviewers can reconstruct what was reachable at the relevant time
- separated review for high-impact boundary, authority, source, runtime, or governance changes
- stronger source integrity controls such as pinned versions, lockfiles, checksums, signatures, attestations, controlled release channels, or deeper source review where practical
- pre-execution mediation for high-impact action paths
- tamper-evident, independently retained, or separation-controlled records for high-impact approvals and actions
- reconstruction tests that join scope, authority, source, context, runtime, sensitive-data, log, validation, and governance records
- recurring validation, adversarial tests, tabletop exercises, or incident exercises for material abuse paths
- emergency stop, rollback, evidence-access, exception-expiry, and reassessment exercises
- stronger governance for persistent exceptions and external wording
A Level 3-style slice is useful when a team wants to know whether selected high-impact evidence would survive deeper review. It can show where a packet is improving toward stronger review evidence. It can also show that a workflow should not expand until a gap is fixed.
The claim limit: Level 3-style slices are still internal review inputs in the current working-draft posture. They do not create independent assurance, approved auditor status, certification, public conformance, legal compliance, external-framework equivalence, or permission to claim high-assurance status.
Use Levels To Reduce Overclaiming
The safest way to use levels now is to make them narrower, not broader.
Instead of saying "we are Level 1," say "this packet supports a Level 1-style scope and owner review for this bounded workflow." Instead of saying "the runtime passes Level 2," say "the packet contains Level 2-style approval and denial receipts for these high-impact action classes." Instead of saying "Level 3-ready," say "this workflow has selected Level 3-style reconstruction and record-protection evidence, with these exceptions still open."
This wording does two things. It helps a reviewer understand the depth of the evidence, and it prevents an internal readiness packet from turning into a public assurance claim.
It also gives the team a useful "not ready" option. If the workflow cannot name its scope, authority, high-impact actions, context sources, sensitive-data paths, receipts, validation gaps, and claim limits, the right posture is not Level 1-style. It is "not ready." If production use lacks repeatable receipts, approval records, source drift checks, or exception handling, the right posture is not Level 2-style. If high-impact use cannot reconstruct decisions, protect records, separate review, or retest material paths, the right posture is not Level 3-style.
Levels are helpful when they make limits visible. They become dangerous when they hide limits behind a simple label.
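The "not ready" logic in this section can be sketched as a small planning function for one bounded workflow or slice. The key names are hypothetical, and treating the levels as cumulative is an assumption of this sketch, not a rule stated by the working draft. The return value is internal wording, not a label for the system.

```python
# Hypothetical keys for the items named in this section.
LEVEL_1_NEEDS = (
    "scope", "authority", "high_impact_actions", "context_sources",
    "sensitive_data_paths", "receipts", "validation_gaps", "claim_limits",
)
LEVEL_2_NEEDS = (
    "repeatable_receipts", "approval_records", "source_drift_checks", "exception_handling",
)
LEVEL_3_NEEDS = (
    "decision_reconstruction", "record_protection", "separated_review", "material_path_retests",
)


def deepest_supported_wording(present):
    """Return the deepest level-style wording the recorded items could support.

    Assumes levels are cumulative: a deeper slice needs every shallower item too.
    """
    if not all(item in present for item in LEVEL_1_NEEDS):
        return "not ready"
    if not all(item in present for item in LEVEL_2_NEEDS):
        return "Level 1-style"
    if not all(item in present for item in LEVEL_3_NEEDS):
        return "Level 2-style"
    return "Level 3-style"


named = set(LEVEL_1_NEEDS)
print(deepest_supported_wording(named - {"scope"}))            # prints not ready
print(deepest_supported_wording(named))                        # prints Level 1-style
print(deepest_supported_wording(named | set(LEVEL_2_NEEDS)))   # prints Level 2-style
```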
What This Means For A Workspace Profile
A workspace profile can use levels as an internal organizing aid:
- Level 1-style: do we know what this bounded system is and where its obvious gaps are?
- Level 2-style: do selected production paths have repeatable controls, receipts, findings, retests, and managed exceptions?
- Level 3-style: do selected high-impact paths have stronger review, reconstruction, record protection, recurring validation, and formal reassessment triggers?
That is enough for the current working-draft posture. A future released standard may define public conformance profiles, required families, required levels, validation methods, expiry rules, assessor roles, or claim language. Until then, level language should stay narrow and tied to evidence.
Start With One Workflow
The practical starting point is not to secure every agent, every tool, every workspace, and every future use case at once.
Start with one workflow.
Pick a workflow that is real enough to matter and narrow enough to draw. It might be an internal agentic workflow that reads a project folder, opens a repository, runs a bounded command, drafts a ticket, prepares a message for human approval, or gathers evidence for a routine review. The exact example is less important than the boundary. If the team cannot say where the workflow starts and ends, the review will become a broad policy conversation before it becomes useful.
For that one workflow, make the first profile small and concrete:
- Draw the scope.
- Name the owners.
- Name the authority the agent uses.
- List reachable files, repositories, apps, connectors, shells, network paths, communication channels, retrieval stores, memory sources, and other context sources.
- Identify high-impact actions.
- Write the approval, denial, interrupt, rollback, and unsupported-action policy in reviewable language.
- Collect redaction-safe receipts, validation findings, reconstruction notes, and exception records.
- Record claim limits in the packet itself.
That is not a public badge. It is an internal review surface.
The First Review Question
The first useful question is simple:
If a reviewer asked tomorrow what the agent could reach, who approved its high-impact actions, what evidence exists, and which claims are blocked, could the team answer without exposing secrets?
If the answer is yes, the team has the beginning of a profile. It can still have findings, exceptions, unsupported paths, and "not ready" decisions. Those are not failures. They are the reason the profile exists.
If the answer is no, the next step is not stronger language. The next step is to make the boundary reviewable:
- name the scoped workflow
- separate in-scope resources from out-of-scope resources
- identify the delegator, approver, runtime owner, source owner, context owner, evidence owner, and governance owner
- define high-impact actions before the next run
- require receipts for approval, denial, policy outcomes, sensitive-data handling, validation, and exceptions
- keep sensitive payloads out of the evidence packet unless a protected review path is explicitly needed
- write down which statements the evidence does not support
This gives the team a concrete path from "we use agents" to "we can review one agentic workspace workflow."
What Good Looks Like At This Stage
Good does not mean complete. At this stage, good means honest, scoped, and useful.
A good internal packet might say that the scoped workflow has a clear boundary, named owners, a reachable-resource inventory, a source and context record, a runtime policy, sampled receipts, a validation report, and a claim-language file.
It might also say that the shell path is unsupported, one connector does not emit enough receipts, a memory source needs retention review, a high-impact action class lacks a rollback test, or a sensitive-data path requires a protected evidence reference instead of a copied payload.
That is still progress. The packet has turned uncertainty into reviewable work. It tells the team which gaps block stronger wording, which gaps can be accepted only for internal review, which gaps require remediation, and which next test would make the evidence better.
The useful outputs are practical decisions:
- ready for internal review
- ready for internal review with material exceptions
- not ready because scope, owners, evidence, runtime policy, receipts, sensitive-data handling, validation, or governance records are missing
- approval needed before any external use, publication, outreach, public wording, implementation claim, or third-party reference
Those decisions are intentionally modest. They keep the profile useful before there is a released certification path, public claim model, assessor role, validator role, or legal-compliance conclusion.
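The decisions listed above can be sketched as a function over what the packet contains. The record keys and parameter names are assumptions made for illustration, and the function only produces internal wording.

```python
# Hypothetical keys for the records named in the "not ready" decision above.
REQUIRED_RECORDS = (
    "scope", "owners", "evidence", "runtime_policy",
    "receipts", "sensitive_data_handling", "validation", "governance",
)


def review_decision(records_present, open_material_exceptions, external_use_requested=False):
    """Map packet state to one of the modest internal decisions described above."""
    missing = [name for name in REQUIRED_RECORDS if name not in records_present]
    if missing:
        decision = "not ready because these records are missing: " + ", ".join(missing)
    elif open_material_exceptions > 0:
        decision = "ready for internal review with material exceptions"
    else:
        decision = "ready for internal review"
    return {
        "decision": decision,
        # External use, publication, or third-party reference always needs separate approval.
        "approval_needed_before_external_use": external_use_requested,
    }


complete = set(REQUIRED_RECORDS)
print(review_decision(complete, open_material_exceptions=2)["decision"])
# prints ready for internal review with material exceptions
print(review_decision(complete - {"receipts"}, 0)["decision"])
# prints not ready because these records are missing: receipts
```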
The Narrow Promise
The narrow promise of awoss is that agentic workspace risk can be made easier to review when the boundary, authority, resources, sources, context, runtime policy, evidence, findings, exceptions, and claim limits are kept together.
That is enough for a first internal review.
Do one workflow. Make the boundary visible. Preserve the evidence without leaking the work. Write down what the packet supports and what it does not. Then decide whether the next useful step is remediation, a deeper slice, a different workflow, or internal review of the scoped profile.
The work gets clearer when the claim gets smaller.
Further Reading
The deep dive references several external standards, frameworks, and comparator projects as context. These links are included for orientation and do not imply equivalence, endorsement, certification, or legal compliance.