Ask a CISO how many AI agents are running in their environment.
They cannot answer.
Not because the data is hidden. Because the system was never designed to make it visible.
This is not a tooling gap. It is an architectural blind spot.
We analyzed the production source code of the most widely used AI coding agent to understand why this visibility gap exists. What we found: the same architectural patterns that make agents powerful -- spawning, delegation, dynamic connectivity, persistence -- are the patterns that make them invisible to every security tool deployed today.
Section 1: Why Agents Are Invisible
These are not edge cases. These are core features.
They Spawn Other Agents
The AgentTool accepts a run_in_background flag. When set, the spawned agent executes asynchronously. The parent continues. The user may never see the child's activity. After a configurable timeout (120 seconds by default), agents can be automatically pushed to background execution without explicit user action.
The architecture supports multiple isolation modes: worktree (separate git worktree), remote (separate compute environment), and default (same process). RemoteAgentTask manages agents in remote sessions with their own IDs, command histories, and lifecycles -- disconnected from the local environment that spawned them.
The "teammate" model runs agents in parallel tmux sessions (TeammateSpawnedOutput includes tmux_session_name, tmux_window_name, tmux_pane_id). Fully autonomous agents operating in parallel terminals, each with their own tools and context.
And scheduled agents: ScheduleCronTool supports durable schedules that persist to .claude/scheduled_tasks.json and survive restarts. A durable cron task can fire at 3 AM, spawn subagents, execute commands, modify files -- all without a human.
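To make the persistence concrete, here is a minimal sketch of the kind of durable task record that could live in .claude/scheduled_tasks.json. The field names are illustrative assumptions, not the actual on-disk schema; only the file path and the survive-restarts behavior come from the source.

```typescript
// Hypothetical shape of a durable scheduled-task record. Because it is
// persisted to disk rather than held in memory, it outlives the session
// that created it -- the property that lets a cron agent fire at 3 AM.
interface ScheduledTask {
  id: string;
  cron: string;      // standard cron expression, e.g. "0 3 * * *" = 3 AM daily
  prompt: string;    // what the spawned agent is asked to do
  createdAt: string; // ISO 8601 timestamp
}

const nightly: ScheduledTask = {
  id: "task-001",
  cron: "0 3 * * *",
  prompt: "Refresh dependency audit and open PRs for fixes",
  createdAt: new Date().toISOString(),
};
```

A record like this is readable by anyone with filesystem access to the workstation, yet no central system knows it exists.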
Background agents are not visible processes. They are invisible execution contexts. There is no global registry because the system was not designed to have one.
They Connect to External Services Dynamically
MCP connections are established at agent startup and torn down at completion. initializeAgentMcpServers merges parent connections with agent-specific servers -- additive-only. A subagent's MCP connections are invisible to the parent's monitoring context.
Plugin-provided MCP servers add another layer. The plugin loader discovers plugins from marketplaces, git repositories, and local directories. Each plugin can bundle MCP servers, agent definitions, commands, and hooks. A single plugin installation expands the attack surface in ways no existing security tool catalogs.
Each MCP connection expands the control plane. And none of them are centrally tracked.
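The additive-only behavior described above can be sketched in a few lines. This is not the product's actual implementation of initializeAgentMcpServers; it is a minimal model of the merge semantics the article describes, with illustrative type names.

```typescript
// Minimal model of an additive-only MCP server merge: the subagent starts
// from the parent's connections and layers its own on top. Nothing is ever
// removed, so the child's connection surface is a superset of the parent's.
type McpServerConfig = { command: string; args?: string[] };

function mergeMcpServers(
  parent: Record<string, McpServerConfig>,
  agentSpecific: Record<string, McpServerConfig>,
): Record<string, McpServerConfig> {
  // Spread order makes agent-specific entries win on name collision.
  return { ...parent, ...agentSpecific };
}

const merged = mergeMcpServers(
  { github: { command: "gh-mcp" } },          // inherited from parent
  { jira: { command: "jira-mcp" } },          // added by the subagent
);
```

The merged result exists only inside the subagent's execution context; the parent holds no handle through which to enumerate it.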
They Operate Across Trust Boundaries
A single agent session can span multiple repositories, services, and API endpoints. Agents are loaded from six sources: built-in, plugin, user settings, project settings, policy settings, and flag settings. Each has different trust characteristics. getActiveAgentsFromList merges all into a flat list where later sources override earlier ones.
A project-level agent definition can shadow a built-in agent, changing its behavior for everyone who clones that repository.
Trust does not stop at the agent boundary. It propagates.
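The shadowing hazard follows directly from the flat-merge semantics. Here is a minimal sketch, assuming only what the article states: later sources override earlier ones by name. The function and type names are illustrative, not the product's API.

```typescript
// Minimal model of a flat merge where later sources silently override
// earlier ones -- the mechanism that lets a project-level agent definition
// shadow a built-in for everyone who clones the repository.
type AgentDef = { name: string; source: string; prompt: string };

// Sources are passed lowest-priority first.
function resolveAgents(...sources: AgentDef[][]): Map<string, AgentDef> {
  const byName = new Map<string, AgentDef>();
  for (const list of sources) {
    for (const agent of list) {
      byName.set(agent.name, agent); // later source wins, no warning emitted
    }
  }
  return byName;
}

const builtin = [{ name: "reviewer", source: "built-in", prompt: "Review diffs." }];
const project = [{ name: "reviewer", source: "project", prompt: "Review diffs. Also run ./scripts/helper.sh" }];
const active = resolveAgents(builtin, project);
```

Nothing in the merge records that a shadowing occurred, which is exactly why the override is invisible downstream.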
Traditional Tooling Is Blind
EDR sees process execution -- a Node.js process spawning a child. It cannot tell you that the child is a subagent with a specific prompt, inherited permissions, and three MCP connections, executing as part of a scheduled cron task.
Network monitoring sees HTTP requests. It cannot distinguish MCP protocol exchanges from other HTTP traffic or correlate API calls back to a specific agent's decision chain.
SIEM ingests log events, but the analytics architecture reveals the gap. The AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS type enforces that logged metadata contains no code or file paths. Right for privacy. Wrong for security observability.
These tools are not failing. They are observing the wrong layer.
Section 2: The Visibility Gap
This is not partial visibility. This is structural blindness.
Agent Inventory
The question: How many agents are running right now? What spawned them?
Why you cannot answer: Agent IDs are local. Background, remote, and cron agents use different tracking mechanisms with no unified query. No mechanism reports active agents to a central inventory.
There is no "list agents" command.
Extension Surface
The question: What plugins, MCP servers, and hooks are active across the organization?
Why you cannot answer: Plugin loading is decentralized. MCP servers connect dynamically per session. Subagents add their own servers additively. The total surface is the union across all developer workstations, all sessions -- and no system aggregates it.
The attack surface is additive and unbounded.
Permission State
The question: What can each agent do?
Why you cannot answer: Effective permissions are computed at runtime from multiple overlapping sources -- tool arrays, disallowed tools, deny rules, permission modes, prompt-level behavioral constraints. These computations exist only in memory, per agent instance.
Permission is computed, not stored.
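The "computed, not stored" property can be shown with a minimal sketch. The source names (allowed tools, disallowed tools, deny rules) come from the article; the function shape and resolution order are illustrative assumptions, not the product's actual algorithm.

```typescript
// Minimal model of runtime permission resolution: the effective tool set
// is derived per agent instance from overlapping sources, held in memory,
// and never written anywhere an auditor could read it later.
function effectiveTools(
  allowedTools: string[],
  disallowedTools: string[],
  denyRules: string[],
): Set<string> {
  const effective = new Set(allowedTools);
  // Subtractive sources strip entries from the allowed set.
  for (const t of [...disallowedTools, ...denyRules]) effective.delete(t);
  return effective; // lives only as long as this agent instance does
}

const perms = effectiveTools(["Bash", "Read", "Write"], ["Bash"], ["Write"]);
```

To answer "what can this agent do?", an auditor would have to re-run this computation with the exact same inputs the agent saw at the exact moment it ran, which is precisely what no external system can do.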
Tool Execution History
The question: What actions has each agent taken?
Why you cannot answer: The ProgressTracker maintains a 5-item rolling window. Transcripts are local files on developer machines, formatted for debugging. No centralized, tamper-evident record of agent actions exists.
There is no audit trail.
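The rolling-window behavior is worth seeing concretely. This is a minimal sketch of a 5-item window like the ProgressTracker described above, not the product's implementation: once a sixth action arrives, the first is gone for good.

```typescript
// Minimal model of a fixed-capacity rolling window. Evicted items are not
// archived anywhere -- eviction is deletion, which is why this structure
// can never serve as an audit trail.
class RollingWindow<T> {
  private items: T[] = [];
  constructor(private readonly capacity = 5) {}

  push(item: T): void {
    this.items.push(item);
    if (this.items.length > this.capacity) this.items.shift(); // oldest is lost
  }

  snapshot(): T[] {
    return [...this.items];
  }
}

const tracker = new RollingWindow<number>(5);
for (let action = 1; action <= 7; action++) tracker.push(action);
```

After seven actions, only the last five remain; actions 1 and 2 are unrecoverable.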
Cross-Agent Context Flow
The question: What information passes between agents?
Why you cannot answer: Context flows through function parameters and message passing, not an observable data plane. The teammate model adds named message routing between concurrent agents. None of these flows are logged for reconstruction.
Data movement is invisible by design.
Policy Compliance
The question: Are agents operating within approved boundaries?
Why you cannot answer: The policySettings source competes with other sources in a priority hierarchy. The isSourceAllowedByPolicy check runs at plugin load time on the local machine with no reporting to a central authority. No compliance verification mechanism exists.
Compliance cannot be verified.
Section 3: The Shadow Agent Problem
This is not emerging. It is already deployed.
| Cloud Shadow IT | Agent Shadow IT |
|---|---|
| Unmanaged AWS accounts | Unmanaged AI agent installs across workstations |
| Unauthorized SaaS apps | Unauthorized MCP servers and plugins |
| Unmonitored EC2 instances | Background agents running without awareness |
| Cloud sprawl with no inventory | Agent sprawl with no inventory |
| No CSPM = no visibility | No ASPM = no visibility |
| Ungoverned IAM | Ungoverned agent permissions |
| Shadow data stores | Agent context flows bypassing data governance |
Every developer workstation is now a potential agent runtime.
A single AI coding agent has access to the file system, the shell, the network, and API credentials. Agents can spawn subagents, connect to external services, persist scheduled tasks to disk, and maintain memory across sessions. This is not a theoretical attack surface. This is the operational reality.
Unlike cloud shadow IT, where infrastructure lived in identifiable data centers, agent shadow IT lives on laptops, in CI/CD pipelines, and in remote compute environments that may not be owned by the organization.
We Have Seen This Before
In cloud: infrastructure expanded faster than visibility. Security lagged behind deployment.
In agents: execution expands faster than visibility. Security is already behind.
The difference: this time, the systems are autonomous.
Cloud security matured through three phases: visibility (know what you have), posture (enforce how it should be configured), runtime protection (detect and respond). Agent security has not completed phase one.
Section 4: What Enterprises Need
These are not enhancements. They are prerequisites.
1. Agent Discovery
Detect and inventory every agent instance -- subagents, background agents, scheduled agents, remote agents.
Why it is hard: Agents are created dynamically, tracked in local memory, destroyed on completion. Multiple execution contexts (local, background, remote, teammate, cron) require different detection. If you cannot enumerate agents, you cannot secure them.
2. Extension Auditing
Catalog all plugins, MCP servers, hooks, and custom agents across every instance.
Why it is hard: Extensions aggregate from marketplaces, git repos, CLI flags, and local directories. MCP servers connect dynamically. The total surface changes every time someone installs a plugin or clones a repository.
3. Permission Visibility
Map effective permissions for each agent across all rule sources.
Why it is hard: Permissions are computed at runtime from overlapping sources per agent instance with no externalized state. The computation happens in memory and exists nowhere else.
4. Runtime Monitoring
Track tool execution, MCP interactions, spawning, and filesystem changes in real-time.
Why it is hard: Existing telemetry is product analytics, not security observability. The analytics pipeline explicitly excludes code and file paths. Security-grade monitoring requires a parallel observability layer.
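To make "parallel observability layer" concrete, here is a sketch of the kind of security-grade event such a layer would need to emit. The interface is entirely hypothetical -- it describes fields the existing analytics pipeline excludes by design (file paths, commands, agent lineage), not any schema in the product.

```typescript
// Hypothetical security audit event for agent tool execution. Every field
// here is exactly what product analytics strips out and exactly what an
// incident responder would need to reconstruct a decision chain.
interface AgentAuditEvent {
  timestamp: string;       // ISO 8601
  agentId: string;         // stable for the agent's lifetime
  parentAgentId?: string;  // spawn lineage; absent for root agents
  tool: string;            // e.g. "Bash", "Edit"
  target: string;          // the command line or file path touched
  mcpServers: string[];    // MCP connections active at execution time
}

function toAuditRecord(e: AgentAuditEvent): string {
  // Serialize for append to a centralized, tamper-evident log.
  return JSON.stringify(e);
}

const record = toAuditRecord({
  timestamp: new Date().toISOString(),
  agentId: "agent-7f3a",
  parentAgentId: "agent-1b02",
  tool: "Bash",
  target: "rm -rf node_modules",
  mcpServers: ["github", "jira"],
});
```

The point of the sketch is the gap it exposes: nothing in today's telemetry carries these fields, so this layer must be built from scratch.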
5. Policy Enforcement
Ensure agents operate within approved boundaries.
Why it is hard: The current policy mechanism is a configuration source competing with other sources in a priority hierarchy. No out-of-band enforcement exists that the agent runtime cannot circumvent. True enforcement requires an external control plane.
6. Anomaly Detection
Identify unusual behavior patterns across all agent instances.
Why it is hard: Every agent session has a potentially unique behavioral profile given the diversity of types, tools, and MCP connections. Baselines require consistent monitoring across the organization over time.
The Path Forward
This is not optional. It is already happening.
Agent adoption is accelerating faster than any development tool in history. Every week without visibility is a week where the ungoverned surface grows -- more plugins, more MCP servers, more scheduled agents, more background sessions no security team monitors.
The expertise required sits at the intersection of AI systems knowledge and infrastructure security. Understanding spawning models, MCP protocols, plugin loading, permission cascades -- this requires deep analysis of production architectures, not theoretical modeling.
The only unknown is how long organizations operate without visibility.
The agents are already running.
They are spawning other agents. They are connecting to external systems. They are executing actions. They are persisting across sessions.
And you cannot see them.
That is not a future risk. That is your current environment.
Over the course of this series, we examined the execution loop, the control plane, the attack surface, the permission kernel, the guardrail system, the process model, the injection surfaces, the cloud security parallels, and the risk taxonomy. This final article addresses the question that underlies all of them: can you even see these systems?
The answer, today, is no.
The organizations that build visibility now will define agent security for the next decade. The rest will be catching up.
Series Complete: Anatomy of a Production AI Agent
This is the final article in the 10-part series.
Scott Thornton is an AI security researcher at perfecXion.ai, specializing in defensive research on LLM and agent vulnerabilities. All analysis was conducted on lawfully obtained, publicly distributed npm package code in an authorized research environment.