Native desktop automation for agents

Observe. Decide. Act across real desktop apps.

agent-desktop gives AI agents structured access to macOS, Windows, and Linux applications through accessibility trees, deterministic element refs, and machine-readable command results.

Why agents need it

Desktop control should be structured, recoverable, and cheap in tokens.

Accessibility Trees

Work with semantic UI structure instead of brittle screenshots and pixel matching.

Deterministic Refs

Use stable element references and snapshot IDs so agents can act and then verify state.
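The act-then-verify loop can be illustrated with a small sketch in Python. This is not agent-desktop's API. The `Node` type, the `e1`, `e2` ref format, and the `assign_refs` and `snapshot_id` helpers are all hypothetical, chosen only to show why deterministic refs and snapshot IDs make verification straightforward.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for one accessibility element.
    role: str
    name: str
    children: list["Node"] = field(default_factory=list)
    checked: bool = False


def assign_refs(root: Node) -> dict[str, Node]:
    """Number nodes in depth-first order so the same tree always yields the same refs."""
    refs: dict[str, Node] = {}

    def walk(node: Node) -> None:
        refs[f"e{len(refs) + 1}"] = node
        for child in node.children:
            walk(child)

    walk(root)
    return refs


def snapshot_id(refs: dict[str, Node]) -> str:
    """Hash the visible state; any change to a node produces a different ID."""
    state = [(ref, n.role, n.name, n.checked) for ref, n in refs.items()]
    return hashlib.sha256(json.dumps(state).encode()).hexdigest()[:12]


window = Node("window", "Settings", [Node("checkbox", "Dark mode"), Node("button", "Save")])

# Observe: take a snapshot and remember its ID.
refs = assign_refs(window)
before = snapshot_id(refs)

# Act: toggle the checkbox through its ref (stands in for a real click).
refs["e2"].checked = True

# Verify: re-snapshot, confirm the ref still names the same element
# and that the snapshot ID changed.
refs_after = assign_refs(window)
after = snapshot_id(refs_after)
assert refs_after["e2"].name == "Dark mode"
assert after != before
```

In this sketch an unchanged tree always produces the same refs and the same snapshot ID, so a changed ID is a cheap signal that an action had an effect or that the agent's view is stale.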

Progressive Traversal

Inspect a shallow map of the UI first, then drill into specific regions, so dense apps consume less context.
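As a rough illustration of the idea, the Python sketch below renders a tree to a fixed depth and collapses anything deeper into a child count. It does not reflect agent-desktop's actual commands or output format. `UINode`, `outline`, `find`, and the dotted path scheme are hypothetical names used only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class UINode:
    # Hypothetical stand-in for one accessibility element.
    role: str
    name: str
    children: list["UINode"] = field(default_factory=list)


def outline(node: UINode, max_depth: int, path: str = "0", depth: int = 0) -> list[str]:
    """Render the tree down to max_depth; deeper subtrees collapse to a child count."""
    line = f"{'  ' * depth}[{path}] {node.role} {node.name!r}"
    if depth == max_depth and node.children:
        return [f"{line} (+{len(node.children)} children hidden)"]
    lines = [line]
    for i, child in enumerate(node.children):
        lines += outline(child, max_depth, f"{path}.{i}", depth + 1)
    return lines


def find(node: UINode, path: str) -> UINode:
    """Resolve a dotted path such as '0.1' (relative to the root '0') to its node."""
    for index in path.split(".")[1:]:
        node = node.children[int(index)]
    return node


app = UINode("window", "Mail", [
    UINode("toolbar", "Main", [UINode("button", "Compose"), UINode("button", "Reply")]),
    UINode("list", "Inbox", [UINode("row", f"Message {i}") for i in range(200)]),
])

# Shallow pass: three lines describe the whole window, including a 200-row list.
shallow = outline(app, max_depth=1)

# Drill-down: expand only the toolbar, keeping paths consistent with the shallow map.
toolbar = outline(find(app, "0.0"), max_depth=1, path="0.0")

print("\n".join(shallow))
print("\n".join(toolbar))
```

The shallow pass costs a few lines no matter how many rows the list holds. The agent pays for detail only in the region it chooses to expand.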

Best fit

For builders making agents that operate outside the browser.