Contop

Remote AI agent that sees your screen and operates your computer. Speak a command or type a message — an autonomous agent observes your screen, runs commands, clicks buttons, fills forms, and reports back in real time.

How It Works

Contop uses a three-node architecture connected over encrypted WebRTC:

  1. Mobile Client (React Native / Expo) — Voice and text input, execution thread UI, live remote screen feed
  2. Contop Server (Python / FastAPI) — ADK execution agent, 30+ tools, security evaluation, WebRTC signaling
  3. Desktop Host (Tauri v2 / Rust) — Native app shell, server lifecycle management, Away Mode protection

Key Features

Multimodal AI Agent

Voice commands via configurable speech-to-text (Google STT by default), screen understanding via 9 vision backends, and autonomous multi-step execution with the Google ADK.

Security-First Design

Every command is classified by the Dual-Tool Evaluator. Dangerous commands are sandboxed in Docker, and destructive actions require explicit approval.
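As a rough illustration of the three outcomes described above (run directly, run in a sandbox, or wait for approval), here is a minimal sketch. It is not the Dual-Tool Evaluator itself: the `Verdict` enum, the `classify` function, and the regex lists are hypothetical names and rules chosen only to show the shape of such a check.

```python
import re
from enum import Enum


class Verdict(Enum):
    """Hypothetical outcomes mirroring the three cases in the text."""
    ALLOW = "allow"                        # run directly on the host
    SANDBOX = "sandbox"                    # run inside an isolated container
    REQUIRE_APPROVAL = "require_approval"  # wait for explicit user approval


# Illustrative patterns only; a real evaluator would need far more than regexes.
DESTRUCTIVE_PATTERNS = (
    r"\brm\s+-\w*r",   # recursive delete, e.g. "rm -rf"
    r"\bmkfs\b",       # create a filesystem (wipes the target)
    r"\bdd\s+if=",     # raw disk copy
)
RISKY_PATTERNS = (
    r"\bcurl\b.*\|\s*(sh|bash)\b",  # pipe a download into a shell
    r"\bsudo\b",                    # privilege escalation
    r"\bchmod\b",                   # permission changes
)


def classify(command: str) -> Verdict:
    """Return the strictest verdict whose pattern list matches the command."""
    lowered = command.lower()
    if any(re.search(p, lowered) for p in DESTRUCTIVE_PATTERNS):
        return Verdict.REQUIRE_APPROVAL
    if any(re.search(p, lowered) for p in RISKY_PATTERNS):
        return Verdict.SANDBOX
    return Verdict.ALLOW
```

Checking the destructive patterns first means a command that matches both lists is held for approval rather than merely sandboxed.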

Zero-Config Networking

QR code pairing, automatic Cloudflare Tunnel fallback, WebRTC peer-to-peer with DTLS encryption. No port forwarding or VPN required.
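The fallback behaviour can be pictured as trying connection routes in priority order and keeping the first one that works. The sketch below assumes that model. `Route`, `first_working_route`, and the probe callables are hypothetical stand-ins, and the real pairing and tunnel negotiation are not described here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Route:
    """A candidate way to reach the host (hypothetical model)."""
    name: str
    connect: Callable[[], bool]  # returns True when the route is usable


def first_working_route(routes: Sequence[Route]) -> Optional[str]:
    """Try each route in order and return the name of the first that connects."""
    for route in routes:
        try:
            if route.connect():
                return route.name
        except OSError:
            # Treat network errors as "route unavailable" and try the next one.
            continue
    return None


def _direct_unreachable() -> bool:
    # Stand-in probe simulating a blocked direct connection.
    raise OSError("direct connection blocked")


# Assumed ordering: direct peer-to-peer first, tunnel as the fallback.
routes = [
    Route("direct", _direct_unreachable),
    Route("tunnel", lambda: True),
]
selected = first_working_route(routes)
```

With the direct probe failing, `selected` ends up as `"tunnel"`, which is the fallback case the feature describes.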

Multi-Provider LLM

Gemini, OpenAI, Anthropic, Groq, Mistral, Together AI, DeepSeek, and more — switch models without code changes.
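One common way to switch models without code changes is to read a `provider/model` string from configuration. The sketch below assumes that convention. The `AGENT_MODEL` variable name, the `ModelChoice` type, and the placeholder model id are illustrative and are not Contop's actual settings.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ModelChoice:
    provider: str
    model: str


def parse_model_setting(value: str) -> ModelChoice:
    """Split a 'provider/model' string into its two parts."""
    provider, sep, model = value.strip().partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {value!r}")
    # Provider names are normalised to lowercase; model ids are kept verbatim.
    return ModelChoice(provider=provider.lower(), model=model)


def load_model_choice(
    env: Mapping[str, str] = os.environ,
    default: str = "gemini/example-model",  # placeholder, not a real model id
) -> ModelChoice:
    """Read the hypothetical AGENT_MODEL setting, falling back to a default."""
    return parse_model_setting(env.get("AGENT_MODEL", default))
```

Under this scheme, changing the provider is a configuration edit: setting `AGENT_MODEL=openai/some-model` selects a different backend while the calling code stays the same.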

Hybrid Control

Switch between AI execution and manual remote control at any time, using a virtual joystick, tap-to-click, and a keyboard grid overlay.

Away Mode

Lock your desktop when unattended with PIN protection, keyboard blocking, and a secure overlay window (Windows).