Dual-Tool Evaluator
The Dual-Tool Evaluator is the security gate that classifies every tool call before execution. It determines whether a command runs on the host, in a Docker sandbox, or requires user confirmation.
Classification Cascade
The evaluator's classify() method processes every tool call through a multi-step cascade, stopping at the first match:
Step 1: Force Host Override
If the user has explicitly approved a command via the UI confirmation modal, it runs on the host regardless of other rules.
Result: host
Step 2: Known Tool Routing
Each tool category is routed based on its nature:
Display-dependent tools — always host (require physical display access):
- execute_gui, observe_screen, get_ui_context, execute_accessible
- execute_browser, execute_computer_use
- maximize_active_window, wait, get_action_history
Skill tools — always host:
- execute_skill, load_skill, create_skill, edit_skill
- Dynamically registered skill Python tools
File tools — host with restricted path checking:
- read_file, edit_file, find_files
Window & clipboard tools — always host:
- window_list, window_focus, resize_window, clipboard_read, clipboard_write
Document tools — always host:
- read_pdf, read_image, read_excel, write_excel
System tools — always host:
- process_info, system_info, download_file
Workflow tools — always host:
- save_dialog, open_dialog, launch_app, close_app, app_menu, install_app, copy_between_apps, fill_form, extract_text, set_env_var, change_setting, find_and_replace_in_files
Step 3: Unknown Tool Names
Any tool name not recognized by the evaluator is routed to the sandbox as a defense-in-depth measure.
Result: sandbox
Step 4: CLI Command Checks
For execute_cli, commands go through sub-checks in order:
4a. Forbidden commands — Commands matching the forbidden_commands list are always sandboxed. Uses substring and word boundary matching.
Example forbidden commands:
- rm -rf /
- mkfs
- dd if=
- format C:
Result: sandbox
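The combination of substring and word-boundary matching could look roughly like the sketch below. The function name is_forbidden and the rule for choosing between the two modes (single-word entries use word boundaries, everything else uses substring matching) are assumptions for illustration; the source only states that both modes are used.

```python
import re

# Example entries copied from the list above; a real list would come from settings.
FORBIDDEN_COMMANDS = ["rm -rf /", "mkfs", "dd if=", "format C:"]


def is_forbidden(command, forbidden=FORBIDDEN_COMMANDS):
    """Return True if the command matches any forbidden entry.

    Assumption: single-word entries are matched on word boundaries so they
    do not fire inside a longer unrelated word; entries that contain spaces
    or punctuation are matched as plain substrings.
    """
    lowered = command.lower()
    for entry in forbidden:
        needle = entry.lower()
        if re.fullmatch(r"\w+", needle):
            # Word-boundary match for single-word entries such as "mkfs".
            if re.search(rf"\b{re.escape(needle)}\b", lowered):
                return True
        elif needle in lowered:
            # Substring match for multi-token entries such as "dd if=".
            return True
    return False
```

Note that plain substring matching is deliberately broad: in this sketch any command that contains "rm -rf /" as a prefix of a longer path is also treated as forbidden.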
4b. Restricted paths — Commands targeting paths in the restricted_paths list are sandboxed.
- Windows: Case-insensitive path matching
- Unix: Case-sensitive path matching
Example restricted paths:
- /root, /etc/shadow, /etc/passwd
- C:\Windows, C:\Windows\System32
Result: sandbox
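As a rough illustration of the platform-dependent case handling, the sketch below lowercases both sides on Windows and compares verbatim on Unix. The function name targets_restricted_path, the windows parameter, and the use of simple substring containment are assumptions; the actual evaluator may normalize or resolve paths before comparing.

```python
import sys

# Example entries copied from the list above.
RESTRICTED_PATHS = [
    "/root", "/etc/shadow", "/etc/passwd",
    "C:\\Windows", "C:\\Windows\\System32",
]


def targets_restricted_path(command, restricted=RESTRICTED_PATHS, windows=None):
    """Return True if the command text mentions a restricted path.

    windows=None means "detect from the current platform"; passing True or
    False explicitly makes the behaviour easy to exercise on any machine.
    """
    if windows is None:
        windows = sys.platform.startswith("win")
    # Windows paths are case-insensitive, so compare in lowercase there.
    haystack = command.lower() if windows else command
    for path in restricted:
        needle = path.lower() if windows else path
        if needle in haystack:
            return True
    return False
```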
Step 5: Destructive Command Check
Commands matching destructive_patterns run on the host but require explicit user confirmation first.
Example destructive patterns:
- File deletion: rm, rmdir, del, erase
- Process management: kill, killall, pkill, taskkill
- System control: shutdown, reboot, poweroff
- Database: DROP TABLE, DROP DATABASE, TRUNCATE
The evaluator uses pre-compiled regex patterns (_destructive_re_cache) for performance.
Result: host with require_confirmation=True
Step 6: Default
If no previous step matched, the command is considered safe.
Result: host
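Taken together, the six steps can be sketched as a single function that returns at the first match. This is a simplified illustration, not the evaluator's actual code: the Classification type, the force_host parameter, the abbreviated rule lists, and the naive matching are all assumptions, and the restricted path checking applied to file tools is omitted.

```python
from dataclasses import dataclass

# Abbreviated rule sets for illustration; the real lists are longer.
HOST_TOOLS = {"execute_gui", "observe_screen", "read_file", "window_list"}
FORBIDDEN = ("rm -rf /", "mkfs")
RESTRICTED = ("/etc/shadow", "/root")
DESTRUCTIVE = ("rm", "kill", "shutdown")


@dataclass(frozen=True)
class Classification:
    target: str  # "host" or "sandbox"
    require_confirmation: bool = False
    reason: str = ""


def classify(tool, command="", force_host=False):
    """Walk the cascade in order and stop at the first matching step."""
    # Step 1: the user already approved this command in the UI.
    if force_host:
        return Classification("host", reason="force host override")
    # Step 2: known tools that always run on the host.
    if tool in HOST_TOOLS:
        return Classification("host", reason="known host tool")
    # Step 3: anything unrecognized is sandboxed.
    if tool != "execute_cli":
        return Classification("sandbox", reason="unknown tool")
    # Step 4a: forbidden commands are sandboxed.
    if any(entry in command for entry in FORBIDDEN):
        return Classification("sandbox", reason="forbidden command")
    # Step 4b: commands touching restricted paths are sandboxed.
    if any(path in command for path in RESTRICTED):
        return Classification("sandbox", reason="restricted path")
    # Step 5: destructive commands run on the host after confirmation.
    if any(word in DESTRUCTIVE for word in command.split()):
        return Classification("host", True, "destructive command")
    # Step 6: everything else is considered safe.
    return Classification("host", reason="default")
```

Because each step returns immediately, the order of the checks is what gives forbidden commands precedence over the destructive confirmation flow.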
PowerShell Security Floor
Fourteen destructive PowerShell cmdlets have a hardcoded security floor that cannot be overridden via settings:
- Remove-Item, Move-Item, Stop-Process
- Restart-Computer, Stop-Computer
- Clear-Content, Clear-Item, Set-Content
- Remove-ItemProperty, Stop-Service, Remove-Service
- Invoke-Expression (iex), Format-Volume
These always trigger the destructive confirmation flow regardless of user configuration.
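One way to make such a floor non-overridable is to union it with whatever the user configures, so removing an entry from settings has no effect. The sketch below assumes that approach; the names POWERSHELL_FLOOR, effective_patterns, and hits_security_floor are hypothetical, and counting the iex alias as its own entry to reach 14 is an assumed reading of the list.

```python
import re

# The hardcoded floor, copied from the list above.
POWERSHELL_FLOOR = frozenset({
    "Remove-Item", "Move-Item", "Stop-Process",
    "Restart-Computer", "Stop-Computer",
    "Clear-Content", "Clear-Item", "Set-Content",
    "Remove-ItemProperty", "Stop-Service", "Remove-Service",
    "Invoke-Expression", "iex", "Format-Volume",
})


def effective_patterns(user_patterns):
    """User settings can add patterns but can never remove floor entries."""
    return frozenset(user_patterns) | POWERSHELL_FLOOR


def hits_security_floor(command):
    """Return True if the command invokes any floor cmdlet.

    PowerShell cmdlet names are case-insensitive, so matching ignores case.
    The lookarounds keep a name from matching inside a longer identifier.
    """
    for cmdlet in POWERSHELL_FLOOR:
        pattern = rf"(?<![\w-]){re.escape(cmdlet)}(?![\w-])"
        if re.search(pattern, command, re.IGNORECASE):
            return True
    return False
```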
Encoded PowerShell Detection
The evaluator detects Base64-encoded PowerShell commands (passed with the -EncodedCommand flag) and blocks them from running on the host. Encoding a command this way is a common technique for bypassing command-line pattern matching.
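Detection of this kind could be sketched as below. PowerShell accepts abbreviated parameter names, so the sketch also treats prefixes such as -enc as the same flag; whether the evaluator does so is not stated in the source. The function names are hypothetical. The decoding step relies on the fact that PowerShell expects the payload to be Base64 over UTF-16LE text.

```python
import base64
import re

# Matches " -Flag value" pairs where the value looks like Base64.
_FLAG_AND_PAYLOAD = re.compile(r"\s-(\w+)\s+([A-Za-z0-9+/=]+)")


def find_encoded_payload(command):
    """Return the Base64 payload of an encoded PowerShell call, or None."""
    if not re.search(r"\b(?:powershell|pwsh)\b", command, re.IGNORECASE):
        return None
    for flag, payload in _FLAG_AND_PAYLOAD.findall(command):
        flag = flag.lower()
        # Assumption: accept any prefix of "encodedcommand" plus the "ec" alias.
        if flag == "ec" or "encodedcommand".startswith(flag):
            return payload
    return None


def decode_payload(payload):
    """Decode an -EncodedCommand payload (Base64 over UTF-16LE text)."""
    return base64.b64decode(payload).decode("utf-16-le")


def blocks_host_execution(command):
    """Encoded commands are never allowed to run on the host."""
    return find_encoded_payload(command) is not None
```

Decoding is not needed to block the command, but it lets the reason shown to the user include what the payload would have run.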
Subshell Content Extraction
Commands containing subshells ($() and backtick expressions) have their inner content extracted and evaluated separately. This prevents circumventing the evaluator by wrapping dangerous commands in subshells.
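A minimal sketch of the extraction step follows. It handles nested $() expressions by counting parentheses and pairs backticks naively; it ignores shell quoting and would also treat arithmetic expansion as a subshell, so it errs on the side of extracting too much. The function name extract_subshells is hypothetical.

```python
def extract_subshells(command):
    """Return the inner text of every $(...) and `...` expression.

    Outer expressions come first, followed by anything nested inside them,
    so each fragment can be passed back through the classifier.
    """
    found = []
    i = 0
    while i < len(command):
        if command.startswith("$(", i):
            depth, j = 1, i + 2
            # Scan forward until the matching closing parenthesis.
            while j < len(command) and depth:
                if command[j] == "(":
                    depth += 1
                elif command[j] == ")":
                    depth -= 1
                j += 1
            # An unterminated expression is treated as running to the end.
            inner = command[i + 2 : j - 1] if depth == 0 else command[i + 2 :]
            found.append(inner)
            found.extend(extract_subshells(inner))
            i = j
        elif command[i] == "`":
            end = command.find("`", i + 1)
            if end == -1:
                # Unterminated backtick: take the rest of the command.
                inner = command[i + 1 :]
                found.append(inner)
                found.extend(extract_subshells(inner))
                break
            inner = command[i + 1 : end]
            found.append(inner)
            found.extend(extract_subshells(inner))
            i = end + 1
        else:
            i += 1
    return found
```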
Command Prefix Stripping
Common command prefixes are stripped before evaluation:
- sudo, env, nohup, nice, time, doas
This ensures sudo rm -rf / is evaluated the same as rm -rf /.
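Prefix stripping might be implemented along these lines. The function name strip_prefixes is hypothetical, and the handling of env variable assignments is an assumption; prefix options such as sudo -u or nice -n are not handled in this sketch.

```python
# Prefixes copied from the list above.
COMMAND_PREFIXES = {"sudo", "env", "nohup", "nice", "time", "doas"}


def strip_prefixes(command):
    """Drop leading wrapper commands so the real command is evaluated."""
    tokens = command.split()
    while tokens and tokens[0] in COMMAND_PREFIXES:
        prefix = tokens.pop(0)
        if prefix == "env":
            # Assumption: also drop VAR=value assignments that follow env.
            while tokens and "=" in tokens[0] and not tokens[0].startswith("-"):
                tokens.pop(0)
    return " ".join(tokens)
```

Looping until no prefix remains means chained wrappers such as sudo nohup nice are removed in one pass.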
Confirmation Flow
When a command is classified as destructive (require_confirmation=True):
- Server sends agent_confirmation_request to mobile with the command and reason
- Agent execution pauses
- User taps Approve or Deny on their phone
- If approved: command executes on host
- If denied: command is skipped, agent continues with next step
- If no response within timeout: automatically rejected
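The pause, approve or deny, and timeout behaviour above can be sketched with asyncio. Only the message name agent_confirmation_request and its command and reason fields come from the text; the send callback, the message shape, and the 30-second default timeout are hypothetical.

```python
import asyncio


async def request_confirmation(send, command, reason, timeout_s=30.0):
    """Ask the user to approve a command; return True only on approval.

    send is a hypothetical coroutine that delivers the message to the
    mobile client and arranges for the future to be resolved with the
    user's answer (True for Approve, False for Deny).
    """
    loop = asyncio.get_running_loop()
    response = loop.create_future()
    message = {
        "type": "agent_confirmation_request",
        "command": command,
        "reason": reason,
    }
    await send(message, response)
    try:
        # Agent execution pauses here until the user answers or time runs out.
        approved = await asyncio.wait_for(response, timeout=timeout_s)
    except asyncio.TimeoutError:
        # No response within the timeout counts as a rejection.
        return False
    return bool(approved)
```

A caller would run the command on the host when this returns True and otherwise skip it and continue with the next step.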
Performance
The classification target is under 100 ms per evaluation, so that the evaluator is never the bottleneck in the execution pipeline.
Related: Docker Sandbox · Tool Layers · Agent Execution