AI panel chat: terminal tool

The chat surface has a terminal-style tool, run_command, that lets the model execute a shell command in the project root when no structured tool fits the task. Issue #163, item F.

When the model uses it

The chat AI is steered to prefer structured tools over the terminal:

  • read_file over cat
  • search_project over grep
  • list_files over ls
  • propose_edits over sed

run_command is the last resort — useful for things like running git status, scaffolding via a CLI, executing the project’s test runner, or asking a build tool to regenerate a derived file. The model is told in its system prompt to surface a clear reason to the writer before reaching for it.
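The preference list above can be read as a lookup from a shell command to the structured tool that covers it. The sketch below is illustrative only: the source says the steering lives in the system prompt, and the names STRUCTURED_ALTERNATIVES and structured_alternative are hypothetical, not part of the project.

```python
import shlex
from typing import Optional

# Hypothetical table mirroring the preference list; the real steering is done
# through the system prompt, not through code like this.
STRUCTURED_ALTERNATIVES = {
    "cat": "read_file",
    "grep": "search_project",
    "ls": "list_files",
    "sed": "propose_edits",
}


def structured_alternative(command: str) -> Optional[str]:
    """Return the structured tool that covers this command, or None."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: we cannot tell what the first word is.
        return None
    if not tokens:
        return None
    return STRUCTURED_ALTERNATIVES.get(tokens[0])
```

A result of None (for example for git status) corresponds to the case where run_command is the reasonable choice.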

Confirmation flow

Every run_command call opens the “Run command?” modal, which shows:

  • The exact command line the model wants to run
  • A short hint about what gets fed back (stdout, stderr, exit code, duration)

Run executes the command. Cancel declines and feeds “User declined to run this command.” back to the model. Esc is equivalent to Cancel, and Cmd/Ctrl-Enter is equivalent to Run. If the chat call is cancelled while the modal is open, the modal closes cleanly, so no button is left on screen with nothing to receive the answer.
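A minimal sketch of that decision logic, assuming a key-press handler and a runner callback. The function names and the "approve" / "decline" strings are hypothetical. Only the declined message is taken from the text above.

```python
from typing import Callable, Optional

# Message fed back to the model when the writer declines (from the text above).
DECLINED_MESSAGE = "User declined to run this command."


def decision_for_key(key: str, ctrl: bool = False, meta: bool = False) -> Optional[str]:
    """Map a key press inside the modal to 'approve', 'decline', or None."""
    if key == "Escape":
        return "decline"
    if key == "Enter" and (ctrl or meta):
        return "approve"
    return None  # any other key leaves the modal open


def resolve_confirmation(decision: str, command: str, run: Callable[[str], str]) -> str:
    """Return the text that goes back to the model for this decision."""
    if decision == "approve":
        return run(command)
    return DECLINED_MESSAGE
```

Treating everything other than an explicit approval as a decline is a design assumption here. It keeps the failure mode safe: nothing runs unless the writer clearly said yes.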

Hard limits

  • 60 seconds wall-clock timeout. The runner kills the process at the cap and reports “command timed out”.
  • 16,000 characters per stream. stdout and stderr are each capped independently and tagged […truncated] if they hit the limit.
  • CWD is pinned to the project root. Each run_command call starts there, and a cd does not persist beyond the call. A cd inside the shell line is fine: subsequent commands in the same line inherit it, but the next call starts back at the project root.
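The per-stream cap can be sketched as below. The 16,000-character limit and the […truncated] tag come from the list above. Two things are assumptions: the tag is appended after the kept text rather than counted inside the cap, and a stream of exactly 16,000 characters is left untagged. cap_stream is a hypothetical name.

```python
STREAM_LIMIT = 16_000            # characters per stream, from the limits above
TRUNCATION_TAG = "[…truncated]"  # tag named in the limits above


def cap_stream(text: str, limit: int = STREAM_LIMIT) -> str:
    """Cap one stream; stdout and stderr are each passed through separately."""
    if len(text) <= limit:
        return text
    # Assumption: keep the first `limit` characters and append the tag.
    return text[:limit] + TRUNCATION_TAG
```

Because the cap is applied per stream, a noisy stdout cannot crowd out a short but important stderr message.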

The shell:

  • Windows → cmd.exe /d /s /c <command>
  • macOS / Linux → /bin/sh -c <command>
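As a rough illustration of how the shell choice, the pinned CWD, and the 60-second cap fit together, here is a sketch using Python's standard subprocess module. It is not the project's runner: shell_argv, run_in_project, and the shape of the result dictionary are hypothetical.

```python
import subprocess
import sys
import time

TIMEOUT_SECONDS = 60  # wall-clock cap from the hard limits


def shell_argv(command: str, platform: str = sys.platform) -> list:
    """Build the argv for the platform shell, as listed above."""
    if platform.startswith("win"):
        return ["cmd.exe", "/d", "/s", "/c", command]
    return ["/bin/sh", "-c", command]


def run_in_project(command: str, project_root: str,
                   timeout: float = TIMEOUT_SECONDS) -> dict:
    """Run one shell line with the CWD set to project_root."""
    started = time.monotonic()
    try:
        done = subprocess.run(
            shell_argv(command),
            cwd=project_root,     # every call starts in the project root
            capture_output=True,  # collect stdout and stderr separately
            text=True,
            timeout=timeout,      # subprocess.run kills the child on expiry
        )
        result = {"stdout": done.stdout, "stderr": done.stderr,
                  "exit_code": done.returncode, "timed_out": False}
    except subprocess.TimeoutExpired:
        result = {"stdout": "", "stderr": "command timed out",
                  "exit_code": None, "timed_out": True}
    result["duration_ms"] = int((time.monotonic() - started) * 1000)
    return result
```

Because cwd is passed on every call, a cd inside one shell line cannot leak into the next call. One caveat of this sketch: subprocess.run kills only the direct child on timeout, so a real runner may need to terminate the whole process tree.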

What it surfaces in chat

When the model invokes the tool, the chat shows a tool-activity card with the (truncated) command line. After the run settles, the card flips to a status such as exit 0 · 412ms (or failed: <message>, or timeout · 60000ms when the cap is hit). The output, capped as described under Hard limits, goes back to the model so it can read it and continue the conversation. The chat container deliberately does not display the raw output: the writer reads the model’s interpretation rather than scrolling through 16k characters of build noise.
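The three card states can be sketched as a small formatter. The label shapes are taken from the examples above. The function name, its parameters, and the rule that a non-zero exit code is shown in the same exit N form are assumptions.

```python
from typing import Optional


def status_label(exit_code: Optional[int], duration_ms: int,
                 timed_out: bool = False, error: Optional[str] = None) -> str:
    """Format the settled state of the tool-activity card."""
    if timed_out:
        return f"timeout · {duration_ms}ms"
    if error is not None:
        # For example, the process could not be spawned at all.
        return f"failed: {error}"
    return f"exit {exit_code} · {duration_ms}ms"
```

For example, status_label(0, 412) gives exit 0 · 412ms, matching the sample shown in the paragraph above.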

Cost notes

The runner itself doesn’t call any AI model; it just spawns a child process. Normal token cost applies when the formatted output is sent back to the model, which is why stdout and stderr are each capped at 16,000 characters: a runaway test run would otherwise blow the prompt budget on the next iteration.
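To put a rough number on that budget: with two streams capped at 16,000 characters each, one call can feed back at most about 32,000 characters of output, ignoring the truncation tags and any formatting around them. The sketch below converts that to tokens using 4 characters per token. That ratio is an assumption chosen for illustration only; the real figure depends on the model's tokenizer and on the content.

```python
STREAM_LIMIT = 16_000  # per-stream cap from the hard limits
STREAMS = 2            # stdout and stderr
CHARS_PER_TOKEN = 4    # ASSUMPTION: rule of thumb, not a measured value


def worst_case_output_chars() -> int:
    """Upper bound on output characters fed back from one call."""
    return STREAMS * STREAM_LIMIT


def rough_token_estimate(chars: int, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Ceiling division, so a partial token still counts as one."""
    return -(-chars // chars_per_token)
```

Under that assumed ratio, the worst case is on the order of 8,000 tokens per call.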