
Your terminal is a programmable screen. Read what’s on it, press keys into it, wait for it to settle — all through HTTP endpoints that understand the terminal’s actual rendering state.

Traditional terminal automation sends raw bytes and hopes for the best. hoody-terminal takes a different approach: it maintains a server-side VT parser (libvterm) that mirrors the exact screen state a human would see. Every automation endpoint operates on this parsed screen, giving you deterministic control over full-screen TUI applications like vim, htop, tmux, and any interactive program.

Inspired by tui-use, but built server-side in C with libvterm. No client-side dependencies. No browser needed. Just HTTP.


Traditional terminal scripting (/api/v1/terminal/execute) works great for commands that produce output and exit. But what about programs that take over the screen?

  • vim — you need to navigate, type, save, quit
  • htop — you need to select processes, sort columns, send signals
  • python3 — you need to feed expressions and read results from a REPL
  • ssh — you need to wait for a password prompt, type credentials, then wait for the shell
  • tmux — you need to create panes, switch windows, send commands to specific panes

These programs don’t write to stdout/stderr. They draw on the terminal screen using escape sequences. The automation endpoints let you interact with them the way a human would: by reading the screen and pressing keys.

Throughout this page, all examples use a $TERMINAL variable for the base URL:

TERMINAL="https://$PROJECT-$CONTAINER-terminal-1.$SERVER.containers.hoody.icu"

Set this once and every curl/fetch example below works as-is.


Here’s a complete workflow: launch Python, run a calculation, read the result.

Step 1: Start a Python REPL

# Start python3 (don't wait -- it takes over the screen)
curl -X POST "$TERMINAL/api/v1/terminal/execute" \
-H "Content-Type: application/json" \
-d '{"command": "python3", "wait": false}'

Step 2: Wait for the Python prompt

# Wait until ">>>" appears on screen
curl -X POST "$TERMINAL/api/v1/terminal/wait?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"mode": "regex", "pattern": ">>> $", "timeout_ms": 5000}'

Step 3: Type a calculation and press Enter

# Paste the expression
curl -X POST "$TERMINAL/api/v1/terminal/paste?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"text": "2 ** 256"}'
# Press Enter
curl -X POST "$TERMINAL/api/v1/terminal/press?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"key": "enter"}'

Step 4: Wait for the result and read the screen

# Wait for the next prompt (means the result has been printed)
curl -X POST "$TERMINAL/api/v1/terminal/wait?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"mode": "regex", "pattern": ">>> $", "timeout_ms": 5000}'
# Read the screen
curl "$TERMINAL/api/v1/terminal/snapshot?terminal_id=1"

All automation endpoints live under /api/v1/terminal/ and require a terminal_id parameter (query string or URL path).

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/v1/terminal/snapshot | GET | Rendered viewport: lines, cursor, title, fullscreen state, highlights, sequence counter |
| /api/v1/terminal/find | GET | PCRE2 regex search on rendered screen with cell-coordinate hits |
| /api/v1/terminal/press | POST | Send named key presses (mode-aware: respects DECCKM/DECKPAM) |
| /api/v1/terminal/write | POST | Raw byte injection — escape hatch when /press and /paste don’t fit |
| /api/v1/terminal/paste | POST | Bracketed paste with full UTF-8 support |
| /api/v1/terminal/wait | POST | Async wait until stable/regex-match/either, returns atomic snapshot |

Full API reference: see the Terminal API Reference for complete OpenAPI docs with all parameters, response schemas, and error codes.


GET /api/v1/terminal/snapshot

Returns the terminal screen exactly as a human would see it: a grid of text lines, the cursor position, the window title, and whether the program is in fullscreen (alt-screen) mode.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| terminal_id | string | required | Terminal session ID (1-65535) |
| include_colors | boolean | false | Include ANSI SGR colored_lines array alongside plain text |
| include_highlights | boolean | true | Include reverse-video highlight spans |
| scroll_offset | integer | 0 | Lines into scrollback (0 = live viewport) |

# Basic snapshot
curl "$TERMINAL/api/v1/terminal/snapshot?terminal_id=1"
# With colors and scrollback
curl "$TERMINAL/api/v1/terminal/snapshot?terminal_id=1&include_colors=true&scroll_offset=10"
{
"terminal_id": "1",
"cols": 80,
"rows": 24,
"lines": [
"$ ls -la",
"total 16",
"drwxr-xr-x 3 user user 4096 .",
""
],
"cursor": {
"row": 2,
"col": 0,
"visible": true
},
"title": "bash",
"is_fullscreen": false,
"scroll_offset": 0,
"seq": 42,
"highlights": [
{ "row": 0, "col": 2, "length": 5 }
]
}

Key fields:

  • lines — Array of strings, one per visible row. Trailing whitespace is trimmed. Empty rows appear as empty strings.
  • cursor — Row/col position (0-indexed) and visibility. Programs like vim move the cursor; shell prompts park it at the input position.
  • is_fullscreen is true when the program has switched to the alternate screen buffer (vim, htop, less, tmux). Useful for knowing whether you’re in a TUI or at a shell prompt.
  • seq — Monotonic sequence counter. Increments on every screen update. Use this to detect whether the screen has changed between two snapshots without comparing all lines.
  • highlights — Reverse-video spans (used by search highlights, selection, etc.). Each entry has row, col, length.
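
As a rough sketch of how these fields might be consumed from JavaScript, the helpers below pull out the text under each highlight span and compare seq between two snapshots. The names highlightedText and screenChanged are hypothetical, not part of the API, and the slicing assumes one string character per cell, which holds for plain ASCII output but not necessarily for wide characters.

```javascript
// Hypothetical helpers operating on a parsed /snapshot response.
// Assumption: one string character per terminal cell (plain ASCII output).
function highlightedText(snapshot) {
  return (snapshot.highlights || []).map(({ row, col, length }) =>
    (snapshot.lines[row] || '').slice(col, col + length)
  );
}

// seq is documented as monotonic, so a different value means the screen was updated.
function screenChanged(before, after) {
  return before.seq !== after.seq;
}

// Trimmed copy of the example response shown above.
const exampleSnapshot = {
  lines: ['$ ls -la', 'total 16', 'drwxr-xr-x 3 user user 4096 .', ''],
  seq: 42,
  highlights: [{ row: 0, col: 2, length: 5 }]
};

console.log(highlightedText(exampleSnapshot)); // [ 'ls -l' ]
console.log(screenChanged(exampleSnapshot, { seq: 43 })); // true
```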

Set scroll_offset to read lines that have scrolled off the top of the screen. A value of 10 means “show me what was on screen 10 lines ago.” The viewport is rows lines tall, so scroll_offset=24 on an 80x24 terminal shows the previous full page.

The scrollback buffer holds up to 500 lines by default (configurable with --vterm-scrollback-lines, max 10000).
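
To read the history one page at a time, the scroll_offset values can be generated in steps of rows. The sketch below only computes the offsets; pageOffsets is a hypothetical helper, and it assumes offsets beyond the configured scrollback size are not useful to request.

```javascript
// Hypothetical helper: scroll_offset values for paging backwards through scrollback,
// one viewport (rows lines) at a time, starting at the live viewport (offset 0).
function pageOffsets(rows, scrollbackLines) {
  if (rows <= 0) throw new Error('rows must be positive');
  const offsets = [];
  for (let offset = 0; offset <= scrollbackLines; offset += rows) {
    offsets.push(offset);
  }
  return offsets;
}

console.log(pageOffsets(24, 100)); // [ 0, 24, 48, 72, 96 ]
```

Each offset would then be passed as scroll_offset in a separate /snapshot request.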


GET /api/v1/terminal/find

Search the rendered terminal screen for a PCRE2 regular expression. Returns cell-coordinate hits with matched text — useful for locating specific output, error messages, or UI elements on the screen.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| terminal_id | string | required | Terminal session ID |
| pattern | string | required | PCRE2 regex pattern (max 1024 bytes) |
| scope | string | "screen" | Where to search: screen, scrollback, or all |
| limit | integer | 100 | Max hits to return (max 1000) |
| case_insensitive | boolean | false | Case-insensitive matching |
| scroll_offset | integer | 0 | Scrollback offset for screen scope (0 = live viewport) |

# Find all error messages on screen
curl "$TERMINAL/api/v1/terminal/find?terminal_id=1&pattern=error&case_insensitive=true"
# Find IP addresses on screen and in scrollback
# (-G with --data-urlencode makes curl percent-encode the regex; a literal {1,3} in the URL would otherwise trigger curl's URL globbing)
curl -G "$TERMINAL/api/v1/terminal/find" \
--data-urlencode 'terminal_id=1' \
--data-urlencode 'pattern=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}' \
--data-urlencode 'scope=all'
# Find the shell prompt
curl "$TERMINAL/api/v1/terminal/find?terminal_id=1&pattern=%5C%24%20%24"
# (URL-encoded "\$ $": literal dollar, space, end of line)
{
"pattern": "error",
"scope": "screen",
"hits": [
{ "row": 5, "col": 12, "length": 5, "text": "error" },
{ "row": 8, "col": 0, "length": 5, "text": "Error" }
],
"total": 2,
"truncated": false
}

Key fields:

  • hits — Array of matches with cell coordinates (0-indexed row/col), length in characters, and matched text.
  • truncated is true if the number of hits reached the limit. Increase limit or narrow your pattern.
  • scope — Echoes back which scope was searched.

| Scope | Description |
| --- | --- |
| screen | Visible viewport only (default). Fast, covers what a user would see. |
| scrollback | Only the scrollback buffer (lines that scrolled off the top). |
| all | Both screen and scrollback. Use when you’re not sure where the match is. |
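
Because regex patterns are full of characters that are special in URLs (backslash, +, %, $, braces), it is usually safer to build the query string programmatically than by hand. A minimal sketch, assuming a hypothetical buildFindUrl helper and a placeholder base URL:

```javascript
// Hypothetical helper: builds a /find URL with every parameter percent-encoded.
function buildFindUrl(baseUrl, params) {
  const query = Object.entries(params)
    .map(([name, value]) => `${encodeURIComponent(name)}=${encodeURIComponent(String(value))}`)
    .join('&');
  return `${baseUrl}/api/v1/terminal/find?${query}`;
}

// 'https://terminal.example' is a placeholder, not a real endpoint.
const findUrl = buildFindUrl('https://terminal.example', {
  terminal_id: 1,
  pattern: '\\$ $', // regex: literal dollar, space, end of line
  scope: 'screen'
});

console.log(findUrl);
// https://terminal.example/api/v1/terminal/find?terminal_id=1&pattern=%5C%24%20%24&scope=screen
```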

POST /api/v1/terminal/press

Send named key presses to the terminal. Keys are encoded through libvterm’s keyboard API, which respects the terminal’s current mode (DECCKM for cursor keys, DECKPAM for keypad). This means arrow keys, function keys, and ctrl sequences automatically generate the correct byte sequences for whatever program is running.

{
"key": "enter"
}

Or send multiple keys in sequence:

{
"keys": ["escape", ":", "w", "q", "enter"]
}

| Parameter | Type | Description |
| --- | --- | --- |
| terminal_id | string (query) | Terminal session ID (required) |
| key | string (body) | Single key name |
| keys | string[] (body) | Array of key names to press in sequence (max 256) |

# Press Enter
curl -X POST "$TERMINAL/api/v1/terminal/press?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"key": "enter"}'
# Press Ctrl+C (interrupt)
curl -X POST "$TERMINAL/api/v1/terminal/press?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"key": "ctrl+c"}'
# Type ":wq" and press Enter (save and quit vim)
curl -X POST "$TERMINAL/api/v1/terminal/press?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"keys": ["escape", ":", "w", "q", "enter"]}'
# Navigate with arrow keys
curl -X POST "$TERMINAL/api/v1/terminal/press?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"keys": ["arrow_down", "arrow_down", "arrow_down", "enter"]}'
{
"status": "ok",
"bytes_written": 7
}

All key names are case-insensitive. Many keys have aliases for convenience.

Special Keys:

| Key | Aliases | Description |
| --- | --- | --- |
| enter | return, cr, ctrl+m, c-m | Enter/Return key |
| tab | ctrl+i, c-i | Tab key |
| escape | esc, ctrl+[, c-[ | Escape key |
| backspace | bs, ctrl+h, c-h | Backspace key |
| space | | Space bar |
| backtab | shift+tab, s-tab | Shift+Tab (reverse tab) |

Arrow Keys:

| Key | Aliases | Description |
| --- | --- | --- |
| arrow_up | up | Up arrow |
| arrow_down | down | Down arrow |
| arrow_left | left | Left arrow |
| arrow_right | right | Right arrow |

Navigation:

| Key | Aliases | Description |
| --- | --- | --- |
| home | | Home key |
| end | | End key |
| page_up | pgup, pageup | Page Up |
| page_down | pgdn, pagedown | Page Down |
| insert | ins | Insert key |
| delete | del | Delete key |

Function Keys:

| Key | Description |
| --- | --- |
| f1 through f12 | Function keys F1-F12 |

Ctrl Combinations:

| Key | Aliases | Byte | Description |
| --- | --- | --- | --- |
| ctrl+a through ctrl+z | c-a through c-z | 0x01-0x1A | Ctrl+letter. ctrl+h -> backspace, ctrl+i -> tab, ctrl+m -> enter |
| ctrl+space | c-space, ctrl+@, c-@ | 0x00 | NUL byte |
| ctrl+j | c-j | 0x0A | Raw line feed (LF) — distinct from enter which may send CR |
| ctrl+\\ | c-\\ | 0x1C | SIGQUIT in shell |
| ctrl+] | c-] | 0x1D | Ctrl+Right bracket |
| ctrl+^ | c-^ | 0x1E | Ctrl+Caret |
| ctrl+_ | c-_ | 0x1F | Ctrl+Underscore |
| ctrl+? | c-? | 0x7F | DEL character |

Modified Keys:

| Key | Description |
| --- | --- |
| shift+arrow_up/down/left/right | Shift+Arrow (text selection in some programs) |
| ctrl+arrow_up/down/left/right | Ctrl+Arrow (word navigation in some programs) |
| alt+enter | Alt+Enter |
| alt+backspace | Alt+Backspace (delete word in zsh/bash) |

Single Characters:

Any single printable ASCII character (! through ~, plus space) can be used as a key name. For example, {"key": "a"} presses the letter “a”, {"key": "!"} presses exclamation mark.


POST /api/v1/terminal/paste

Paste text into the terminal with optional bracketed paste mode. This is the preferred way to send multi-character text (commands, code snippets, file content) into the terminal.

{
"text": "echo 'Hello, World!'",
"bracketed": true
}

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| terminal_id | string (query) | required | Terminal session ID |
| text | string (body) | required | Text to paste (UTF-8) |
| bracketed | boolean (body) | true | Use bracketed paste mode if the program supports it |

# Paste a command (in a shell, bracketed paste keeps embedded newlines from executing immediately)
curl -X POST "$TERMINAL/api/v1/terminal/paste?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"text": "echo Hello, World!"}'
# Paste multi-line code into vim
curl -X POST "$TERMINAL/api/v1/terminal/paste?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"text": "def hello():\n print(\"Hello!\")\n\nhello()"}'
# Paste without bracketed mode (raw keystrokes)
curl -X POST "$TERMINAL/api/v1/terminal/paste?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"text": "ls -la", "bracketed": false}'
{
"status": "ok",
"bytes_written": 42,
"bracketed_active": true
}

Key fields:

  • bytes_written — Number of bytes written to the terminal PTY.
  • bracketed_active is true if the program had DECSET 2004 enabled and bracketed paste markers were actually sent. It is false if the program doesn’t support bracketed paste (the text was still sent, just without markers).

| Use | Paste | Press |
| --- | --- | --- |
| Multi-character text | Preferred — single HTTP call, bracketed paste protection | Works but requires one key per character |
| Special keys | Cannot send Enter, Escape, Ctrl+C, etc. | Designed for this |
| Code with newlines | Handles \n correctly with bracketed paste | Each line would need separate Enter presses |
| UTF-8 / emoji / CJK | Full support | Single printable ASCII only |
| Speed | Fast — single write | Sequential key-by-key |

When bracketed is true (default), the text is wrapped in the bracketed-paste escape sequences (\e[200~ before the text, \e[201~ after it) if the running program has opted in via DECSET 2004. Most modern programs support this:

  • zsh, bash (readline), fish — Yes. Prevents auto-execution of pasted newlines.
  • vim, neovim — Yes. Prevents auto-indent mangling of pasted code.
  • python REPL — No (unless using IPython). Text is pasted as raw keystrokes.
  • htop, top — No. These aren’t text input programs.

When bracketed is false, the text is sent as raw keystrokes regardless of the program’s paste mode setting.
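
The rules above can be summarized as a small decision function. This is only an illustration of the observable behavior described here, not the server's implementation; bytesForPaste and the programOptedIn flag are hypothetical names standing in for the program's DECSET 2004 state.

```javascript
// Standard bracketed-paste markers (ESC [ 200 ~ and ESC [ 201 ~).
const PASTE_START = '\x1b[200~';
const PASTE_END = '\x1b[201~';

// Illustrative only: markers are sent when the caller asked for bracketed paste
// AND the running program has opted in (DECSET 2004). Otherwise the text goes through as-is.
function bytesForPaste(text, { bracketed = true, programOptedIn = false } = {}) {
  const active = bracketed && programOptedIn;
  return {
    payload: active ? PASTE_START + text + PASTE_END : text,
    bracketed_active: active
  };
}

console.log(bytesForPaste('ls -la', { programOptedIn: true }).bracketed_active); // true
console.log(bytesForPaste('ls -la', { programOptedIn: false }).payload); // ls -la
```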


POST /api/v1/terminal/write

/write is the raw-byte escape hatch for terminal automation. It injects bytes directly into the session’s PTY master fd, exactly as if typed at a physical keyboard. Use it when you need to send escape sequences the /press key table doesn’t cover, or when you want byte-level control over what hits the shell.

{
"input": "y",
"enter": true
}

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| terminal_id | string (query) | required | Terminal session ID |
| input | string (body) | required | Text to type (UTF-8). Empty string is valid — sends just an Enter if enter=true. |
| enter | boolean (body) | true | Auto-append a newline after input. Set to false for raw-keystroke input. |

# Answer a y/n prompt
curl -X POST "$TERMINAL/api/v1/terminal/write?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"input": "y"}'
# Send raw bytes without an auto-Enter (ESC [ 6 ~ is the Page Down sequence)
curl -X POST "$TERMINAL/api/v1/terminal/write?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"input": "\u001b[6~", "enter": false}'
# Just press Enter
curl -X POST "$TERMINAL/api/v1/terminal/write?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"input": ""}'
{ "success": true, "terminal_id": "1", "bytes_written": 2 }

POST /api/v1/terminal/wait

Block until a condition is met, then return an atomic snapshot of the screen at the exact moment of resolution. This is the key endpoint that eliminates sleep-polling from terminal automation scripts.

The word “atomic” matters here: the snapshot is captured at the same instant the condition resolves. If you called /wait and /snapshot as two separate requests, the screen could change between them (a TOCTOU race). With /wait, the snapshot in the response is guaranteed to reflect the state that matched your condition.

{
"mode": "regex",
"pattern": "\\$ $",
"timeout_ms": 10000,
"debounce_ms": 100
}

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| terminal_id | string (query) | required | Terminal session ID |
| mode | string (body) | "stable" | Wait mode: stable, regex, or either |
| pattern | string (body) | | PCRE2 regex (required for regex and either modes, max 1024 bytes) |
| timeout_ms | integer (body) | 5000 | Hard deadline in ms (10-300000) |
| debounce_ms | integer (body) | 100 | Stable mode debounce in ms (10-60000) |
| search_scope | string (body) | "screen" | Where to search: screen, scrollback, or all |
| include_colors | boolean (body) | false | Include colored_lines in snapshot |
| include_highlights | boolean (body) | true | Include highlights in snapshot |

stable — Wait until the screen stops changing. The endpoint watches the terminal’s sequence counter (seq) and resolves when no screen updates arrive for debounce_ms consecutive milliseconds. Think of debounce_ms as “how long must the screen be quiet before I consider it settled.” Use this when you don’t know what the output will look like, but you know the program will eventually stop printing.
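
To make the debounce semantics concrete, here is a small client-side simulation. It is a sketch of the behavior described above, not the server's code: simulateStableWait is a hypothetical function, time starts at 0 when the wait begins, and it assumes the quiet window is measured from the start of the wait or from the most recent screen update.

```javascript
// Simulates when a "stable" wait would resolve, given the times (in ms since the
// wait started) at which screen updates arrive. Illustrative assumption only.
function simulateStableWait(updateTimesMs, debounceMs, timeoutMs) {
  let lastChange = 0; // the wait itself starts the first quiet window
  const sorted = [...updateTimesMs].sort((a, b) => a - b);
  for (const t of sorted) {
    if (t - lastChange >= debounceMs) break; // screen was already quiet long enough
    lastChange = t; // an update arrived in time and restarts the quiet window
  }
  const resolveAt = lastChange + debounceMs;
  return resolveAt <= timeoutMs
    ? { status: 'stable', at: resolveAt }
    : { status: 'timeout', at: timeoutMs };
}

// Updates at 50 ms and 120 ms, then silence until 400 ms: settles 100 ms after the 120 ms update.
console.log(simulateStableWait([50, 120, 400], 100, 5000)); // { status: 'stable', at: 220 }
```

A program that keeps printing faster than debounce_ms never settles, so the simulated wait ends with timeout, which is why a larger timeout_ms or the either mode is often the safer choice for chatty programs.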

# Wait until output settles (500ms of quiet)
curl -X POST "$TERMINAL/api/v1/terminal/wait?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"mode": "stable", "debounce_ms": 500, "timeout_ms": 30000}'

regex — Wait until a PCRE2 pattern matches on the screen. Resolves the instant the match appears, returning the match coordinates alongside the snapshot.

# Wait for shell prompt
curl -X POST "$TERMINAL/api/v1/terminal/wait?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"mode": "regex", "pattern": "\\$ $", "timeout_ms": 10000}'
# Wait for a specific error message
curl -X POST "$TERMINAL/api/v1/terminal/wait?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"mode": "regex", "pattern": "BUILD (SUCCESS|FAILED)", "timeout_ms": 60000}'

either — First condition wins: resolves on regex match OR stability, whichever comes first. Useful when you’re not sure if the program will produce a specific prompt or just stop outputting.

# Wait for either a prompt or 2 seconds of quiet
curl -X POST "$TERMINAL/api/v1/terminal/wait?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"mode": "either", "pattern": "[\\$#>] $", "debounce_ms": 2000, "timeout_ms": 30000}'
{
"status": "matched",
"elapsed_ms": 423,
"match": {
"row": 10,
"col": 2,
"length": 2,
"text": "$ "
},
"snapshot": {
"terminal_id": "1",
"cols": 80,
"rows": 24,
"lines": ["..."],
"cursor": { "row": 10, "col": 4, "visible": true },
"title": "bash",
"is_fullscreen": false,
"seq": 42
}
}

Status values:

| Status | Meaning |
| --- | --- |
| matched | Regex pattern matched on screen |
| stable | Screen was stable for debounce_ms (no regex match in either mode) |
| timeout | Neither condition met before timeout_ms |
| exited | Underlying process died mid-wait. Includes snapshot. |
| vterm_reinit | VT parser was torn down and re-initialized mid-wait (memory-cap resize). Client should retry; no match or snapshot returned. |

Always check all five statuses; treating only matched/stable as success and ignoring exited/vterm_reinit can cause silent failures.
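
One way to make sure no status is silently dropped is to funnel every /wait response through a single function. The sketch below is one possible shape; interpretWait and its return fields (ok, retry, reason) are hypothetical and not part of the API.

```javascript
// Hypothetical helper: maps each documented /wait status to an explicit decision.
function interpretWait(result) {
  switch (result.status) {
    case 'matched':
    case 'stable':
      return { ok: true, retry: false, snapshot: result.snapshot };
    case 'timeout':
      return { ok: false, retry: false, reason: 'condition not met before timeout_ms' };
    case 'exited':
      // The process died; the response still carries a final snapshot.
      return { ok: false, retry: false, reason: 'process exited', snapshot: result.snapshot };
    case 'vterm_reinit':
      // No match or snapshot is returned; the documented advice is to retry.
      return { ok: false, retry: true, reason: 'VT parser re-initialized' };
    default:
      throw new Error(`Unexpected wait status: ${result.status}`);
  }
}

console.log(interpretWait({ status: 'vterm_reinit' }).retry); // true
```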


Every automation endpoint returns standard HTTP status codes. Handle these in your scripts to build robust automation.

| Code | Meaning | When It Happens |
| --- | --- | --- |
| 200 | Success | Request completed normally |
| 400 | Bad request | Invalid key name, malformed regex, missing required parameter, body too large |
| 404 | Session not found | The terminal_id doesn’t exist or the session has been terminated |
| 429 | Too many waiters | More than 16 concurrent /wait requests on the same session |
| 503 | Resource exhausted | libvterm memory cap exceeded — too many concurrent automation sessions |

All errors return a JSON body with an error field and a human-readable message:

{
"error": "invalid_key",
"message": "Unknown key name 'crtl+c'. Did you mean 'ctrl+c'?",
"supported_keys": ["enter", "tab", "escape", "..."]
}
# Check HTTP status code with -w
HTTP_CODE=$(curl -s -o /tmp/resp.json -w '%{http_code}' \
-X POST "$TERMINAL/api/v1/terminal/press?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"key": "enter"}')
case "$HTTP_CODE" in
200) echo "Key sent" ;;
400) echo "Bad request: $(cat /tmp/resp.json)" ;;
404) echo "Session does not exist" ;;
429) echo "Too many waiters -- wait for existing ones to resolve" ;;
503) echo "Server overloaded -- back off and retry" ;;
*) echo "Unexpected HTTP status: $HTTP_CODE" ;;
esac
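
In JavaScript, the same distinction between retryable and non-retryable codes can be captured in a small backoff function. This is a sketch under assumptions: retryDelayMs is a hypothetical helper, and the base delay, cap, and exponential schedule are arbitrary choices rather than values recommended by the API.

```javascript
// Hypothetical helper: returns how long to wait before retrying, or null if the
// status should not be retried. 429 and 503 are the transient codes listed above.
function retryDelayMs(httpStatus, attempt, baseMs = 250, maxMs = 8000) {
  const retryable = httpStatus === 429 || httpStatus === 503;
  if (!retryable) return null;
  return Math.min(maxMs, baseMs * 2 ** attempt); // exponential backoff with a cap
}

console.log(retryDelayMs(503, 0)); // 250
console.log(retryDelayMs(429, 3)); // 2000
console.log(retryDelayMs(404, 0)); // null
```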

Reusable building blocks that show up in most automation scripts. Copy these into your projects.
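
The JavaScript examples on this page call wait, paste, press, and typeString helpers. A minimal sketch of what such helpers could look like is shown here; createTerminalHelpers is a hypothetical factory, the argument shapes are assumptions chosen to match how the examples call them, and the fetch implementation is injectable so the helpers can be exercised without a server.

```javascript
// Hypothetical factory: returns helpers bound to one terminal session.
// fetchImpl defaults to the global fetch (Node 18+ or a browser).
function createTerminalHelpers(baseUrl, terminalId, fetchImpl = fetch) {
  async function post(endpoint, body) {
    const url = `${baseUrl}/api/v1/terminal/${endpoint}?terminal_id=${encodeURIComponent(terminalId)}`;
    const res = await fetchImpl(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body)
    });
    if (!res.ok) throw new Error(`${endpoint} failed with HTTP ${res.status}`);
    return res.json();
  }

  return {
    // wait({ mode, pattern, timeout_ms, ... }) resolves to the /wait response
    wait: (options) => post('wait', options),
    // paste(text) uses bracketed paste by default
    paste: (text, bracketed = true) => post('paste', { text, bracketed }),
    // press('enter') or press(['escape', ':', 'w', 'q', 'enter'])
    press: (keyOrKeys) =>
      post('press', Array.isArray(keyOrKeys) ? { keys: keyOrKeys } : { key: keyOrKeys }),
    // typeString('yes') sends one key per character (keep it under 256 keys)
    typeString: (text) => post('press', { keys: [...text] })
  };
}
```

Typical usage would be const { wait, paste, press, typeString } = createTerminalHelpers(TERMINAL, 1); at the top of a script.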

If you need to type text character-by-character instead of pasting (some TUI programs don’t support paste), split the string into individual key presses:

# Type "hello" key by key
curl -X POST "$TERMINAL/api/v1/terminal/press?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"keys": ["h", "e", "l", "l", "o"]}'
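
For longer strings, the split has to respect the documented limit of 256 keys per /press request and the restriction to printable ASCII. A possible sketch follows; toKeyBatches is a hypothetical helper that only builds request bodies and sends nothing.

```javascript
// Hypothetical helper: turns a string into one or more /press request bodies,
// each with at most maxKeys single-character key names.
function toKeyBatches(text, maxKeys = 256) {
  const keys = [...text];
  for (const ch of keys) {
    // Only space through ~ are valid single-character key names.
    if (ch < ' ' || ch > '~') {
      throw new Error(`Not a printable ASCII key: ${JSON.stringify(ch)}`);
    }
  }
  const batches = [];
  for (let i = 0; i < keys.length; i += maxKeys) {
    batches.push({ keys: keys.slice(i, i + maxKeys) });
  }
  return batches;
}

console.log(JSON.stringify(toKeyBatches('hello')));
// [{"keys":["h","e","l","l","o"]}]
```

Text containing UTF-8 beyond ASCII would have to go through /paste instead.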

The most common pattern: paste a command, press Enter, wait for the prompt, read the output.

# Paste command, press Enter, wait for prompt, read screen
curl -X POST "$TERMINAL/api/v1/terminal/paste?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"text": "uname -a"}'
curl -X POST "$TERMINAL/api/v1/terminal/press?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"key": "enter"}'
RESULT=$(curl -s -X POST "$TERMINAL/api/v1/terminal/wait?terminal_id=1" \
-H "Content-Type: application/json" \
-d '{"mode": "regex", "pattern": "\\$ $", "timeout_ms": 10000}')
echo "$RESULT" | jq -r '.snapshot.lines[]'

Use the is_fullscreen field from a snapshot to detect whether a full-screen program (vim, htop, less, tmux) is currently active:

# Check if we're in a TUI or at a shell prompt
IS_TUI=$(curl -s "$TERMINAL/api/v1/terminal/snapshot?terminal_id=1" | jq '.is_fullscreen')
if [ "$IS_TUI" = "true" ]; then
echo "A full-screen program is running -- press 'q' or Ctrl+C to exit first"
else
echo "At shell prompt -- safe to run commands"
fi

For multi-step interactions (wizards, installers, interactive prompts), use a loop that waits for a condition, inspects the screen, and decides the next action. This example assumes wait, paste, typeString, and press helper functions that wrap the corresponding endpoints:

// Drive an interactive installer step by step
const steps = [
{ wait: 'Accept license\\?', action: async () => { await typeString('yes'); await press('enter'); } },
{ wait: 'Install directory', action: async () => { await paste('/opt/app'); await press('enter'); } },
{ wait: 'Confirm\\?', action: async () => press('enter') },
{ wait: '\\$ $', action: null } // done -- back at shell
];
for (const step of steps) {
const { status } = await wait({ mode: 'regex', pattern: step.wait, timeout_ms: 30000 });
if (status !== 'matched') throw new Error(`Wait for "${step.wait}" ended with status: ${status}`);
if (step.action) await step.action();
}

TID="terminal_id=1"
# Open a file in vim
curl -X POST "$TERMINAL/api/v1/terminal/execute" \
-H "Content-Type: application/json" \
-d '{"command": "vim /tmp/hello.py", "wait": false}'
# Wait for vim to load (alt-screen active)
curl -X POST "$TERMINAL/api/v1/terminal/wait?$TID" \
-H "Content-Type: application/json" \
-d '{"mode": "stable", "debounce_ms": 300, "timeout_ms": 5000}'
# Enter insert mode
curl -X POST "$TERMINAL/api/v1/terminal/press?$TID" \
-H "Content-Type: application/json" \
-d '{"key": "i"}'
# Type some code
curl -X POST "$TERMINAL/api/v1/terminal/paste?$TID" \
-H "Content-Type: application/json" \
-d '{"text": "#!/usr/bin/env python3\nprint(\"Hello from vim automation!\")\n"}'
# Exit insert mode, save, and quit
curl -X POST "$TERMINAL/api/v1/terminal/press?$TID" \
-H "Content-Type: application/json" \
-d '{"keys": ["escape", ":", "w", "q", "enter"]}'
# Wait for vim to close (back to shell prompt)
curl -X POST "$TERMINAL/api/v1/terminal/wait?$TID" \
-H "Content-Type: application/json" \
-d '{"mode": "regex", "pattern": "\\$ $", "timeout_ms": 5000}'

2. Navigate htop: Filter and Kill a Process

// Launch htop and wait for it to render
await fetch(`${TERMINAL}/api/v1/terminal/execute`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ command: 'htop', wait: false })
});
await wait({ mode: 'stable', debounce_ms: 500, timeout_ms: 5000 });
// Filter for "node" processes (F4 opens htop's filter)
await press('f4');
await paste('node');
await wait({ mode: 'stable', debounce_ms: 300, timeout_ms: 3000 });
// Read the filtered list
const { snapshot } = await wait({ mode: 'stable', debounce_ms: 200, timeout_ms: 2000 });
console.log('Filtered:', snapshot.lines.filter(l => l.includes('node')));
// Send SIGTERM to selected process (F9), then quit
await press(['f9', 'enter']);
await press('q');

3. Python REPL: Define a Function and Call It


This extends the Quick Start pattern with a multi-line function definition. The key trick: the classic Python REPL does not enable bracketed paste, so the pasted text arrives as raw keystrokes (pass bracketed: false to force this for any program), and you press Enter twice to end the indented block.

// Start Python and wait for prompt (see Quick Start for the curl equivalent)
await fetch(`${TERMINAL}/api/v1/terminal/execute`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ command: 'python3', wait: false })
});
await wait({ mode: 'regex', pattern: '>>> $', timeout_ms: 5000 });
// Paste a multi-line function (Python's classic REPL receives it as plain keystrokes)
await paste('def fib(n):\n    a, b = 0, 1\n    for _ in range(n):\n        a, b = b, a + b\n    return a\n');
await press(['enter', 'enter']); // two Enters to close the block
await wait({ mode: 'regex', pattern: '>>> $', timeout_ms: 5000 });
// Call the function and read the result
await paste('fib(100)');
await press('enter');
const { snapshot } = await wait({ mode: 'regex', pattern: '>>> $', timeout_ms: 5000 });
// Find the line containing the big number
const resultLine = snapshot.lines.find(l => /^\d{10,}$/.test(l.trim()));
console.log('fib(100) =', resultLine?.trim());
// fib(100) = 354224848179261915075
// Exit Python
await press('ctrl+d');

SSH is a classic multi-step interactive flow: wait for host key confirmation or password prompt, respond, then wait for the remote shell.

// Start SSH (don't wait -- it takes over the screen)
await fetch(`${TERMINAL}/api/v1/terminal/execute`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ command: 'ssh user@remote-server.example.com', wait: false })
});
// Wait for password prompt or host key confirmation
const { match } = await wait({
mode: 'regex',
pattern: '(password:|yes/no)',
timeout_ms: 15000
});
if (match?.text.includes('yes/no')) {
await paste('yes');
await press('enter');
await wait({ mode: 'regex', pattern: 'password:', timeout_ms: 10000 });
}
// Type password and wait for remote shell prompt
await paste('my-password');
await press('enter');
const result = await wait({ mode: 'regex', pattern: '[\\$#] $', timeout_ms: 15000 });
console.log('Connected!', result.snapshot.lines[result.snapshot.cursor.row]);

5. Run Command and Extract Output with /find


Combines the “run command and wait” pattern from Common Patterns with /find to extract structured data:

// Run df -h and wait for completion
await paste('df -h /');
await press('enter');
const { snapshot } = await wait({ mode: 'regex', pattern: '\\$ $', timeout_ms: 5000 });
// Use /find to extract the disk usage percentage from the screen
const findRes = await fetch(
`${TERMINAL}/api/v1/terminal/find?terminal_id=1&pattern=${encodeURIComponent('\\d+%')}` // percent-encodes \, + and %
).then(r => r.json());
console.log('Disk usage:', findRes.hits[0]?.text); // "42%"
// Or parse directly from the snapshot lines
const dfLine = snapshot.lines.find(l => l.includes('/dev/'));

tmux uses a prefix key (Ctrl+B by default) followed by a command key. The /press endpoint handles this naturally since it sends keys sequentially:

// Start tmux
await fetch(`${TERMINAL}/api/v1/terminal/execute`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ command: 'tmux new-session -s auto', wait: false })
});
await wait({ mode: 'stable', debounce_ms: 500, timeout_ms: 5000 });
// Split horizontally (Ctrl+B, then %)
await press(['ctrl+b', '%']);
await wait({ mode: 'stable', debounce_ms: 300, timeout_ms: 3000 });
// Run a command in the new pane
await paste('watch -n1 date');
await press('enter');
// Switch back to first pane
await press(['ctrl+b', 'arrow_left']);

The "either" mode shines here: match on a known success/failure message OR fall back to stability if the output is unexpected. Set a generous timeout_ms — the wait returns instantly when the condition is met.

// Start build
await paste('npm run build 2>&1');
await press('enter');
// Wait up to 5 minutes for completion
const { status, match } = await wait({
mode: 'either',
pattern: '(Build succeeded|ERROR|FAILED|\\$ $)',
debounce_ms: 3000,
timeout_ms: 300000
});
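if (status === 'matched' && /ERROR|FAILED/.test(match.text)) {
// Use /find to collect all error lines (including scrollback)
const errors = await fetch(
`${TERMINAL}/api/v1/terminal/find?terminal_id=1&pattern=${encodeURIComponent('ERROR.*')}&scope=all`
).then(r => r.json());
console.error('Build failed with', errors.total, 'errors');
} else if (status === 'matched' && match.text.includes('Build succeeded')) {
console.log('Build succeeded!');
} else {
// Bare prompt, stable, timeout, exited, or vterm_reinit: no explicit result message was seen
console.warn('Build finished without a clear result, status:', status);
}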

One of the most important features of /press is that it generates correct escape sequences for the terminal’s current mode. This is why it uses libvterm internally instead of sending raw bytes.

When you send arrow keys to a terminal, the correct byte sequence depends on the terminal’s mode:

| Key | Normal Mode (DECCKM off) | Application Mode (DECCKM on) |
| --- | --- | --- |
| Arrow Up | \e[A | \eOA |
| Arrow Down | \e[B | \eOB |
| Arrow Left | \e[D | \eOD |
| Arrow Right | \e[C | \eOC |
| Home | \e[H | \eOH |
| End | \e[F | \eOF |

Programs like vim, htop, and less enable application cursor mode (DECCKM). If you send the wrong byte sequence, the program ignores the key or does something unexpected.

Similarly, the numeric keypad has two modes: numeric (sends digits) and application (DECKPAM, sends \eO sequences). Programs use this for cursor navigation in TUI menus.

When you call /press, the endpoint:

  1. Looks up the key name in the key table
  2. Calls libvterm’s keyboard API (vterm_keyboard_key or vterm_keyboard_unichar) with the appropriate VTermKey enum and modifiers
  3. libvterm checks the terminal’s current DECCKM/DECKPAM/DECNKM state and generates the correct byte sequence
  4. The generated bytes are drained from libvterm’s output buffer and written to the terminal PTY

This means the same /press call generates different byte sequences depending on what program is running. You don’t need to know or care about terminal modes — just send arrow_up and it works in vim, htop, bash, tmux, or any other program.
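
The mode dependence shown in the table above can be expressed in a few lines. This is an illustration of the encoding rule only, not how /press is implemented (the endpoint delegates to libvterm); cursorKeyBytes is a hypothetical function.

```javascript
// Final byte of the escape sequence for each cursor key, taken from the table above.
const CURSOR_FINAL = {
  arrow_up: 'A',
  arrow_down: 'B',
  arrow_right: 'C',
  arrow_left: 'D',
  home: 'H',
  end: 'F'
};

// Illustrative only: ESC O prefix in application cursor mode (DECCKM on), ESC [ otherwise.
function cursorKeyBytes(key, applicationMode) {
  const final = CURSOR_FINAL[key];
  if (!final) throw new Error(`Not a cursor key: ${key}`);
  return (applicationMode ? '\x1bO' : '\x1b[') + final;
}

console.log(JSON.stringify(cursorKeyBytes('arrow_up', false))); // "\u001b[A"
console.log(JSON.stringify(cursorKeyBytes('arrow_up', true)));  // "\u001bOA"
```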


The automation subsystem maintains a server-side libvterm instance for each terminal session that uses automation endpoints.

libvterm instances are created on demand when the first automation endpoint is called for a session. Sessions that never use automation have zero memory overhead.

On first use, the session’s output buffer is replayed through libvterm to reconstruct the full terminal state (screen content, cursor position, modes, colors). This is how /press knows the correct byte sequences — libvterm tracks the terminal’s mode from the replayed output. The replay is fast (native C code) but does consume CPU proportional to the output buffer size. Subsequent calls use the already-initialized instance, kept in sync by feeding new output as it arrives.

All libvterm instances share a global memory cap (default: 512 MB, configurable with --vterm-memory-cap-mb). Each instance consumes memory proportional to cols * rows * cell_size + scrollback_lines * cols * cell_size.

For a typical 80x24 terminal with 500 scrollback lines:

  • Per-instance: approximately 1-2 MB
  • 512 MB cap supports roughly 250-500 concurrent automation sessions
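
The formula can be checked with a quick calculation. The per-cell size is not stated here, so the 32 bytes used below is purely a placeholder assumption; estimateInstanceBytes is a hypothetical helper that just evaluates the formula.

```javascript
// Evaluates: cols * rows * cell_size + scrollback_lines * cols * cell_size
function estimateInstanceBytes(cols, rows, scrollbackLines, cellSizeBytes) {
  return cols * rows * cellSizeBytes + scrollbackLines * cols * cellSizeBytes;
}

const CELL_SIZE_BYTES = 32; // placeholder assumption, not a documented value
const perInstance = estimateInstanceBytes(80, 24, 500, CELL_SIZE_BYTES);
const capBytes = 512 * 1024 * 1024;

console.log(perInstance);                        // 1341440 (about 1.3 MB)
console.log(Math.floor(capBytes / perInstance)); // 400
```

With that placeholder cell size the result lands inside the 1-2 MB and 250-500 session ranges quoted above; a different cell size shifts both numbers proportionally.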

If the memory cap is exceeded, new automation requests return HTTP 503 with a vterm_memory_cap error including current and maximum usage.

libvterm instances are evicted after 10 minutes of inactivity (configurable with --vterm-idle-ttl-sec). “Inactivity” means no automation endpoint has been called for that session.

Evicted instances are transparently re-initialized on the next automation call. The replay overhead is typically negligible (a few milliseconds for normal sessions, up to 100ms for sessions with very large output buffers).

Each libvterm instance maintains its own scrollback buffer (default: 500 lines, configurable with --vterm-scrollback-lines, max 10000). This is separate from the terminal session’s raw output buffer.

The scrollback stores rendered cells (text + attributes), making /find and /snapshot with scroll_offset fast — no re-parsing needed.

| Flag | Default | Description |
| --- | --- | --- |
| --vterm-memory-cap-mb | 512 | Global memory cap for all libvterm instances |
| --vterm-scrollback-lines | 500 | Scrollback lines per instance (0-10000) |
| --vterm-idle-ttl-sec | 600 | Seconds of idle before eviction |
| --wait-max-waiters-per-session | 16 | Max concurrent /wait requests per session |

  • Use /wait instead of polling /snapshot in a loop. The wait endpoint is event-driven and returns the instant the condition is met. Polling wastes HTTP round-trips and can miss transient states.

  • Use "mode": "either" when you’re not sure about the exact prompt. Combine a regex for the expected prompt with a stability debounce as a fallback. This handles both normal and error cases.

  • Use /paste for text, /press for actions. Paste your command text, then press Enter. Paste your code, then press Escape. This is faster and more reliable than pressing keys one by one.

  • Check is_fullscreen in snapshots to know if you’re in a TUI program or at a shell prompt. This helps your automation script adapt to unexpected states.

  • Use the seq counter to detect screen changes without comparing all lines. If seq hasn’t changed between two snapshots, the screen is identical.

  • Keep timeout_ms generous for operations that might take time (builds, downloads, SSH connections). The wait endpoint returns immediately when the condition is met — a large timeout just prevents premature failure.

  • Use /find with scope: "all" when searching for output that might have scrolled off screen. The scrollback scope searches only the scrollback buffer, while all searches both.

  • Don’t sleep() between operations. Use /wait with an appropriate mode instead. Sleeping creates race conditions and slows down your automation unnecessarily.

  • Don’t send raw escape codes through /paste. Use /press for special keys and control sequences. Raw escape codes can conflict with the terminal’s current mode.

  • Don’t assume terminal dimensions. Read cols and rows from the snapshot response. Different sessions may have different sizes, and users can resize at any time.

  • Don’t create more than 16 concurrent waiters per session. The limit exists to prevent resource exhaustion. If you need to wait for multiple conditions, use /wait sequentially or combine patterns with regex alternation (pattern1|pattern2).

  • Don’t ignore the status field in wait responses. A timeout is not an error — it means the condition wasn’t met. Your script should handle all five statuses: matched, stable, timeout, exited, and vterm_reinit.

  • Don’t use automation endpoints for simple command execution. If you just need to run ls and get the output, use /api/v1/terminal/execute with wait: true. Automation endpoints are for interactive programs.
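
For the regex-alternation tip in the list above, a small helper can merge several patterns into one /wait pattern so that a single waiter covers all of them. combinePatterns is a hypothetical name; the sketch wraps each pattern in a non-capturing group so the alternation does not change their meaning, and it enforces the documented 1024-byte pattern limit.

```javascript
// Hypothetical helper: merges several regex sources into one alternation.
function combinePatterns(patterns) {
  if (patterns.length === 0) throw new Error('At least one pattern is required');
  const combined = patterns.map((p) => `(?:${p})`).join('|');
  // The pattern limit is given in bytes, so measure the UTF-8 encoding.
  if (new TextEncoder().encode(combined).length > 1024) {
    throw new Error('Combined pattern exceeds 1024 bytes');
  }
  return combined;
}

console.log(combinePatterns(['password:', 'yes/no']));
// (?:password:)|(?:yes/no)
```

The match.text field in the /wait response then tells you which alternative actually fired.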