keryx — interface contracts (CLI, web UI, MCP)

Status: draft / intent (companion to 0001-keryx.md)
Owner: Matt Cockayne
Last updated: 2026-06-14

0. How to read this

This defines the interface contracts for keryx's three surfaces — CLI, web UI, and MCP (the GTB mcp feature): the commands, their inputs/outputs/side-effects, the web UI's functionality and UX, and the MCP tools. It exists to anchor tests (the godog scenarios and unit tests of 0001-keryx.md §8) and to guide implementation. The CLI command contracts (§3) are the single source of truth; the web UI (§4) and MCP (§5) are alternative drivers of the same operations on the same workspace.

  • These are guidance, not law. Shapes may change as we build; when they do, update this doc and the tests together. The contract assertions under each command are the bits worth pinning in tests.
  • Requirement IDs (R-AREA-n) give traceability: each should map to at least one unit test or .feature scenario. They are stable handles, not a priority order.
  • Keywords: MUST = a contract a test should enforce; SHOULD = expected but negotiable; MAY = optional.
  • Everything here is consistent with 0001-keryx.md — the CLI surface (§5), themes (§6), providers (§3.4), workspace + iteration model (§3.2), posting (§4), and the studio (§10). Where they disagree, 0001 wins and this is stale.

1. CLI conventions (cross-cutting)

These apply to every command; per-command sections only note departures.

1.1 Invocation & modes

Most generation/iteration commands operate in one of two modes:

  • Workspace mode: --workspace <slug> (or run from inside a workspace dir, or the configured default); keryx reads inputs from and writes outputs into that workspace (§2, workspace layout). This is the normal path.
  • Standalone mode: explicit --out / input paths, no workspace; a one-off invocation (what the Python scripts did). Useful for ad-hoc runs and for testing a single generator.

  • R-GLOBAL-1 (MUST) A command given --workspace resolves all relative paths against that workspace; given explicit paths instead, it MUST NOT touch any workspace.
  • R-GLOBAL-2 (SHOULD) With neither, a workspace-aware command run inside a workspace dir uses it; otherwise it errors with guidance.
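
The mode choice in R-GLOBAL-1 and R-GLOBAL-2 can be sketched as a small pure function. This is only an illustration: `Mode`, `resolveMode`, and its parameters are hypothetical names, the configured-default workspace is left out, and letting --workspace win when explicit paths are also given is an assumption this document does not settle.

```go
package main

import (
	"errors"
	"fmt"
)

// Mode is a hypothetical enum for the two invocation modes in §1.1.
type Mode int

const (
	ModeWorkspace Mode = iota
	ModeStandalone
)

// errNoWorkspace stands in for the "errors with guidance" case of R-GLOBAL-2.
var errNoWorkspace = errors.New("no workspace: pass --workspace <slug> or explicit paths")

// resolveMode picks a mode. workspace is the --workspace value, explicitPaths
// reports whether --out / input paths were given, and insideWorkspace reports
// whether the current directory is a workspace dir.
func resolveMode(workspace string, explicitPaths, insideWorkspace bool) (Mode, error) {
	switch {
	case workspace != "":
		return ModeWorkspace, nil // R-GLOBAL-1: relative paths resolve against it
	case explicitPaths:
		return ModeStandalone, nil // R-GLOBAL-1: must not touch any workspace
	case insideWorkspace:
		return ModeWorkspace, nil // R-GLOBAL-2: use the surrounding workspace
	default:
		return 0, errNoWorkspace // R-GLOBAL-2: error with guidance
	}
}

func main() {
	m, err := resolveMode("", true, true)
	fmt.Println(m == ModeStandalone, err)
}
```

Keeping this decision free of I/O makes it easy to pin in a unit test, in line with R-GLOBAL-10.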

1.2 Global flags

Flag                       Meaning
--workspace, -w <slug>     operate on this reel workspace
--theme <keyword>          theme for this command's type; else the type default (§6)
--out, -o <path>           output path (standalone mode)
--json                     machine-readable output on stdout (via GTB pkg/output)
--yes, -y                  assume yes; never prompt (for CI / non-interactive)
--force                    bypass the generation cache; regenerate (§3.2)
--dry-run                  validate + report intended actions; no side effects / no spend
--config, --log-level, -v  GTB-standard config + logging

  • R-GLOBAL-3 (MUST) Output is human-readable by default; --json emits a structured object suitable for scripting/CI, and with --json stdout MUST contain valid JSON only (logs go to stderr).
  • R-GLOBAL-4 (MUST) --yes makes a command fully non-interactive (no prompt blocks a pipeline); a command that would need input and can't proceed MUST exit with a usage error, not hang.
  • R-GLOBAL-5 (MUST) --dry-run performs no network calls that spend credit and no writes, and reports what it would do.

1.3 Exit codes & errors

Code  Meaning
0     success
2     usage / validation error (bad flags, invalid storyboard.json, missing input)
1     execution failure (provider/API error, ffmpeg failure, partial post all)

  • R-GLOBAL-6 (MUST) Validation failures (e.g. a schema-invalid storyboard) exit 2 before any side effect or spend.
  • R-GLOBAL-7 (MUST) Errors are reported through the GTB error handler with an actionable hint (e.g. "run keryx doctor"), and the same failure surfaces in --json as a structured error.
  • R-GLOBAL-8 (SHOULD) Long operations show progress on stderr; with --json, progress is suppressed or structured.

1.4 Determinism, caching, idempotency

  • R-GLOBAL-9 (MUST) Re-running a command with unchanged inputs reuses cached artefacts and does not re-spend on the provider (content-addressed cache, §3.2); --force overrides.
  • R-GLOBAL-10 (SHOULD) Deterministic logic (timing, wrapping, schema, social record, theme/provider resolution) is pure and unit-testable without network/ffmpeg.
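
A content-addressed key for R-GLOBAL-9 can be derived by hashing the inputs that determine an artefact. The sketch below assumes a hypothetical `voiceInputs` struct and SHA-256 over its JSON encoding; the actual key material and hash used by keryx are defined in 0001 §3.2, not here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// voiceInputs is a hypothetical set of inputs that determine one VO take.
type voiceInputs struct {
	Text       string  `json:"text"`
	Stability  float64 `json:"stability"`
	Similarity float64 `json:"similarity"`
}

// cacheKey hashes the JSON encoding of the inputs. encoding/json writes struct
// fields in declaration order, so equal inputs always give equal keys.
func cacheKey(in voiceInputs) (string, error) {
	b, err := json.Marshal(in)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	a, _ := cacheKey(voiceInputs{Text: "hello", Stability: 0.5, Similarity: 0.75})
	b, _ := cacheKey(voiceInputs{Text: "hello", Stability: 0.5, Similarity: 0.75})
	c, _ := cacheKey(voiceInputs{Text: "hello", Stability: 0.74, Similarity: 0.75})
	fmt.Println(a == b, a == c) // true false
}
```

Unchanged inputs map to the same key (a cache hit, no re-spend); changing any setting yields a new key and a fresh generation.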

Cost & safety (the batch's 8-hour idle hang raised a runaway-spend worry):
  • R-GLOBAL-11 (MUST) every operation is finite and bounded: generate N then stop; no open-ended loops that could run unattended.
  • R-GLOBAL-12 (MUST) every provider/ffmpeg call has a timeout; a hung call fails, it doesn't hang the process.
  • R-GLOBAL-13 (SHOULD) keryx keeps a usage tally (generations + estimated spend) per run/workspace, surfaced in output and --json. Estimates are best-effort from per-provider unit rates (providers.<x>.rates); where a provider exposes a pricing/usage API, keryx refreshes rates periodically (cached, with configured rates as fallback) so estimates stay accurate rather than drifting from a hardcoded value.
  • R-GLOBAL-14 (SHOULD) a spend guard (spend.confirm_above, global + per-project) with per-capability defaults, image_video_usd: 10 and voice_chars: 50000 (ElevenLabs is character-billed), past which a batch prompts for confirmation (auto-yes under --yes/CI). It prevents runaways, not generation.
  • R-GLOBAL-15 (MUST) large-file persistence is config-selected (global + per-project, persistence.media.store: git default / git-lfs / external / none); the external store's backend is gitlab-packages (GitLab's built-in Generic Package Registry, preferred when on GitLab) or s3 (S3-compatible; Cloudflare R2 recommended). keryx hardwires no single strategy and is not coupled to GitLab (0001 §3.5).
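
The spend guard in R-GLOBAL-14 reduces to a threshold comparison. A minimal sketch, assuming hypothetical `confirmAbove` and `estimate` types and reading "past which" as strictly greater than the limit:

```go
package main

import "fmt"

// confirmAbove mirrors the per-capability defaults named in R-GLOBAL-14.
type confirmAbove struct {
	ImageVideoUSD float64
	VoiceChars    int
}

// estimate is a hypothetical pre-run estimate for one batch.
type estimate struct {
	ImageVideoUSD float64
	VoiceChars    int
}

// needsConfirm reports whether the batch should prompt before spending.
// Under --yes / CI (assumeYes) it never prompts.
func needsConfirm(limit confirmAbove, est estimate, assumeYes bool) bool {
	if assumeYes {
		return false
	}
	return est.ImageVideoUSD > limit.ImageVideoUSD || est.VoiceChars > limit.VoiceChars
}

func main() {
	limit := confirmAbove{ImageVideoUSD: 10, VoiceChars: 50000}
	est := estimate{ImageVideoUSD: 12.5, VoiceChars: 1200}
	fmt.Println(needsConfirm(limit, est, false)) // true
	fmt.Println(needsConfirm(limit, est, true))  // false
}
```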

1.5 Projects & persistence (git)

A project is the owning git repository (0001 §3.2, §3.5); state lives there, and git is the persistence layer.
  • R-GIT-1 (MUST) the CLI scopes to the current folder: the project is the git working copy at cwd; no project switcher in the CLI.
  • R-GIT-2 (MUST) the studio can open multiple projects and switch between them (R-UI-28), including remote git repos opened via go-tool-base git components (on-disk and in-memory) + VCS auth.
  • R-GIT-3 (MUST) a Save / approve persists files and commits them to the project repo with a descriptive message; for a remote project the studio commits to an in-memory/temp clone and pushes. Auto-commit-on-save and auto-push are config; the default is commit-on-save.
  • R-GIT-4 (MUST) credentials for remote git use the GTB keychain / CI variables and are never committed; a push failure surfaces (and, in CI, alerts).
  • R-GIT-5 (SHOULD) the mobile studio works against a remote repo without a local checkout (in-memory git), so authoring/approving/posting from a phone is first-class.

Config files & hot reload.
  • R-CFG-1 (MUST) config resolves in precedence order: CLI flags → env → project .keryx.yaml (repo root) → global ~/.keryx/config.yaml → embedded defaults (GTB pkg/config).
  • R-CFG-2 (MUST) secrets are never written to either YAML file; credentials live in keychain / env / CI variables only, and a config write that would persist a secret is refused.
  • R-CFG-3 (MUST) config is hot-reloaded (GTB Observable): a write to .keryx.yaml (by hand or the Settings panel, R-UI-30) propagates to running components without a restart.
  • R-CFG-4 (SHOULD) a malformed/invalid config write is rejected with field errors and does not clobber the working config.
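
The precedence order in R-CFG-1 amounts to "first layer that defines the key wins". The sketch below models each source as a flat map, which is a simplification; the real resolution is done by GTB pkg/config, and the theme keywords `bold` and `calm` are made up for the example.

```go
package main

import "fmt"

// layer is one config source reduced to a flat key/value map.
type layer map[string]string

// resolve returns the value from the first layer that defines key. Layers are
// passed highest precedence first: flags, env, project, global, defaults.
func resolve(key string, layers ...layer) (string, bool) {
	for _, l := range layers {
		if v, ok := l[key]; ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	flags := layer{}
	env := layer{"defaults.reel": "bold"}
	project := layer{"defaults.reel": "calm", "persistence.media.store": "git-lfs"}
	global := layer{"persistence.media.store": "external"}
	defaults := layer{"persistence.media.store": "git"}

	theme, _ := resolve("defaults.reel", flags, env, project, global, defaults)
	store, _ := resolve("persistence.media.store", flags, env, project, global, defaults)
	fmt.Println(theme, store) // bold git-lfs
}
```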

2. Workspace layout (the contract for state on disk)

A reel workspace is a directory the owning project commits (0001 §3.2 ownership). Commands read/write this shape; tests assert against it.

<workspace>/                 # e.g. blog repo: content/post/<slug>/reel/ (location is the project's choice)
  workspace.yaml             # meta: slug, theme, bundle association, schema version
  storyboard.json            # creative seed: per card text + vo + scene + voice (§3.1)  [committed]
  social.json                # per-platform social set + status/schedule/result (§4.4)  [committed]
  cover.png                  # bookend art (from `keryx cover` or supplied)            [committed]
  vo/NN.mp3                  # SELECTED narration per card (NN = card index)            [committed]
  cards/NN.{png,mp4}         # SELECTED card media (image|video, generated|uploaded)    [committed]
  music.mp3                  # SELECTED bed                                             [committed]
  reel.mp4                   # rendered output                                         [committed]
  # --- disposable, git-ignored (regenerable; cleared by `keryx reel prune`) ---
  vo/takes/NN-T.mp3          # candidate VO takes                                       [ignored]
  cards/takes/NN-T.{png,mp4} # candidate card media                                     [ignored]
  music/takes/T.mp3          # candidate beds                                           [ignored]
  .cache/                    # content-addressed generation cache (§3.2)                [ignored]

  • R-WS-1 (MUST) selected artefacts (vo/NN.mp3, cards/NN.*, music.mp3) are what assembly consumes; */takes/* are candidates. select promotes a take to the selected slot.
  • R-WS-2 (MUST) workspace.yaml records the schema version so an old workspace fails loudly rather than mis-rendering.
  • R-WS-3 (SHOULD) the exact location of the workspace within the owning repo is the project's choice; keryx only requires the internal shape.
  • R-WS-22 (MUST) only the [committed] files are tracked in git; takes/ and .cache/ are git-ignored (disposable, regenerable) so the repo doesn't bloat; social.json (not a separate ledger) holds the posting status/result.
  • R-WS-23 (MUST) on a reel with an associated bundle (§3.1), the deliverables are also written back into that directory: the reel-<slug>.mp4 and the human-readable reel-caption.md (R-CAP-4).

3. CLI command contracts

Format per command: synopsis · purpose · in · out · side-effects · exit · contract assertions.

3.1 Workspace & storyboard

keryx reel new <slug> [--from-post post.md] [--theme <kw>]
  • Purpose: scaffold a workspace.
  • In: slug; optional source post; theme.
  • Out: a new <workspace>/ with workspace.yaml + an empty/derived storyboard.json; prints the path.
  • Side-effects: creates the dir. With --from-post, MAY invoke storyboard draft (needs an LLM provider).
  • Exit: 2 if the slug exists or is invalid; 0 on create.
  • Contracts:
    • R-WS-4 (MUST) creates a schema-valid (possibly empty) storyboard.
    • R-WS-5 (MUST) refuses to clobber an existing workspace without --force.
    • R-WS-15 (MUST) --bundle <dir> records the association (below).

keryx reel list|open|rename|duplicate|rm <slug> (project reel CRUD)
  • Purpose: manage the set of reels in a project (a project has many).
  • In: the project's reel root (discovered/configured); slug(s).
  • Out: list prints each reel with status (draft/approved/posted, from social.json) + associated bundle (human or --json); the others mutate.
  • Side-effects: duplicate copies a workspace (storyboard + theme, not the posting status); rm deletes (confirm unless --yes); rename re-slugs.
  • Contracts:
    • R-WS-16 (MUST) list enumerates every workspace under the project root with its status.
    • R-WS-17 (MUST) duplicate copies authoring inputs but resets posting status in social.json (a copy hasn't been posted).
    • R-WS-18 (MUST) rm/rename are guarded (confirm / no clobber); these are the CRUD parity for the studio reels library (R-UI-24).

keryx reel link <slug> <dir> (associate a reel with a content directory)
  • Purpose: record the reel's associated content (a page bundle or any directory, §3.2): where the reel belongs and a context source for the AI.
  • Out: writes the association to workspace.yaml.
  • Contracts:
    • R-WS-19 (MUST) the linked dir is recorded and the finished reel-<slug>.mp4 write-back targets it.
    • R-WS-20 (MUST) keryx reads the dir's text/assets as read-only context for storyboard draft / chat.
    • R-WS-21 (SHOULD) any directory is accepted (not Hugo-specific); a missing/unreadable dir warns, doesn't crash.

keryx storyboard draft <post.md> [-o board.json] [--theme <kw>]
  • Purpose: AI first-draft of a storyboard from a post (optional helper).
  • In: post markdown; theme (for tone/style hints).
  • Out: a storyboard.json (stdout or -o/workspace); a draft for human edit, never auto-final.
  • Side-effects: one LLM call (GTB chat client).
  • Exit: 2 if no LLM provider configured (with a hint); 0 on draft.
  • Contracts:
    • R-WS-6 (MUST) output validates against the storyboard schema.
    • R-WS-7 (MUST) keryx runs fully without this command (hand-authored JSON is always accepted).
    • R-WS-8 (SHOULD) marks the draft as unreviewed in workspace.yaml.

Storyboard validation rules (the schema's semantic checks, surfaced identically by the CLI exit 2, the API 422 (R-API-1), and the studio inline validation (R-UI-6); discovered while building the studio mockup):
  • R-WS-9 (MUST) every card has non-empty text (a mono URL card may be the bare URL).
  • R-WS-10 (MUST) a card with mode: overlay MUST have a non-empty scene.
  • R-WS-11 (MUST) *accent* markers are balanced (even count) per card.
  • R-WS-12 (MUST) palette roles reference defined palette keys; role applicability is mode-dependent: block uses bg/fg/accent; overlay ignores bg (the illustration fills it) and uses fg/accent for the scrim-overlaid text. A bg on an overlay card is accepted but inert (SHOULD warn).
  • R-WS-13 (SHOULD) the last card is the mono URL closer (a strong convention, warned not errored).
  • R-WS-14 (MUST) cover bookend cards reference an available cover image at render time (validated at assemble, not author, time).
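
The first three rules above are simple enough to sketch as a pure check. This is an illustration only: the `Card` struct carries just the fields these checks need, card indices are zero-based here, and the uploaded-media exception to the scene rule (R-GEN-23) is ignored.

```go
package main

import (
	"fmt"
	"strings"
)

// Card holds only the fields the checks below need; the real schema has more.
type Card struct {
	Text  string
	Mode  string // "block" or "overlay"
	Scene string
}

// validate returns one message per violated rule, tagged with the card index.
func validate(cards []Card) []string {
	var errs []string
	for i, c := range cards {
		if strings.TrimSpace(c.Text) == "" {
			errs = append(errs, fmt.Sprintf("card %d: R-WS-9 text is empty", i))
		}
		if c.Mode == "overlay" && strings.TrimSpace(c.Scene) == "" {
			errs = append(errs, fmt.Sprintf("card %d: R-WS-10 overlay card has no scene", i))
		}
		if strings.Count(c.Text, "*")%2 != 0 {
			errs = append(errs, fmt.Sprintf("card %d: R-WS-11 unbalanced *accent* markers", i))
		}
	}
	return errs
}

func main() {
	cards := []Card{
		{Text: "Everyone wants *Rust*", Mode: "block"},
		{Text: "half *open", Mode: "overlay"},
	}
	for _, e := range validate(cards) {
		fmt.Println(e)
	}
}
```

Because the same function can back the CLI exit 2, the API 422, and the studio's inline feedback, the three surfaces cannot drift apart.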

3.2 Generation

keryx cover --scene "..." [--theme <article kw>] [--n N] [-o cover.png]
  • In: scene prompt; article theme (style prefix + palette); sample count.
  • Out: N PNG samples; prints paths.
  • Side-effects: N image-provider calls (spend).
  • Exit: 1 on provider error.
  • Contracts:
    • R-GEN-1 (MUST) resolves an article-type theme.
    • R-GEN-2 (MUST) honours --n.
    • R-GEN-3 (SHOULD) appends the hardened wordless instruction so covers carry no text.

keryx portrait --ref photo.jpg [--ref ...] [--theme <kw>] [--n N] [-o]
  • In: one or more reference photos; portrait theme; variants.
  • Out: N stylised avatars.
  • Side-effects: image-provider calls.
  • Contracts:
    • R-GEN-4 (MUST) --ref is repeatable.
    • R-GEN-5 (MUST) resolves a portrait-type theme; the avatar pipeline is independent of the reel pipeline.

keryx voice [--line N] [--text "..."] [--takes N] [--stability x] [--similarity y] [--theme <kw>] [-w|--out vo/]
  • In: in workspace mode the narration is each card's vo field (distinct from the on-screen text, §3.1) plus its per-card voice override; --text is the standalone single-line path; reel-theme voice settings are the default, overridable per line by flags.
  • Out: mp3(s) into vo/takes/ (workspace) or --out (standalone); prints durations.
  • Side-effects: TTS calls (cached per vo-text + settings).
  • Contracts:
    • R-GEN-6 (MUST) --line N regenerates exactly one line, leaving others untouched.
    • R-GEN-7 (MUST) --takes N produces N candidates.
    • R-GEN-8 (MUST) unchanged vo-text + settings → cache hit, no re-spend (R-GLOBAL-9).
    • R-GEN-24 (MUST) the vo text is passed through to the provider including control tags (ElevenLabs SSML <break>, phonetic spellings); the batch used <break time="0.8s"/> for a beat.
    • R-GEN-25 (SHOULD) a per-line --stability/--similarity is persisted to that card's voice override so a later re-build/--force reproduces it (the batch bumped single lines to 0.74–0.76).
  • Note: ElevenLabs' pronunciation dictionary is a gated Studio feature (no API); the available levers are higher stability + phonetic spelling in the vo text.

keryx voice select <line> <take> (workspace)
  • Purpose: promote a candidate take to the selected slot.
  • Out: vo/NN.mp3 ← vo/takes/NN-T.mp3.
  • Side-effects: workspace write only.
  • Contracts:
    • R-GEN-9 (MUST) selection updates only that line's selected slot and is recorded so re-assembly uses it; replaces the manual cp step.

keryx music [--prompt "..."] [--length <dur>] [--takes N] [--theme <kw>] [-w|--out bed.mp3]
  • In: prompt (else theme's music tone); length (else auto from the VO-driven total, 0001 §3.1); takes.
  • Out: bed mp3(s).
  • Side-effects: music-provider calls.
  • Contracts:
    • R-GEN-10 (MUST) in workspace mode with no --length, length is computed from the assembled timeline.
    • R-GEN-11 (MUST) --takes/select parallel to voice. keryx music select <take> promotes the bed.

keryx cards [--card N] [--takes N] [--video] [--theme <kw>] (workspace)
  • Purpose: generate per-card overlay media from each card's scene: a still by default, or a short video with --video (video provider, §3.4).
  • In: reel theme illustration style; optional single --card; takes.
  • Out: candidates into cards/takes/ (.png or, for video, a clip).
  • Side-effects: image- or video-provider calls with the hardened wordless prompt.
  • Contracts:
    • R-GEN-12 (MUST) only overlay cards with a scene are generated.
    • R-GEN-13 (MUST) --card N re-rolls one card.
    • R-GEN-14 (SHOULD) candidates are flagged for likely text-leak (see sheet).
    • R-GEN-21 (MUST, deferred to Phase 5) --video routes to the video provider and the renderer fits the clip to the card's VO-driven duration (§3.1). The seam/schema/UI exist from the start (the "video-shaped hole"); generated-video + video compositing are built last. Uploaded video (R-GEN-22) works without this. Full feature spec: 0003-video-panels.md (R-VID-*).

keryx cards set <card> <file> (use pre-rendered media instead of AI)
  • Purpose: assign a user-supplied image or video to a card, bypassing generation (the editor file picker, R-UI-29, does the same).
  • Out: copies the file into the workspace and sets the card's media (source: uploaded, kind by file type); marks it selected.
  • Contracts:
    • R-GEN-22 (MUST) accepts image and video files, validates type/dimensions/aspect (warns on non-9:16), and uses it as-is: no AI call, no text-leak screen (the user owns the media).
    • R-GEN-23 (MUST) an uploaded card needs no scene and is not overwritten by a later keryx cards run unless --force.

keryx cards select <card> <take> / keryx cards sheet [-o sheet.png]
  • select: promote a clean generated take to the card's media slot.
  • sheet: render a contact sheet of current card media for a one-look text-leak screen.
  • Contracts:
    • R-GEN-15 (MUST) sheet shows every card with its index.
    • R-GEN-16 (MUST) select sets the slot assembly consumes.
    • R-GEN-26 (SHOULD) keryx auto-screens candidates for text leaks via OCR, flagging or rejecting any with detected glyphs (the batch caught these by eye); the contact sheet is the human fallback.
    • R-GEN-27 (MUST) the scene sent to the provider is pure visual description; directive/metaphor clauses leak as captions (§3.1).

keryx reel build [--storyboard board.json --cover cover.png | -w <slug>] [--silent] [--only-line N] [--force] [-o reel.mp4]
  • Purpose: assemble the 9:16 reel. (reel is the lifecycle noun group, §3.1; build is the assemble verb, so bare keryx reel is never an action.)
  • In: storyboard + cover + selected vo/music/cards (workspace), or explicit paths (standalone); reel theme.
  • Out: reel.mp4; prints duration + dimensions.
  • Side-effects: ffmpeg render (no provider spend; consumes existing assets).
  • Exit: 2 on invalid storyboard; 1 on ffmpeg failure.
  • Contracts:
    • R-GEN-17 (MUST) --silent renders dur-timed, audio-free output (no VO/music needed) for the fast proof.
    • R-GEN-18 (MUST) output is 1080×1920 H.264+AAC and duration ≈ Σ(VO+lead+tail) − xfades (probe-tested, §8).
    • R-GEN-19 (MUST) --only-line N re-assembles reusing unchanged takes.
    • R-GEN-20 (MUST) the final card is the dedicated URL closer; orphan-controlled wrapping applies.

keryx social [-w <slug>] [--platform <p>] (was caption)
  • Purpose: compose the social elements for the reel (supporting text, hashtags, link, title) that posting consumes (PostMeta, 0001 §4.1).
  • In: the reel's storyboard / source (and associated bundle, §3.1) for context; the LLM provider; --platform to target one platform.
  • Out: a per-platform social set persisted in the workspace (e.g. social.json) and printed; --platform emits just that platform's set.
  • Contracts:
    • R-CAP-1 (MUST) produces a base set (caption, tags, link, title) and, per --platform, a platform-appropriate variant.
    • R-CAP-2 (MUST) each variant respects that platform's constraints (§4.4): caption length, hashtag count/style, link handling (clickable vs "link in bio"), title vs description. Violations are flagged.
    • R-CAP-3 (SHOULD) the studio Publish panel (R-UI-25) reads/writes the same social.json.
    • R-CAP-4 (MUST) on a reel with an associated bundle, also writes a human-readable caption file (reel-caption.md, per-platform prose) into the bundle for manual/cross-check posting. This is an explicit batch requirement: a first-class committed deliverable beside reel-<slug>.mp4 (social.json stays the machine form in the workspace).

3.3 Themes

keryx theme list|show|add|edit|rm (per 0001 §6.3)
  • Contracts:
    • R-THEME-1 (MUST) list [--type] shows the catalog grouped by type.
    • R-THEME-2 (MUST) add --type <t> [--from <kw>] creates/clones and persists via GTB config.
    • R-THEME-3 (MUST) keyword is unique within a type (same keyword across types is allowed).
    • R-THEME-4 (MUST) edit round-trips (write then show reflects it).
    • R-THEME-5 (SHOULD) rm of a theme that is a defaults.<type> is refused or reassigns the default with confirmation.

3.4 Posting & approval

Each platform's social record (§4.4) holds a status draft → approved → posted, an optional scheduled_at, and (once posted) posted_at + post_url. This record is the idempotency ledger.

keryx approve <platform|all> [-w <slug>] [--at <datetime>] [--revoke]
  • Purpose: the safety gate: mark a platform's social set ready to post, and optionally schedule it.
  • Out: sets status approved (+ scheduled_at if --at); --revoke returns it to draft. Persisted (a git commit, §1.5).
  • Contracts:
    • R-POST-9 (MUST) approval is required before any post.
    • R-POST-10 (MUST) approving a platform whose social fails its constraints (§4.4) is refused with the violations.
    • R-POST-11 (SHOULD) --at sets the per-outlet schedule the unattended runner consumes.

keryx post <platform> <file|-w> ... [--dry-run] [--force]
  • In: video (or workspace reel.mp4); the platform's social set; token.
  • Out: post id/URL; updates the record to posted + posted_at + post_url.
  • Exit: 2 if not authed or not approved; 1 on API failure; 0 on publish / --dry-run pass.
  • Contracts:
    • R-POST-1 (MUST) --dry-run validates auth + video vs the platform's limits and posts nothing.
    • R-POST-2 (MUST) refuses unless status is approved (the accidental-posting guard).
    • R-POST-3 (MUST) a posted platform is a no-op (idempotent) unless --force; on success sets posted + posted_at + post_url.
    • R-POST-4 (SHOULD) bounded retry/backoff; rate limits respected.
  • On-demand posting is available from CLI and the studio (R-UI-27); unattended posting is CI-only.
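
The approval gate and idempotency rule (R-POST-2, R-POST-3) form a tiny state check over the draft → approved → posted status. A sketch, with `Status`, `shouldPost`, and the error text as hypothetical names:

```go
package main

import (
	"errors"
	"fmt"
)

// Status is the per-platform posting status from the social record.
type Status string

const (
	Draft    Status = "draft"
	Approved Status = "approved"
	Posted   Status = "posted"
)

var errNotApproved = errors.New("platform is not approved; run keryx approve first")

// shouldPost reports whether a post call should publish for this status.
func shouldPost(s Status, force bool) (bool, error) {
	switch s {
	case Approved:
		return true, nil
	case Posted:
		return force, nil // R-POST-3: already posted is a no-op unless --force
	default:
		return false, errNotApproved // R-POST-2: the accidental-posting guard
	}
}

func main() {
	for _, s := range []Status{Draft, Approved, Posted} {
		ok, err := shouldPost(s, false)
		fmt.Println(s, ok, err)
	}
}
```

A caller would map `errNotApproved` to exit code 2 and treat the posted no-op as success, matching the Exit line above.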

keryx post all [-w <slug>|<file>] [--dry-run] (one reel, all platforms)
  • Contracts:
    • R-POST-5 (MUST) posts each enabled + approved platform independently.
    • R-POST-6 (MUST) records each success even if another fails.
    • R-POST-7 (MUST) exits non-zero if any failed, with per-platform results in --json.
    • R-POST-8 (MUST) resumable/idempotent (skips posted).

keryx post due [-w <slug>] (the CI / scheduled entrypoint)
  • Purpose: the unattended path. Without -w, scans all of the project's reels; with -w, one reel. Posts only approved platforms whose scheduled_at is due.
  • Contracts:
    • R-POST-12 (MUST) posts exactly the approved + due platforms across the targeted scope and is idempotent (R-POST-3).
    • R-POST-13 (MUST) is the only command CI runs unattended; a not-yet-due or unapproved platform is skipped (not an error).

3.5 Auth & tokens

keryx auth <platform>
  • Purpose: interactive OAuth capture (local browser).
  • Out: token stored via GTB keychain; prints success + expiry.
  • Contracts:
    • R-AUTH-1 (MUST) interactive only (not for CI).
    • R-AUTH-2 (MUST) stored secret never printed/logged (GTB redaction).

keryx auth refresh [--platform <p>] [--dry-run] (was refresh-tokens)
  • Purpose: CI job that refreshes/rotates tokens and writes them back to the writable store (0001 §4.2).
  • Contracts:
    • R-AUTH-3 (MUST) persists a rotated refresh token immediately (TikTok rotates every use).
    • R-AUTH-4 (MUST) on failure exits non-zero and alerts (silent staleness is the top risk).
    • R-AUTH-5 (MUST) --dry-run reports token ages/expiries without mutating.

3.6 Tool / GTB defaults

init, doctor, config, update, docs, changelog, keychain come from GTB. keryx extends two:
  • R-TOOL-1 (MUST) keryx init seeds the theme catalog + default config in the owning project (0001 §6.2).
  • R-TOOL-2 (MUST) keryx doctor checks ffmpeg/ffprobe + fonts present and versioned, provider credentials resolve, and each enabled platform token is non-stale; exits non-zero if a required dependency is missing (gates CI).

4. Web UI — functionality, UX, and API contract

The studio (0001 §10) is a local, single-user web app started by keryx studio. It is a richer front-end over the same workspace; anything it does maps to a CLI operation on that workspace.

4.1 Functional requirements

v1 (required), author & adjust:
  • R-UI-28 (MUST) Project switcher: the studio's top level is a project picker (a user has many projects). A project may be a local directory or a remote git repo (§1.5); switching reloads that project's reels/config/themes. The CLI has no switcher (it scopes to the cwd, 0001 §3.5).
  • R-UI-30 (SHOULD) Project settings: a Settings panel (project-scoped) to view/edit themes, AI providers (providers.{image,video,voice,music,render}), platform enablement, and defaults. It writes the project's .keryx.yaml (R-CFG-1), rejects invalid writes (R-CFG-4), never writes secrets (R-CFG-2; credentials are managed separately), and changes take effect via hot reload (R-CFG-3). Reached from the project/library level (and the ⋯ overflow on mobile).
  • R-UI-24 (MUST) Reels library (CRUD): within a project the studio opens on a list of its reels, not a single reel. Create / open / duplicate / rename / delete, each row showing status (draft → approved → posted) and its associated content. The single-reel editor is the drill-in. Mirrors keryx reel list|open|rename|duplicate|rm (R-WS-16..18); UI↔CLI parity.
  • R-UI-1 (MUST) Open/select a reel workspace and load its storyboard.json.
  • R-UI-2 (MUST) Compose the storyboard: add / remove / reorder cards; edit per card: on-screen text and the separate vo narration (§3.1: the batch proved they differ; the vo field accepts SSML <break> and a per-line voice stability override), palette roles, accent words, mode (block/overlay), scene, cover/mono flags. The editor is mode-adaptive (R-UI-17): block shows bg/fg/accent; overlay shows fg/accent + scene + the illustration slot and treats bg as inert (the image fills it), per validation rule R-WS-12.
  • R-UI-3 (MUST) Upload images: cover art and portrait reference photos, saved into the workspace.
  • R-UI-29 (MUST) Per-card media source: for an overlay card the editor offers Generate (AI) or Upload (a file picker for a pre-rendered image or video). Uploaded media is used as-is (no AI call, no text-leak screen, R-GEN-22); generated video uses the video provider (R-GEN-21). A card backed by uploaded media needs no scene. The card shows which source/kind it has (still vs video, generated vs uploaded).
  • R-UI-4 (MUST) Associated content / source (R-UI-18): an "Associated content" panel that either links a content directory (a page bundle or any directory, R-UI-26) or holds pasted source text. It feeds storyboard draft and chat ("draft from this post") and records where the reel belongs (§3.2). The seed text/context has a home, not only the chat box.
  • R-UI-26 (MUST) Bundle association: link/unlink the reel to a directory; the UI shows the dir's detected files (post markdown, cover, assets) and uses them as read-only AI context; the finished reel write-back targets it. Any directory is accepted (not Hugo-specific); optional.
  • R-UI-5 (MUST) Chat to adjust: a streaming chat (GTB chat client) that proposes storyboard edits, including structural ones (add / remove / reorder cards), not just text (see R-UI-15 / R-API-2); the human accepts/rejects and nothing auto-applies.
  • R-UI-6 (MUST) Validation feedback: invalid storyboard state is shown inline against the R-WS-9..14 rules and blocks save of broken JSON.
  • R-UI-7 (MUST) Save writes storyboard.json (the same file the CLI uses).
  • R-UI-23 (MUST) Unsaved-changes guard: edits are local until Save; the UI shows a dirty indicator and warns before discarding on workspace/theme switch, reload, or close. (Spotted while building the mockup: nothing protected local edits.)

v2 (optional / later), preview & produce:
  • R-UI-8 (MAY) Render + view the silent draft in-browser. Note: the v1 per-card preview is necessarily approximate for overlay cards (palette + text + image placeholder); the faithful preview of an overlay card requires its generated illustration, so true preview is this v2 silent draft.
  • R-UI-9 (MAY) Generate/audition voice & music takes and select them.
  • R-UI-10 (MAY) Generate/re-roll card illustrations, with the contact-sheet text-leak screen.
  • R-UI-11 (MAY) Trigger a full render and preview the reel.
  • R-UI-12 (MAY) Review the posting status (social.json) / trigger posting (likely stays CLI/CI).
  • R-UI-19 (MAY) View / edit the generated social set in the studio.
  • R-UI-25 (SHOULD) Publish / social composition (per platform): a panel to generate and edit each platform's social elements (supporting text, hashtags, link, title) for Instagram / YouTube / TikTok / LinkedIn, that steers the user to each platform's conventions: live character count vs the cap, hashtag-count guidance, link-handling hint (clickable vs "link in bio"), and title-vs-description where relevant (the §4.4 constraints). Reads/writes social.json (keryx social, R-CAP-3).
  • R-UI-27 (SHOULD) Approve, schedule & post on demand: per platform the Publish panel shows the status (draft → approved → posted), an Approve toggle (gating posting, R-POST-2; refused if constraints fail), an optional schedule date/time (scheduled_at, R-SOC-7), and a Post now action enabled only when approved. On-demand posting from the UI is human-initiated; unattended posting remains CI-only (§4.3) and MCP posting tools stay gated (R-MCP-2). Posting from the UI calls the same path as keryx post.

4.2 UX

CLI UX principles (cross-cutting): human-readable by default with --json for scripts; long operations stream progress; spend/destructive actions confirm unless --yes; errors carry an actionable hint; everything is workspace-relative so a session is resumable. The CLI is the complete interface — the UI never exposes anything the CLI can't do (R-UI-13, MUST: UI↔CLI parity on the workspace).

Visual language. Minimalist and modern: a neutral light base (white / near-white panels, soft slate text, hairline borders), with teal and amber as restrained accents only — active states, primary actions, links (teal), and highlights / accent words / key CTAs (amber). Not large blocks of brand colour; the palette earns its place in the reel, the tool chrome stays quiet. Dark mode is a nice-to-have, not required.

Responsive & mobile-first (R-UI-20, MUST). The studio MUST be usable on a phone, the primary device in practice. The top bar collapses to essentials on narrow screens (back · reel name · validation · Save), moving secondary controls (project switcher, theme) into an overflow menu and dropping the Edit/Publish segment in favour of the bottom tabs; it must not overflow/scroll horizontally on a ~400px device. Layout adapts by width:
  • Desktop (wide): three panes (card list · editor · chat) with the card list and chat panes collapsible (R-UI-21, MUST) to give the editor room on smaller displays; collapsed state persists.
  • Mobile (narrow): a single column with bottom tab navigation between Cards / Editor / Chat (and Source / Assets); the chat is a first-class tab because on mobile it is the primary authoring surface (propose→accept is the main interaction). Touch targets ≥44px; card reorder works by touch (handle drag or up/down controls).

Three levels. (1) Reels library (the home — R-UI-24), (2) the reel editor (storyboard authoring), (3) the Publish panel (per-platform social, R-UI-25). The library is the entry; a reel drills into editor + publish (a tab/segment within the reel).

 Library (home)                Reel editor (desktop, panes collapsible)
┌─────────────────────────┐   ┌──────────────────────────────────────────┐
│ Reels        [+ new]    │   │ ‹reels  everyone-wants-rust ▾  [Edit|Publish]│
│ ┌─────────────────────┐ │   ├────────┬───────────────────────┬─────────┤
│ │ everyone-wants-rust │ │   │ CARDS◧ │ EDITOR                │ CHAT  ◨ │
│ │ ● approved · 4 posts│ │   │ 1 hook │ text / accent         │ propose │
│ │ ↳ content/post/…    │ │   │ 2 …    │ mode ▸ overlay|block  │  diff   │
│ ├─────────────────────┤ │   │ 3 …◀   │ scene + image (ovl)   │ [✓][✗]  │
│ │ switched-it-off     │ │   │ +add   │ fg/accent · flags     │         │
│ │ ○ draft             │ │   ├────────┴───────────────────────┴─────────┤
│ │  open  dup  rename ⋮│ │   │ Associated: content/post/…  · assets     │
│ └─────────────────────┘ │   └──────────────────────────────────────────┘
└─────────────────────────┘   Publish panel: per-platform tabs (IG/YT/TT/LI),
 Mobile: same list; tap a       each: supporting text + live char-count vs cap,
 reel → editor (bottom tabs      hashtags, link (clickable | link-in-bio), title.
 Cards/Editor/Chat/Publish).

- R-UI-14 (MUST) Reordering cards updates their order in storyboard.json. A drag-and-drop handle is the primary method: a generously sized (≥44px) grab handle per card, working with touch and pointer (the small up/down arrows were poor touch targets on mobile). A keyboard-accessible move (focus handle + arrow keys, or a menu action) is required for a11y so reorder isn't drag-only.
- R-UI-15 (MUST) The chat proposes a storyboard patch (one or more ops: edit-text, set-field, add/remove/reorder card) shown as a reviewable diff; accept applies it to the working storyboard (still unsaved until Save) and reject discards it, so the human gate is explicit and visible. Multi-op patches are accepted or rejected as a unit (per-op granularity MAY come later).
- R-UI-16 (SHOULD) The card list shows each card's mode + whether its illustration/VO take is present/selected (asset status at a glance).
- R-UI-22 (SHOULD) Theme switch while editing re-applies the theme's card treatment to previews and flags that existing card illustrations were generated in the previous theme's style (offer to re-roll), rather than silently mismatching.
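The accept-as-a-unit rule in R-UI-15 can be illustrated with a small sketch. It is not keryx's implementation: the op field names (`op`, `card`, `field`, `value`, `at`, `to`) and the storyboard shape (`cards` with an `id` per card) are assumptions made for the example; only the op kinds come from the requirement. Every op runs against a copy, so a patch that fails part-way leaves the working storyboard untouched.

```python
import copy


def apply_patch(storyboard, ops):
    """Return a patched copy of storyboard, or raise ValueError.

    All ops run against a deep copy, so a failure part-way through leaves
    the caller's working storyboard untouched (accept/reject as a unit).
    """
    draft = copy.deepcopy(storyboard)
    cards = draft["cards"]

    def index_of(card_id):
        for i, card in enumerate(cards):
            if card["id"] == card_id:
                return i
        raise ValueError(f"unknown card: {card_id!r}")

    for op in ops:
        kind = op["op"]  # hypothetical field name
        if kind == "edit-text":
            cards[index_of(op["card"])]["text"] = op["text"]
        elif kind == "set-field":
            cards[index_of(op["card"])][op["field"]] = op["value"]
        elif kind == "add":
            cards.insert(op.get("at", len(cards)), copy.deepcopy(op["value"]))
        elif kind == "remove":
            del cards[index_of(op["card"])]
        elif kind == "reorder":
            cards.insert(op["to"], cards.pop(index_of(op["card"])))
        else:
            raise ValueError(f"unknown op: {kind!r}")
    return draft


working = {"cards": [{"id": "c1", "text": "hook"}, {"id": "c2", "text": "body"}]}
patch = [
    {"op": "edit-text", "card": "c1", "text": "new hook"},
    {"op": "reorder", "card": "c2", "to": 0},
]
accepted = apply_patch(working, patch)  # accept: this becomes the working state
```

Reject is simply not calling `apply_patch`; Save would then persist whichever working state the user ends up with.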

4.3 HTTP API contract (illustrative)

The frontend talks to a thin keryx HTTP API over the workspace (served by GTB pkg/http). Endpoints map 1:1 to CLI operations; both are tested against the same workspace assertions.

| Method · path | CLI analogue | Notes |
| --- | --- | --- |
| GET /api/projects · POST .../switch | (cwd; studio-only) | list/switch projects, incl. remote git (R-UI-28, R-GIT-2) |
| GET·PUT /api/config | edits .keryx.yaml | Settings panel; 422 on invalid; no secrets (R-UI-30, R-CFG-2..4) |
| GET /api/reels | reel list | project reels + status + association (library, R-UI-24) |
| POST /api/reels | reel new | create; {slug, theme, bundle?} |
| POST /api/reels/{slug}:duplicate · :rename | reel duplicate/rename | CRUD |
| DELETE /api/reels/{slug} | reel rm | guarded |
| GET /api/workspace/{slug} | (read workspace) | storyboard + asset/social status |
| PUT /api/workspace/{slug}/storyboard | save storyboard | validated; 422 on schema error |
| POST /api/workspace/{slug}/assets | (upload) | cover / ref image upload |
| PUT /api/workspace/{slug}/bundle | reel link | associate a content dir (R-UI-26) |
| PUT /api/workspace/{slug}/source | (hold source text) | pasted source feeding draft/chat |
| POST /api/workspace/{slug}/chat (SSE) | storyboard draft + edits | streams tokens + a proposed patch (ops) |
| GET·PUT /api/workspace/{slug}/social | social | per-platform social set (R-UI-25) |
| POST /api/workspace/{slug}/social/{platform}:approve | approve | set approved + schedule (R-UI-27) |
| POST /api/workspace/{slug}/social/{platform}:post | post | on-demand; requires approved (R-POST-2) |
| POST /api/workspace/{slug}/generate | voice/music/cards/reel build | async; {target,...}; v2 |
| GET /api/workspace/{slug}/preview/silent | reel --silent | v2 |
| GET /healthz | (GTB) | liveness |
  • R-API-1 (MUST) PUT storyboard rejects schema-invalid bodies with 422 and field-level errors, and on success writes the same storyboard.json the CLI reads.
  • R-API-2 (MUST) The chat endpoint never auto-writes the storyboard; it returns a proposed edit the client must confirm (mirrors R-UI-15).
  • R-API-3 (MUST) The server binds localhost by default and is not exposed without explicit opt-in + auth (0001 §10.2).
  • R-API-4 (SHOULD) The API is versioned (/api/v1/...).
  • R-API-5 (MUST) The chat patch is a structured set of storyboard ops (edit-text / set-field / add / remove / reorder) the client validates and applies on accept; it is never applied server-side unconfirmed (mirrors R-UI-15).
  • R-API-6 (MUST) A patch is generated against a storyboard revision; on accept the client rebases/validates it against the current working state and rejects (re-proposes) if it no longer applies cleanly. A stale proposal must not silently corrupt the storyboard.
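A minimal sketch of the stale-proposal guard in R-API-6, under stated assumptions: the revision is taken to be a content hash of the storyboard, and the proposal carries it in a hypothetical `base_revision` field. Neither detail is specified in this document. The sketch also takes the strictest reading (any revision mismatch means re-propose); a real client might attempt a rebase before giving up.

```python
import hashlib
import json


def revision_of(storyboard):
    """Content hash of the storyboard, independent of key order."""
    canonical = json.dumps(storyboard, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def on_accept(proposal, working):
    """Decide what the client does when the user accepts a proposal."""
    if proposal["base_revision"] != revision_of(working):
        return "re-propose"  # stale: the working state moved on since the proposal
    return "apply"


working = {"cards": [{"id": "c1", "text": "hook"}]}
proposal = {
    "base_revision": revision_of(working),  # hypothetical field name
    "ops": [{"op": "edit-text", "card": "c1", "text": "sharper hook"}],
}
decision_fresh = on_accept(proposal, working)

# The user edits the card by hand before accepting; the proposal is now stale.
working["cards"][0]["text"] = "hand-edited hook"
decision_stale = on_accept(proposal, working)
```

Hashing canonical JSON keeps the check cheap and avoids a server-side counter, at the cost of treating any unrelated edit as a conflict.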

4.4 Per-platform social constraints (steering reference)

The Publish panel (R-UI-25) and keryx social --platform (R-CAP-2) steer the user to each platform's norms. The values below are guidance to encode + verify against live docs, not gospel — they drive the UI's char-counts, hashtag hints, and link handling. (mid-2026; confirm before relying on numerics.)

| Platform | Primary text | Hashtags (norm) | Links | Title |
| --- | --- | --- | --- | --- |
| Instagram | caption ~2,200 chars | several–~10 (cap 30) | not clickable → "link in bio" | |
| YouTube (Shorts) | description (long) | a few, in description | clickable in description | title required (~100 chars) |
| TikTok | caption (short, ~2,200 but front-load) | a few, trend-led | limited clickability | |
| LinkedIn | post text (front-load ~140) | sparing (1–3) | clickable; first link unfurls | |
  • R-SOC-1 (MUST) The UI shows a live character count vs the platform cap and flags overflow before posting.
  • R-SOC-2 (MUST) Link handling is surfaced per platform (clickable vs "link in bio") so the caption is composed accordingly.
  • R-SOC-3 (SHOULD) Hashtag guidance reflects the platform norm (e.g. warn on LinkedIn hashtag spam, allow more on Instagram).
  • R-SOC-4 (SHOULD) AI generation produces a platform-appropriate draft (tone + shape), not one caption copy-pasted everywhere.
  • R-SOC-5 (SHOULD) Constraints live in config (per-platform), so they can be tuned as platforms change without a rebuild, consistent with keryx's config-driven posture.
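To make R-SOC-1..5 concrete, here is a hedged sketch of a config-driven check. The config keys (`text_cap`, `hashtag_cap`, `hashtag_norm`, `title_required`, `title_cap`, `link_clickable`) and the `review` function are hypothetical; the numbers are the guidance values from the table above and carry the same caveat. Platforms may also count characters differently from Python's `len`, which counts code points.

```python
# Hypothetical per-platform config. Values are the guidance numbers from the
# steering table and should be confirmed against live platform docs.
CONSTRAINTS = {
    "instagram": {"text_cap": 2200, "hashtag_cap": 30, "link_clickable": False},
    "youtube": {"title_required": True, "title_cap": 100, "link_clickable": True},
    "linkedin": {"hashtag_norm": 3, "link_clickable": True},
}


def review(platform, record, constraints=CONSTRAINTS):
    """Return the live count plus blocking errors and non-blocking hints."""
    rules = constraints[platform]
    text = record.get("text", "")
    tags = record.get("hashtags", [])
    title = record.get("title", "")
    errors, hints = [], []

    cap = rules.get("text_cap")
    if cap is not None and len(text) > cap:
        errors.append(f"text is {len(text)} chars; cap is {cap}")
    tag_cap = rules.get("hashtag_cap")
    if tag_cap is not None and len(tags) > tag_cap:
        errors.append(f"{len(tags)} hashtags; cap is {tag_cap}")
    if rules.get("title_required") and not title:
        errors.append("title is required")
    title_cap = rules.get("title_cap")
    if title_cap is not None and len(title) > title_cap:
        errors.append(f"title is {len(title)} chars; cap is {title_cap}")

    norm = rules.get("hashtag_norm")
    if norm is not None and len(tags) > norm:
        hints.append(f"{len(tags)} hashtags; the norm here is at most {norm}")
    if record.get("link") and not rules.get("link_clickable", True):
        hints.append('links are not clickable here; consider "link in bio"')

    return {"count": len(text), "errors": errors, "hints": hints}


result = review("instagram", {
    "text": "x" * 2201, "hashtags": ["#rust"], "link": "https://example.invalid",
})
```

Separating errors from hints mirrors the MUST/SHOULD split: overflow blocks approval, while hashtag and link guidance only steers.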

Social record (social.json) — per platform:

{ "instagram": {
    "text": "…", "hashtags": ["#rust"], "link": "https://…", "title": "",
    "status": "draft",        // draft → approved → posted  (R-POST-2 gate)
    "scheduled_at": null,      // per-outlet date/time the scheduler consumes
    "posted_at": null, "post_url": null   // set on success; the idempotency record
  }, "youtube": { … }, "tiktok": { … }, "linkedin": { … } }

- R-SOC-6 (MUST) status is per platform (draft|approved|posted); only approved may post (R-POST-2); posted carries posted_at+post_url and is the idempotency record.
- R-SOC-7 (MUST) scheduled_at is per platform and is what the unattended runner consumes (keryx post due, R-POST-12).
- R-SOC-8 (SHOULD) Editing an approved/posted platform's text drops it back to draft (re-approval required) so a posted variant can't drift silently.
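The status rules in R-SOC-6 and R-SOC-8 amount to a small state machine, sketched below for one platform record. The function names, the `publish` callback, and the ISO-8601 form of `posted_at` are assumptions for illustration. How a re-approved, previously posted platform should interact with the idempotency record is not specified here, so the sketch leaves `posted_at` and `post_url` untouched on edit.

```python
from datetime import datetime, timezone


def approve(record, constraints_ok):
    """draft -> approved; refused when the platform constraints fail."""
    if record["status"] == "posted":
        raise ValueError("already posted")
    if not constraints_ok:
        raise ValueError("constraints fail; approval refused")
    record["status"] = "approved"


def edit_text(record, text):
    """Editing approved/posted text drops the platform back to draft."""
    record["text"] = text
    if record["status"] in ("approved", "posted"):
        record["status"] = "draft"


def post(record, publish):
    """Post once; a retry on a posted record returns the stored URL."""
    if record["status"] == "posted":
        return record["post_url"]  # idempotent: no second publish
    if record["status"] != "approved":
        raise PermissionError("only an approved platform may post")
    url = publish(record)  # hypothetical provider call
    record.update(
        status="posted",
        post_url=url,
        posted_at=datetime.now(timezone.utc).isoformat(),
    )
    return url


calls = []


def fake_publish(record):
    """Stand-in provider that records each publish attempt."""
    calls.append(record["text"])
    return "https://example.invalid/p/1"


instagram = {"text": "hello", "status": "draft", "scheduled_at": None,
             "posted_at": None, "post_url": None}
approve(instagram, constraints_ok=True)
first = post(instagram, fake_publish)
retry = post(instagram, fake_publish)  # does not call fake_publish again
```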

5. MCP interface (GTB mcp feature)

keryx mcp runs an MCP server (GTB-provided) that exposes keryx's commands as MCP tools so an AI assistant can drive keryx programmatically — drafting storyboards, running takes/select, assembling, composing social. This is what "enabling MCP aids the contract promises" means: the MCP tool schemas are derived from the same command definitions as the CLI (§3), so there is one contract enforced across all three surfaces — no second, drifting definition.

  • R-MCP-1 (MUST) Each exposed command maps to one MCP tool whose input schema reflects that command's args/flags (§3); the tool's behaviour and the CLI's MUST be the same operation on the same workspace.
  • R-MCP-2 (MUST) Tool exposure is tiered by side-effect. Read/author/generate tools (storyboard draft, reel new, voice/music/cards and their select, reel build, social, theme list/show) are exposed by default. Tools that publish, spend irreversibly, or touch secrets (post, post all, post due, approve, auth, auth refresh) are gated (opt-in via config and off by default), so an assistant cannot post publicly or rotate tokens without explicit enablement.
  • R-MCP-3 (MUST) Tools that spend credit or write default to a dry-run / confirm posture where the protocol allows, mirroring --dry-run (R-GLOBAL-5); destructive effects require explicit, non-default invocation.
  • R-MCP-4 (SHOULD) Tool results are the structured (--json) form of the command output (R-GLOBAL-3), so an assistant gets machine-readable results.
  • R-MCP-5 (MUST) The MCP server respects the same config, themes, providers, and workspace resolution as the CLI; it introduces no bypass of validation (R-GLOBAL-6) or the idempotency ledger (R-POST-3).
  • R-MCP-6 (SHOULD) Transport is local (stdio) for an assistant on the same host; any networked exposure is opt-in and authenticated (cf. the studio, 0001 §10.2).

This makes the studio's chat (§4.1 R-UI-5) and an external assistant two clients of the same MCP-able operations, and keeps the propose→confirm human gate intact: an assistant may propose and generate, but publishing stays gated and human-driven.
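The exposure tiers of R-MCP-2 could be resolved along these lines. This is a sketch under assumptions: the config key `mcp.enable_gated` is hypothetical, per-tool opt-in (rather than one blanket switch) is a guess at the intended granularity, and the tool names are the command names listed in R-MCP-2 with the select variants left out for brevity.

```python
# Tool names follow the commands listed in R-MCP-2; select variants omitted.
DEFAULT_TOOLS = frozenset({
    "storyboard draft", "reel new", "voice", "music", "cards",
    "reel build", "social", "theme list", "theme show",
})
GATED_TOOLS = frozenset({
    "post", "post all", "post due", "approve", "auth", "auth refresh",
})


def exposed_tools(config):
    """Default tools plus any gated tools the config explicitly enables."""
    # "mcp.enable_gated" is a hypothetical key, not a documented keryx setting.
    requested = set(config.get("mcp", {}).get("enable_gated", []))
    unknown = requested - GATED_TOOLS
    if unknown:
        raise ValueError(f"not a gated tool: {sorted(unknown)}")
    return set(DEFAULT_TOOLS) | requested


by_default = exposed_tools({})
with_post = exposed_tools({"mcp": {"enable_gated": ["post"]}})
```

With an empty config no gated tool is callable, which is the case the traceability scenario for R-MCP-2 would assert.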

6. Traceability

Each R-* requirement maps to at least one godog scenario or unit test (0001 §8). The high-value contract scenarios to cover first: the authoring loop (reel new → storyboard draft → reel build --silent → voice --takes/select → cards/select → reel build → social), post/post due (success / partial / idempotent retry / not-due-skip / --dry-run), auth refresh rotation + alert-on-failure, theme add/edit/resolve, storyboard schema validation (CLI exit 2 and API 422), the studio chat propose→confirm gate, and MCP tool exposure tiers (a gated tool like post is not callable without opt-in, R-MCP-2).