keryx — interface contracts (CLI, web UI, MCP)
Status: draft / intent (companion to 0001-keryx.md)
Owner: Matt Cockayne
Last updated: 2026-06-14
0. How to read this
This defines the interface contracts for keryx's three surfaces — CLI, web
UI, and MCP (the GTB mcp feature): the commands, their
inputs/outputs/side-effects, the web UI's functionality and UX, and the MCP
tools. It exists to anchor tests (the godog scenarios and unit tests of
0001-keryx.md §8) and to guide implementation. The CLI command contracts
(§3) are the single source of truth; the web UI (§4) and MCP (§5) are
alternative drivers of the same operations on the same workspace.
- These are guidance, not law. Shapes may change as we build; when they do, update this doc and the tests together. The contract assertions under each command are the bits worth pinning in tests.
- Requirement IDs (R-AREA-n) give traceability: each should map to at least one unit test or .feature scenario. They are stable handles, not a priority order.
- Keywords: MUST = a contract a test should enforce; SHOULD = expected but negotiable; MAY = optional.
- Everything here is consistent with 0001-keryx.md: the CLI surface (§5), themes (§6), providers (§3.4), workspace + iteration model (§3.2), posting (§4), and the studio (§10). Where they disagree, 0001 wins and this is stale.
1. CLI conventions (cross-cutting)
These apply to every command; per-command sections only note departures.
1.1 Invocation & modes
Most generation/iteration commands operate in one of two modes:
- Workspace mode: --workspace <slug> (or run from inside a workspace dir, or the configured default); keryx reads inputs from and writes outputs into that workspace (§3, workspace layout). This is the normal path.
- Standalone mode: explicit --out / input paths, no workspace; a one-off invocation (what the Python scripts did). Useful for ad-hoc and for testing a single generator.
R-GLOBAL-1 (MUST) A command given --workspace resolves all relative paths
against that workspace; given explicit paths instead, it MUST NOT touch any
workspace. R-GLOBAL-2 (SHOULD) With neither, a workspace-aware command run
inside a workspace dir uses it; otherwise it errors with guidance.
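The mode rules in R-GLOBAL-1 and R-GLOBAL-2 can be sketched as one pure function. This is a minimal sketch, not the keryx implementation: `resolveMode`, `Mode` and `ErrNoWorkspace` are hypothetical names, the "configured default" fallback is left out, and giving `--workspace` priority when both it and explicit paths are present is an assumption the text does not settle.

```go
package main

import (
	"errors"
	"fmt"
)

// Mode says where a command reads inputs and writes outputs.
type Mode int

const (
	WorkspaceMode Mode = iota
	StandaloneMode
)

// ErrNoWorkspace carries the guidance R-GLOBAL-2 asks for.
var ErrNoWorkspace = errors.New("no workspace: pass --workspace <slug>, run inside a workspace dir, or give explicit paths")

// resolveMode is a hypothetical helper. cwdWorkspace is the workspace
// directory discovered from the current directory ("" when there is none).
// Assumption: --workspace wins when both it and explicit paths are given.
func resolveMode(workspaceFlag string, explicitPaths bool, cwdWorkspace string) (Mode, string, error) {
	switch {
	case workspaceFlag != "":
		return WorkspaceMode, workspaceFlag, nil
	case explicitPaths:
		// R-GLOBAL-1: explicit paths must not touch any workspace.
		return StandaloneMode, "", nil
	case cwdWorkspace != "":
		// R-GLOBAL-2: inside a workspace dir, use it.
		return WorkspaceMode, cwdWorkspace, nil
	default:
		return StandaloneMode, "", ErrNoWorkspace
	}
}

func main() {
	mode, root, err := resolveMode("", false, "content/post/demo/reel")
	fmt.Println(mode == WorkspaceMode, root, err)
}
```

Keeping the decision free of filesystem access makes it a natural unit-test target; discovering the workspace directory from the cwd would live in the caller.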
1.2 Global flags
| Flag | Meaning |
|---|---|
| --workspace, -w <slug> | operate on this reel workspace |
| --theme <keyword> | theme for this command's type; else the type default (§6) |
| --out, -o <path> | output path (standalone mode) |
| --json | machine-readable output on stdout (via GTB pkg/output) |
| --yes, -y | assume yes; never prompt (for CI / non-interactive) |
| --force | bypass the generation cache; regenerate (§3.2) |
| --dry-run | validate + report intended actions; no side effects / no spend |
| --config, --log-level, -v | GTB-standard config + logging |
R-GLOBAL-3 (MUST) Output is human-readable by default; --json emits a
structured object suitable for scripting/CI, and on --json stdout MUST be
valid JSON only (logs go to stderr). R-GLOBAL-4 (MUST) --yes makes a command
fully non-interactive (no prompt blocks a pipeline); a command that would need
input and can't proceed MUST exit with a usage error, not hang. R-GLOBAL-5
(MUST) --dry-run performs no network calls that spend credit and no
writes, and reports what it would do.
1.3 Exit codes & errors
| Code | Meaning |
|---|---|
| 0 | success |
| 2 | usage / validation error (bad flags, invalid storyboard.json, missing input) |
| 1 | execution failure (provider/API error, ffmpeg failure, partial post all) |
R-GLOBAL-6 (MUST) Validation failures (e.g. a schema-invalid storyboard) exit
2 before any side effect or spend. R-GLOBAL-7 (MUST) Errors are reported
through the GTB error handler with an actionable hint (e.g. "run keryx
doctor"), and the same failure surfaces in --json as a structured error.
R-GLOBAL-8 (SHOULD) Long operations show progress on stderr; with --json,
progress is suppressed or structured.
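One way to keep the exit-code table enforceable is to classify errors by type at the top of the command tree. The sketch below assumes a hypothetical `UsageError` type and `exitCode` function; the real mapping would go through the GTB error handler, whose API is not shown here.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	ExitOK      = 0
	ExitFailure = 1 // provider/API error, ffmpeg failure, partial post all
	ExitUsage   = 2 // bad flags, invalid storyboard.json, missing input
)

// UsageError is a hypothetical marker type for validation problems.
// Hint carries the actionable advice R-GLOBAL-7 asks for.
type UsageError struct {
	Msg  string
	Hint string
}

func (e *UsageError) Error() string { return e.Msg }

// ErrRender stands in for any execution failure.
var ErrRender = errors.New("ffmpeg render failed")

// exitCode maps an error to the documented process exit code.
func exitCode(err error) int {
	if err == nil {
		return ExitOK
	}
	var usage *UsageError
	if errors.As(err, &usage) {
		return ExitUsage
	}
	return ExitFailure
}

func main() {
	invalid := fmt.Errorf("reel build: %w", &UsageError{
		Msg:  "invalid storyboard.json",
		Hint: "run keryx doctor",
	})
	fmt.Println(exitCode(nil), exitCode(invalid), exitCode(ErrRender)) // 0 2 1
}
```

Because `errors.As` walks wrapped errors, commands can add context with `%w` without losing the classification.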
1.4 Determinism, caching, idempotency
R-GLOBAL-9 (MUST) Re-running a command with unchanged inputs reuses cached
artefacts and does not re-spend on the provider (content-addressed cache, §3.2);
--force overrides. R-GLOBAL-10 (SHOULD) Deterministic logic (timing,
wrapping, schema, social record, theme/provider resolution) is pure and
unit-testable without network/ffmpeg.
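To illustrate what "content-addressed" means for R-GLOBAL-9, the sketch below derives a cache key from everything that affects a voice generation. The `VoiceRequest` fields and the choice of SHA-256 over a JSON encoding are assumptions for illustration; the actual key inputs are defined in 0001 §3.2.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// VoiceRequest is a hypothetical set of inputs that determine the output.
// Anything that changes the generated audio must be a field here.
type VoiceRequest struct {
	Provider   string  `json:"provider"`
	Text       string  `json:"text"`
	Stability  float64 `json:"stability"`
	Similarity float64 `json:"similarity"`
}

// cacheKey hashes the request. Struct fields marshal in declaration order,
// so equal requests always produce equal keys.
func cacheKey(req VoiceRequest) (string, error) {
	raw, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	req := VoiceRequest{Provider: "elevenlabs", Text: "Everyone wants Rust.", Stability: 0.5, Similarity: 0.75}
	key, err := cacheKey(req)
	if err != nil {
		panic(err)
	}
	// An artefact stored under .cache/<key> is reused until an input changes.
	fmt.Println(key[:12])
}
```

With this shape, --force simply skips the lookup and overwrites the entry for the same key.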
Cost & safety (the batch's 8-hour idle hang raised a runaway-spend worry):
- R-GLOBAL-11 (MUST) every operation is finite and bounded — generate N
then stop; no open-ended loops that could run unattended.
- R-GLOBAL-12 (MUST) every provider/ffmpeg call has a timeout; a hung call
fails, it doesn't hang the process.
- R-GLOBAL-13 (SHOULD) keryx keeps a usage tally (generations + estimated
spend) per run/workspace, surfaced in output and --json. Estimates are
best-effort from per-provider unit rates (providers.<x>.rates); where a
provider exposes a pricing/usage API, keryx refreshes rates
periodically (cached, configured rates as fallback) to keep estimates
accurate rather than a drifting hardcode.
- R-GLOBAL-14 (SHOULD) a spend guard (spend.confirm_above, global +
per-project) with per-capability defaults — image_video_usd: 10 and
voice_chars: 50000 (ElevenLabs is character-billed) — past which a batch
prompts for confirmation (auto-yes under --yes/CI). Prevents runaways, not
generation.
- R-GLOBAL-15 (MUST) large-file persistence is config-selected (global +
per-project, persistence.media.store: git default / git-lfs / external /
none); the external store's backend is gitlab-packages (GitLab's
built-in Generic Package Registry, preferred when on GitLab) or s3
(S3-compatible — Cloudflare R2 recommended). keryx hardwires no single
strategy and is not coupled to GitLab (0001 §3.5).
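The spend guard in R-GLOBAL-14 reduces to a threshold check ahead of a batch. A minimal sketch follows, using the two documented defaults; `SpendGuard`, `Estimate` and `NeedsConfirm` are hypothetical names, and treating "past which" as strictly greater than the threshold is an assumption.

```go
package main

import "fmt"

// SpendGuard mirrors spend.confirm_above with per-capability thresholds.
type SpendGuard struct {
	ImageVideoUSD float64
	VoiceChars    int
}

// DefaultGuard returns the defaults named in R-GLOBAL-14.
func DefaultGuard() SpendGuard {
	return SpendGuard{ImageVideoUSD: 10, VoiceChars: 50000}
}

// Estimate is the best-effort projected cost of a batch.
type Estimate struct {
	ImageVideoUSD float64
	VoiceChars    int
}

// NeedsConfirm reports whether the batch should prompt before running.
// Under --yes / CI (assumeYes) the guard never prompts.
func (g SpendGuard) NeedsConfirm(e Estimate, assumeYes bool) bool {
	if assumeYes {
		return false
	}
	return e.ImageVideoUSD > g.ImageVideoUSD || e.VoiceChars > g.VoiceChars
}

func main() {
	guard := DefaultGuard()
	batch := Estimate{ImageVideoUSD: 12.40, VoiceChars: 1800}
	fmt.Println(guard.NeedsConfirm(batch, false)) // true: image/video estimate is over 10 USD
	fmt.Println(guard.NeedsConfirm(batch, true))  // false: auto-yes
}
```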
1.5 Projects & persistence (git)
A project is the owning git repository (0001 §3.2, §3.5); state lives there, and
git is the persistence layer.
- R-GIT-1 (MUST) CLI scopes to the current folder — the project is the git
working copy at cwd; no project switcher in the CLI.
- R-GIT-2 (MUST) the studio can open multiple projects and switch
between them (R-UI-28), including remote git repos opened via go-tool-base
git components (on-disk and in-memory) + VCS auth.
- R-GIT-3 (MUST) a Save / approve persists files and commits them to the
project repo with a descriptive message; for a remote project the studio
commits to an in-memory/temp clone and pushes. Auto-commit-on-save and
auto-push are config; the default is commit-on-save.
- R-GIT-4 (MUST) credentials for remote git use the GTB keychain / CI variables
— never committed; a push failure surfaces (and, in CI, alerts).
- R-GIT-5 (SHOULD) the mobile studio works against a remote repo without a
local checkout (in-memory git), so authoring/approving/posting from a phone
is first-class.
Config files & hot reload.
- R-CFG-1 (MUST) config resolves in precedence order: CLI flags → env →
project .keryx.yaml (repo root) → global ~/.keryx/config.yaml →
embedded defaults (GTB pkg/config).
- R-CFG-2 (MUST) secrets are never written to either YAML — credentials
live in keychain / env / CI variables only; a config write that would persist a
secret is refused.
- R-CFG-3 (MUST) config is hot-reloaded (GTB Observable): a write to
.keryx.yaml (by hand or the Settings panel, R-UI-30) propagates to running
components without a restart.
- R-CFG-4 (SHOULD) a malformed/invalid config write is rejected with field
errors and does not clobber the working config.
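The precedence order in R-CFG-1 amounts to "first layer that defines the key wins". The sketch below shows that lookup over plain maps; `Layer` and `resolve` are hypothetical, and the real resolution is done by GTB pkg/config rather than by code like this.

```go
package main

import "fmt"

// Layer is one configuration source, highest precedence first.
type Layer struct {
	Name   string
	Values map[string]string
}

// resolve returns the value from the first layer that defines key,
// together with the name of the layer that supplied it.
func resolve(key string, layers ...Layer) (value, source string, ok bool) {
	for _, layer := range layers {
		if v, found := layer.Values[key]; found {
			return v, layer.Name, true
		}
	}
	return "", "", false
}

func main() {
	layers := []Layer{
		{Name: "flags", Values: map[string]string{}},
		{Name: "env", Values: map[string]string{}},
		{Name: ".keryx.yaml", Values: map[string]string{"persistence.media.store": "git-lfs"}},
		{Name: "~/.keryx/config.yaml", Values: map[string]string{"persistence.media.store": "external"}},
		{Name: "defaults", Values: map[string]string{"persistence.media.store": "git"}},
	}
	v, src, _ := resolve("persistence.media.store", layers...)
	fmt.Println(v, src) // git-lfs .keryx.yaml
}
```

Returning the source alongside the value is a small design choice that makes precedence assertions in tests (and `--json` diagnostics) straightforward.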
2. Workspace layout (the contract for state on disk)
A reel workspace is a directory the owning project commits (0001 §3.2 ownership). Commands read/write this shape; tests assert against it.
<workspace>/ # e.g. blog repo: content/post/<slug>/reel/ (location is the project's choice)
workspace.yaml # meta: slug, theme, bundle association, schema version
storyboard.json # creative seed: per card text + vo + scene + voice (§3.1) [committed]
social.json # per-platform social set + status/schedule/result (§4.4) [committed]
cover.png # bookend art (from `keryx cover` or supplied) [committed]
vo/NN.mp3 # SELECTED narration per card (NN = card index) [committed]
cards/NN.{png,mp4} # SELECTED card media (image|video, generated|uploaded) [committed]
music.mp3 # SELECTED bed [committed]
reel.mp4 # rendered output [committed]
# --- disposable, git-ignored (regenerable; cleared by `keryx reel prune`) ---
vo/takes/NN-T.mp3 # candidate VO takes [ignored]
cards/takes/NN-T.{png,mp4} # candidate card media [ignored]
music/takes/T.mp3 # candidate beds [ignored]
.cache/ # content-addressed generation cache (§3.2) [ignored]
R-WS-1 (MUST) selected artefacts (vo/NN.mp3, cards/NN.*, music.mp3) are
what assembly consumes; */takes/* are candidates. select promotes a take to
the selected slot. R-WS-2 (MUST) workspace.yaml records the schema version so
an old workspace fails loudly rather than mis-rendering. R-WS-3 (SHOULD) the
exact location of the workspace within the owning repo is the project's choice;
keryx only requires the internal shape. R-WS-22 (MUST) only the [committed]
files are tracked in git; takes/ and .cache/ are git-ignored (disposable,
regenerable) so the repo doesn't bloat — social.json (not a separate ledger)
holds the posting status/result. R-WS-23 (MUST) on a reel with an associated
bundle (§3.1), the deliverables are also written back into that directory: the
reel-<slug>.mp4 and the human-readable reel-caption.md (R-CAP-4).
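As an illustration of R-WS-1, promoting a VO take is a copy from the candidate path to the selected slot. This sketch assumes NN is a zero-padded two-digit card index and T a plain take number, which the layout above suggests but does not state; the function names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// voTake returns the workspace-relative path of a candidate take.
// Assumption: NN is zero-padded to two digits, T is not padded.
func voTake(card, take int) string {
	return fmt.Sprintf("vo/takes/%02d-%d.mp3", card, take)
}

// voSelected returns the workspace-relative path assembly consumes.
func voSelected(card int) string {
	return fmt.Sprintf("vo/%02d.mp3", card)
}

// selectVO copies the chosen take into the selected slot, leaving the
// candidate in place so it can still be auditioned or pruned later.
func selectVO(workspace string, card, take int) error {
	src := filepath.Join(workspace, filepath.FromSlash(voTake(card, take)))
	dst := filepath.Join(workspace, filepath.FromSlash(voSelected(card)))
	data, err := os.ReadFile(src)
	if err != nil {
		return fmt.Errorf("card %d take %d: %w", card, take, err)
	}
	return os.WriteFile(dst, data, 0o644)
}

func main() {
	ws, err := os.MkdirTemp("", "keryx-ws")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(ws)

	if err := os.MkdirAll(filepath.Join(ws, "vo", "takes"), 0o755); err != nil {
		panic(err)
	}
	take := filepath.Join(ws, filepath.FromSlash(voTake(3, 2)))
	if err := os.WriteFile(take, []byte("fake mp3 bytes"), 0o644); err != nil {
		panic(err)
	}
	if err := selectVO(ws, 3, 2); err != nil {
		panic(err)
	}
	fmt.Println("selected:", voSelected(3))
}
```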
3. CLI command contracts
Format per command: synopsis · purpose · in · out · side-effects · exit · contract assertions.
3.1 Workspace & storyboard
keryx reel new <slug> [--from-post post.md] [--bundle <dir>] [--theme <kw>]
- Purpose: scaffold a workspace.
- In: slug; optional source post; theme.
- Out: a new <workspace>/ with workspace.yaml + an empty/derived
storyboard.json; prints the path.
- Side-effects: creates the dir. With --from-post, MAY invoke
storyboard draft (needs an LLM provider).
- Exit: 2 if the slug exists or is invalid; 0 on create.
- Contracts: R-WS-4 (MUST) creates a schema-valid (possibly empty)
storyboard; R-WS-5 (MUST) refuses to clobber an existing workspace without
--force; R-WS-15 (MUST) --bundle <dir> records the association (below).
keryx reel list|open|rename|duplicate|rm <slug> — project reel CRUD
- Purpose: manage the set of reels in a project (a project has many).
- In: the project's reel root (discovered/configured); slug(s).
- Out: list prints each reel with status (draft/approved/posted, from
social.json) + associated bundle (human or --json); the others mutate.
- Side-effects: duplicate copies a workspace (storyboard + theme, not the
posting status); rm deletes (confirm unless --yes); rename re-slugs.
- Contracts: R-WS-16 (MUST) list enumerates every workspace under the
project root with its status; R-WS-17 (MUST) duplicate copies authoring
inputs but resets posting status in social.json (a copy hasn't been posted);
R-WS-18 (MUST) rm/rename are guarded (confirm / no clobber); these are
the CRUD parity for the studio reels library (R-UI-24).
keryx reel link <slug> <dir> — associate a reel with a content directory
- Purpose: record the reel's associated content (a page bundle or any
directory, §3.2) — where the reel belongs and a context source for the AI.
- Out: writes the association to workspace.yaml.
- Contracts: R-WS-19 (MUST) the linked dir is recorded and the finished
reel-<slug>.mp4 write-back targets it; R-WS-20 (MUST) keryx reads the
dir's text/assets as read-only context for storyboard draft / chat;
R-WS-21 (SHOULD) any directory is accepted (not Hugo-specific); a missing/
unreadable dir warns, doesn't crash.
keryx storyboard draft <post.md> [-o board.json] [--theme <kw>]
- Purpose: AI first-draft of a storyboard from a post (optional helper).
- In: post markdown; theme (for tone/style hints).
- Out: a storyboard.json (stdout or -o/workspace) — a draft for human
edit, never auto-final.
- Side-effects: one LLM call (GTB chat client).
- Exit: 2 if no LLM provider configured (with a hint); 0 on draft.
- Contracts: R-WS-6 (MUST) output validates against the storyboard schema;
R-WS-7 (MUST) keryx runs fully without this command (hand-authored JSON is
always accepted); R-WS-8 (SHOULD) marks the draft as unreviewed in
workspace.yaml.
Storyboard validation rules (the schema's semantic checks — surfaced
identically by the CLI exit 2, the API 422 (R-API-1), and the studio
inline validation (R-UI-6); discovered while building the studio mockup):
- R-WS-9 (MUST) every card has non-empty text (a mono URL card may be the
bare URL).
- R-WS-10 (MUST) a card with mode: overlay MUST have a non-empty scene.
- R-WS-11 (MUST) *accent* markers are balanced (even count) per card.
- R-WS-12 (MUST) palette roles reference defined palette keys; role
applicability is mode-dependent — block uses bg/fg/accent; overlay
ignores bg (the illustration fills it) and uses fg/accent for the
scrim-overlaid text. A bg on an overlay card is accepted but inert (SHOULD
warn).
- R-WS-13 (SHOULD) the last card is the mono URL closer (a strong convention,
warned not errored).
- R-WS-14 (MUST) cover bookend cards reference an available cover image at
render time (validated at assemble, not author, time).
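Three of the rules above (R-WS-9, R-WS-10, R-WS-11) are simple per-card checks and make good unit tests. The sketch below is illustrative only: the `Card` field names are assumptions (the real schema lives in 0001 §3.1), and it ignores the uploaded-media exception of R-GEN-23, under which an overlay card needs no scene.

```go
package main

import (
	"fmt"
	"strings"
)

// Card holds only the fields these checks need. Field names are assumed.
type Card struct {
	Text  string
	Mode  string // "block" or "overlay"
	Scene string
}

// validateCard returns one message per violated rule for card index i.
func validateCard(i int, c Card) []string {
	var problems []string
	if strings.TrimSpace(c.Text) == "" {
		problems = append(problems, fmt.Sprintf("card %d: text is empty (R-WS-9)", i))
	}
	if c.Mode == "overlay" && strings.TrimSpace(c.Scene) == "" {
		problems = append(problems, fmt.Sprintf("card %d: overlay card has no scene (R-WS-10)", i))
	}
	// *accent* markers come in pairs, so an odd count means one is unclosed.
	if strings.Count(c.Text, "*")%2 != 0 {
		problems = append(problems, fmt.Sprintf("card %d: unbalanced *accent* markers (R-WS-11)", i))
	}
	return problems
}

func main() {
	cards := []Card{
		{Text: "Everyone wants *Rust*", Mode: "block"},
		{Text: "Nobody wants to *rewrite", Mode: "overlay"},
	}
	for i, c := range cards {
		for _, p := range validateCard(i+1, c) {
			fmt.Println(p)
		}
	}
}
```

Returning all problems rather than failing on the first lets the CLI (exit 2), the API (422) and the studio's inline validation report the same list.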
3.2 Generation
keryx cover --scene "..." [--theme <article kw>] [--n N] [-o cover.png]
- In: scene prompt; article theme (style prefix + palette); sample count.
- Out: N PNG samples; prints paths.
- Side-effects: N image-provider calls (spend). Exit: 1 on provider error.
- Contracts: R-GEN-1 (MUST) resolves an article-type theme; R-GEN-2
(MUST) honours --n; R-GEN-3 (SHOULD) appends the hardened wordless
instruction so covers carry no text.
keryx portrait --ref photo.jpg [--ref ...] [--theme <kw>] [--n N] [-o]
- In: one or more reference photos; portrait theme; variants.
- Out: N stylised avatars. Side-effects: image-provider calls.
- Contracts: R-GEN-4 (MUST) --ref repeatable; R-GEN-5 (MUST) resolves a
portrait-type theme; avatar pipeline is independent of the reel pipeline.
keryx voice [--line N] [--text "..."] [--takes N] [--stability x] [--similarity y] [--theme <kw>] [-w|--out vo/]
- In: in workspace mode the narration is each card's vo field (distinct
from the on-screen text, §3.1) plus its per-card voice override; --text
is the standalone single-line path; reel-theme voice settings are the
default, overridable per line by flags.
- Out: mp3(s) into vo/takes/ (workspace) or --out (standalone); prints
durations.
- Side-effects: TTS calls (cached per vo-text + settings).
- Contracts: R-GEN-6 (MUST) --line N regenerates exactly one line, leaving
others untouched; R-GEN-7 (MUST) --takes N produces N candidates;
R-GEN-8 (MUST) unchanged vo-text + settings → cache hit, no re-spend
(R-GLOBAL-9); R-GEN-24 (MUST) the vo text is passed through to the provider
including control tags (ElevenLabs SSML <break>, phonetic spellings) — the
batch used <break time="0.8s"/> for a beat; R-GEN-25 (SHOULD) a per-line
--stability/--similarity is persisted to that card's voice override
so a later re-build/--force reproduces it (the batch bumped single lines to
0.74–0.76). Note: ElevenLabs' pronunciation dictionary is a gated Studio
feature (no API) — the available levers are higher stability + phonetic
spelling in the vo text.
keryx voice select <line> <take> (workspace)
- Purpose: promote a candidate take to the selected slot.
- Out: vo/NN.mp3 ← vo/takes/NN-T.mp3. Side-effects: workspace write only.
- Contracts: R-GEN-9 (MUST) selection updates only that line's selected slot
and is recorded so re-assembly uses it; replaces the manual cp step.
keryx music [--prompt "..."] [--length <dur>] [--takes N] [--theme <kw>] [-w|--out bed.mp3]
- In: prompt (else theme's music tone); length (else auto from the VO-driven
total, 0001 §3.1); takes.
- Out: bed mp3(s). Side-effects: music-provider calls.
- Contracts: R-GEN-10 (MUST) in workspace mode with no --length, length is
computed from the assembled timeline; R-GEN-11 (MUST) --takes/select
parallel to voice. keryx music select <take> promotes the bed.
keryx cards [--card N] [--takes N] [--video] [--theme <kw>] (workspace)
- Purpose: generate per-card overlay media from each card's scene — a still
by default, or a short video with --video (video provider, §3.4).
- In: reel theme illustration style; optional single --card; takes.
- Out: candidates into cards/takes/ (.png or, for video, a clip).
- Side-effects: image- or video-provider calls with the hardened wordless
prompt.
- Contracts: R-GEN-12 (MUST) only overlay cards with a scene are
generated; R-GEN-13 (MUST) --card N re-rolls one card; R-GEN-14 (SHOULD)
candidates are flagged for likely text-leak (see sheet); R-GEN-21 (MUST,
deferred — Phase 5) --video routes to the video provider and the renderer
fits the clip to the card's VO-driven duration (§3.1). The seam/schema/UI exist
from the start (the "video-shaped hole"); generated-video + video compositing
are built last. Uploaded video (R-GEN-22) works without this. Full feature
spec: 0003-video-panels.md (R-VID-*).
keryx cards set <card> <file> — use pre-rendered media instead of AI
- Purpose: assign a user-supplied image or video to a card, bypassing
generation (the editor file picker, R-UI-29, does the same).
- Out: copies the file into the workspace and sets the card's media
(source: uploaded, kind by file type); marks it selected.
- Contracts: R-GEN-22 (MUST) accepts image and video files, validates type/
dimensions/aspect (warns on non-9:16), and uses it as-is — no AI call,
no text-leak screen (the user owns the media); R-GEN-23 (MUST) an uploaded
card needs no scene and is not overwritten by a later keryx cards run
unless --force.
keryx cards select <card> <take> / keryx cards sheet [-o sheet.png]
- select: promote a clean generated take to the card's media slot.
- sheet: render a contact sheet of current card media for a one-look text-leak
screen. Contracts: R-GEN-15 (MUST) sheet shows every card with its index;
R-GEN-16 (MUST) select sets the slot assembly consumes; R-GEN-26 (SHOULD)
keryx auto-screens candidates for text leaks via OCR — flag/reject any
with detected glyphs (the batch caught these by eye), the contact sheet being
the human fallback; R-GEN-27 (MUST) the scene sent to the provider is
pure visual description — directive/metaphor clauses leak as captions (§3.1).
keryx reel build [--storyboard board.json --cover cover.png | -w <slug>] [--silent] [--only-line N] [--force] [-o reel.mp4]
- Purpose: assemble the 9:16 reel. (reel is the lifecycle noun group, §3.1 —
build is the assemble verb, so bare keryx reel is never an action.)
- In: storyboard + cover + selected vo/music/cards (workspace), or explicit
paths (standalone); reel theme.
- Out: reel.mp4; prints duration + dimensions.
- Side-effects: ffmpeg render (no provider spend; consumes existing assets).
- Exit: 2 on invalid storyboard; 1 on ffmpeg failure.
- Contracts: R-GEN-17 (MUST) --silent renders dur-timed, audio-free output
(no VO/music needed) for the fast proof; R-GEN-18 (MUST) output is 1080×1920
H.264+AAC and duration ≈ Σ(VO+lead+tail) − xfades (probe-tested, §8);
R-GEN-19 (MUST) --only-line N re-assembles reusing unchanged takes;
R-GEN-20 (MUST) the final card is the dedicated URL closer; orphan-controlled
wrapping applies.
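The duration contract in R-GEN-18 is plain arithmetic, sketched below so a probe test has something to compare against. Assumptions: lead and tail are constant per card, there is one crossfade between each adjacent pair of cards, and the numeric values are invented for the example; the authoritative timing rules are in 0001 §3.1.

```go
package main

import "fmt"

// reelSeconds computes Σ(VO + lead + tail) − xfades, in seconds.
// Assumption: n cards are joined by n−1 crossfades of equal length.
func reelSeconds(vo []float64, lead, tail, xfade float64) float64 {
	total := 0.0
	for _, d := range vo {
		total += lead + d + tail
	}
	if n := len(vo); n > 1 {
		total -= float64(n-1) * xfade
	}
	return total
}

func main() {
	vo := []float64{2.0, 3.5, 1.5} // per-card narration lengths (illustrative)
	// (2.0+0.7) + (3.5+0.7) + (1.5+0.7) = 9.1, minus 2 × 0.5 = 8.1
	fmt.Printf("%.1fs\n", reelSeconds(vo, 0.3, 0.4, 0.5)) // 8.1s
}
```

A probe test would then assert that the rendered reel.mp4 duration is within a small tolerance of this value, since the contract says "≈" rather than "=".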
keryx social [-w <slug>] [--platform <p>] (was caption)
- Purpose: compose the social elements for the reel — supporting text,
hashtags, link, title — that posting consumes (PostMeta, 0001 §4.1).
- In: the reel's storyboard / source (and associated bundle, §3.1) for context;
the LLM provider; --platform to target one platform.
- Out: a per-platform social set persisted in the workspace (e.g. social.json)
and printed; --platform emits just that platform's set.
- Contracts: R-CAP-1 (MUST) produces a base set (caption, tags, link, title)
and, per --platform, a platform-appropriate variant; R-CAP-2 (MUST)
each variant respects that platform's constraints (§4.4) — caption length,
hashtag count/style, link handling (clickable vs "link in bio"), title vs
description — and flags violations; R-CAP-3 (SHOULD) the studio Publish panel
(R-UI-25) reads/writes the same social.json; R-CAP-4 (MUST) on a reel with
an associated bundle, also writes a human-readable caption file
(reel-caption.md, per-platform prose) into the bundle for manual/
cross-check posting — an explicit batch requirement, a first-class committed
deliverable beside reel-<slug>.mp4 (social.json stays the machine form in
the workspace).
3.3 Themes
keryx theme list|show|add|edit|rm (per 0001 §6.3)
- Contracts: R-THEME-1 (MUST) list [--type] shows the catalog grouped by
type; R-THEME-2 (MUST) add --type <t> [--from <kw>] creates/clones and
persists via GTB config; R-THEME-3 (MUST) keyword is unique within a type
(same keyword across types is allowed); R-THEME-4 (MUST) edit round-trips
(write then show reflects it); R-THEME-5 (SHOULD) rm of a theme that is a
defaults.<type> is refused or reassigns the default with confirmation.
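R-THEME-3 is easy to pin in a test if the catalog is keyed by the (type, keyword) pair. The sketch below is a hypothetical in-memory catalog, not the GTB-config-backed one; the keyword "noir" is invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDuplicateTheme is returned when a keyword already exists within a type.
var ErrDuplicateTheme = errors.New("theme keyword already exists for this type")

// themeKey makes uniqueness per type: the same keyword may appear under
// different types, but only once within each.
type themeKey struct {
	Type    string
	Keyword string
}

// Catalog is a hypothetical in-memory theme catalog.
type Catalog struct {
	themes map[themeKey]bool
}

func NewCatalog() *Catalog {
	return &Catalog{themes: map[themeKey]bool{}}
}

// Add registers a theme or fails if the (type, keyword) pair is taken.
func (c *Catalog) Add(themeType, keyword string) error {
	key := themeKey{Type: themeType, Keyword: keyword}
	if c.themes[key] {
		return fmt.Errorf("%s/%s: %w", themeType, keyword, ErrDuplicateTheme)
	}
	c.themes[key] = true
	return nil
}

func main() {
	catalog := NewCatalog()
	fmt.Println(catalog.Add("reel", "noir"))    // <nil>
	fmt.Println(catalog.Add("article", "noir")) // <nil>: same keyword, different type
	fmt.Println(catalog.Add("reel", "noir"))    // reel/noir: theme keyword already exists for this type
}
```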
3.4 Posting & approval
Each platform's social record (§4.4) holds a status draft → approved →
posted, an optional scheduled_at, and (once posted) posted_at + post_url.
This record is the idempotency ledger.
keryx approve <platform|all> [-w <slug>] [--at <datetime>] [--revoke]
- Purpose: the safety gate — mark a platform's social set ready to post, and
optionally schedule it.
- Out: sets status approved (+ scheduled_at if --at); --revoke returns
it to draft. Persisted (a git commit, §1.5).
- Contracts: R-POST-9 (MUST) approval is required before any post; R-POST-10
(MUST) approving a platform whose social fails its constraints (§4.4) is refused
with the violations; R-POST-11 (SHOULD) --at sets the per-outlet schedule
the unattended runner consumes.
keryx post <platform> <file|-w> ... [--dry-run] [--force]
- In: video (or workspace reel.mp4); the platform's social set; token.
- Out: post id/URL; updates the record to posted + posted_at + post_url.
- Exit: 2 if not authed or not approved; 1 on API failure; 0 on
publish / --dry-run pass.
- Contracts: R-POST-1 (MUST) --dry-run validates auth + video vs the
platform's limits and posts nothing; R-POST-2 (MUST) refuses unless
status is approved (the accidental-posting guard); R-POST-3 (MUST) a
posted platform is a no-op (idempotent) unless --force; on success sets
posted + posted_at + post_url; R-POST-4 (SHOULD) bounded retry/backoff;
rate limits respected. On-demand posting is available from CLI and the studio
(R-UI-27); unattended posting is CI-only.
keryx post all [-w <slug>|<file>] [--dry-run] (one reel, all platforms)
- Contracts: R-POST-5 (MUST) posts each enabled + approved platform
independently; R-POST-6 (MUST) records each success even if another fails;
R-POST-7 (MUST) exits non-zero if any failed, per-platform results in
--json; R-POST-8 (MUST) resumable/idempotent (skips posted).
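The four post all contracts (R-POST-5 to R-POST-8) describe a loop that never lets one platform's failure stop the others. The sketch below shows that shape under stated assumptions: `Platform`, `Result` and `postAll` are hypothetical, the status strings follow the draft → approved → posted record, and the injected `post` function stands in for the real platform client.

```go
package main

import (
	"errors"
	"fmt"
)

// Platform is a minimal view of one platform's social record.
type Platform struct {
	Name    string
	Enabled bool
	Status  string // "draft", "approved" or "posted"
}

// Result is the per-platform outcome reported in --json.
type Result struct {
	Platform string
	Outcome  string // "posted", "skipped" or "failed"
	Err      error
}

var errRateLimited = errors.New("rate limited")

// postAll posts each enabled + approved platform independently, records
// every success immediately, and returns exit code 1 if any post failed.
// Already-posted platforms are skipped, which makes a rerun resumable.
func postAll(platforms []Platform, post func(name string) error) ([]Result, int) {
	results := make([]Result, 0, len(platforms))
	exit := 0
	for i := range platforms {
		p := &platforms[i]
		if !p.Enabled || p.Status != "approved" {
			results = append(results, Result{Platform: p.Name, Outcome: "skipped"})
			continue
		}
		if err := post(p.Name); err != nil {
			exit = 1
			results = append(results, Result{Platform: p.Name, Outcome: "failed", Err: err})
			continue
		}
		p.Status = "posted" // recorded even if a later platform fails
		results = append(results, Result{Platform: p.Name, Outcome: "posted"})
	}
	return results, exit
}

func main() {
	platforms := []Platform{
		{Name: "instagram", Enabled: true, Status: "approved"},
		{Name: "youtube", Enabled: true, Status: "approved"},
		{Name: "tiktok", Enabled: true, Status: "posted"},
	}
	results, exit := postAll(platforms, func(name string) error {
		if name == "youtube" {
			return errRateLimited
		}
		return nil
	})
	for _, r := range results {
		fmt.Println(r.Platform, r.Outcome, r.Err)
	}
	fmt.Println("exit", exit)
}
```

Running the same call again after the failure would skip instagram (now posted) and retry only youtube, which is the resumable behaviour R-POST-8 describes.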
keryx post due [-w <slug>] — the CI / scheduled entrypoint
- Purpose: the unattended path. Without -w, scans all of the project's
reels; with -w, one reel. Posts only approved platforms whose
scheduled_at is due.
- Contracts: R-POST-12 (MUST) posts exactly the approved + due platforms
across the targeted scope and is idempotent (R-POST-3); R-POST-13 (MUST)
is the only command CI runs unattended; a not-yet-due or unapproved platform is
skipped (not an error).
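The selection rule behind post due (R-POST-12, R-POST-13) is a filter over social records. In this sketch `Record`, `mustParse` and `duePlatforms` are hypothetical names, and treating an approved platform with no scheduled_at as "not due" is an assumption drawn from the wording "approved platforms whose scheduled_at is due".

```go
package main

import (
	"fmt"
	"time"
)

// Record is a minimal view of one platform's social record.
type Record struct {
	Platform    string
	Status      string     // "draft", "approved" or "posted"
	ScheduledAt *time.Time // nil when no schedule was set
}

// mustParse is a demo helper for RFC 3339 timestamps.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

// duePlatforms returns the platforms that are approved and whose schedule
// has arrived. Everything else is skipped silently, not treated as an error.
func duePlatforms(records []Record, now time.Time) []string {
	var due []string
	for _, r := range records {
		if r.Status != "approved" || r.ScheduledAt == nil {
			continue
		}
		if !r.ScheduledAt.After(now) {
			due = append(due, r.Platform)
		}
	}
	return due
}

func main() {
	now := mustParse("2026-06-14T12:00:00Z")
	past := mustParse("2026-06-14T09:00:00Z")
	future := mustParse("2026-06-15T09:00:00Z")
	records := []Record{
		{Platform: "instagram", Status: "approved", ScheduledAt: &past},
		{Platform: "youtube", Status: "approved", ScheduledAt: &future},
		{Platform: "tiktok", Status: "draft", ScheduledAt: &past},
		{Platform: "linkedin", Status: "posted", ScheduledAt: &past},
	}
	fmt.Println(duePlatforms(records, now)) // [instagram]
}
```

Passing `now` in as a parameter keeps the filter deterministic, in line with R-GLOBAL-10.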
3.5 Auth & tokens
keryx auth <platform>
- Purpose: interactive OAuth capture (local browser).
- Out: token stored via GTB keychain; prints success + expiry.
- Contracts: R-AUTH-1 (MUST) interactive only (not for CI); R-AUTH-2 (MUST)
stored secret never printed/logged (GTB redaction).
keryx auth refresh [--platform <p>] [--dry-run] (was refresh-tokens)
- Purpose: CI job — refresh/rotate tokens and write back to the writable
store (0001 §4.2).
- Contracts: R-AUTH-3 (MUST) persists a rotated refresh token immediately
(TikTok rotates every use); R-AUTH-4 (MUST) on failure exits non-zero and
alerts (silent staleness is the top risk); R-AUTH-5 (MUST) --dry-run
reports token ages/expiries without mutating.
3.6 Tool / GTB defaults
init, doctor, config, update, docs, changelog, keychain come from
GTB. keryx extends two:
- R-TOOL-1 (MUST) keryx init seeds the theme catalog + default config in the
owning project (0001 §6.2).
- R-TOOL-2 (MUST) keryx doctor checks ffmpeg/ffprobe + fonts present and
versioned, provider credentials resolve, and each enabled platform token is
non-stale; exits non-zero if a required dependency is missing (gates CI).
4. Web UI: functionality, UX, and API contract
The studio (0001 §10) is a local, single-user web app started by
keryx studio. It is a richer front-end over the same workspace; anything
it does maps to a CLI operation on that workspace.
4.1 Functional requirements
v1 (required) — author & adjust:
- R-UI-28 (MUST) Project switcher — the studio's top level is a project
picker (a user has many projects). A project may be a local directory or a
remote git repo (§1.5); switching reloads that project's reels/config/themes.
The CLI has no switcher (it scopes to the cwd, 0001 §3.5).
- R-UI-30 (SHOULD) Project settings — a Settings panel (project-scoped) to
view/edit themes, AI providers (providers.{image,video,voice,music,render}),
platform enablement, and defaults. It writes the project's .keryx.yaml
(R-CFG-1), rejects invalid writes (R-CFG-4), never writes secrets
(R-CFG-2 — credentials are managed separately), and changes take effect via
hot reload (R-CFG-3). Reached from the project/library level (and the ⋯
overflow on mobile).
- R-UI-24 (MUST) Reels library (CRUD) — within a project the studio opens
on a list of its reels, not a single reel. Create / open / duplicate / rename
/ delete, each row showing status (draft → approved → posted) and its associated
content. The single-reel editor is the drill-in. Mirrors
keryx reel list|open|rename|duplicate|rm (R-WS-16..18); UI↔CLI parity.
- R-UI-1 (MUST) Open/select a reel workspace and load its storyboard.json.
- R-UI-2 (MUST) Compose the storyboard: add / remove / reorder cards;
edit per card: on-screen text and the separate vo narration (§3.1 —
the batch proved they differ; the vo field accepts SSML <break> and a
per-line voice stability override), palette roles, accent words, mode
(block/overlay), scene, cover/mono flags. The editor is mode-adaptive
(R-UI-17): block shows bg/fg/accent; overlay shows fg/accent + scene +
the illustration slot and treats bg as inert (the image fills it), per the
validation rules R-WS-12.
- R-UI-3 (MUST) Upload images: cover art and portrait reference photos,
saved into the workspace.
- R-UI-29 (MUST) Per-card media source — for an overlay card the editor
offers Generate (AI) or Upload (a file picker for a pre-rendered
image or video). Uploaded media is used as-is (no AI call, no text-leak
screen, R-GEN-22); generated video uses the video provider (R-GEN-21). A
card backed by uploaded media needs no scene. The card shows which source/
kind it has (still vs video, generated vs uploaded).
- R-UI-4 (MUST) Associated content / source (R-UI-18): an "Associated
content" panel that either links a content directory (a page bundle or any
directory, R-UI-26) or holds pasted source text. It feeds
storyboard draft and chat ("draft from this post") and records where the reel
belongs (§3.2). The seed text/context has a home, not only the chat box.
- R-UI-26 (MUST) Bundle association: link/unlink the reel to a directory;
the UI shows the dir's detected files (post markdown, cover, assets) and uses
them as read-only AI context; the finished reel write-back targets it. Any
directory is accepted (not Hugo-specific); optional.
- R-UI-5 (MUST) Chat to adjust: a streaming chat (GTB chat client) that
proposes storyboard edits — including structural ones (add / remove /
reorder cards), not just text (see R-UI-15 / R-API-2); the human
accepts/rejects — nothing auto-applies.
- R-UI-6 (MUST) Validation feedback: invalid storyboard state is shown
inline against the R-WS-9..14 rules and blocks save of broken JSON.
- R-UI-7 (MUST) Save writes storyboard.json (the same file the CLI uses).
- R-UI-23 (MUST) Unsaved-changes guard: edits are local until Save; the UI
shows a dirty indicator and warns before discarding on workspace/theme
switch, reload, or close. (Spotted building the mockup — nothing protected
local edits.)
v2 (optional / later) — preview & produce:
- R-UI-8 (MAY) Render + view the silent draft in-browser. Note: the v1
per-card preview is necessarily approximate for overlay cards (palette +
text + image placeholder); the faithful preview of an overlay card requires
its generated illustration, so true preview is this v2 silent draft.
- R-UI-9 (MAY) Generate/audition voice & music takes and select them.
- R-UI-10 (MAY) Generate/re-roll card illustrations, with the contact-sheet
text-leak screen.
- R-UI-11 (MAY) Trigger a full render and preview the reel.
- R-UI-12 (MAY) Review the posting status (social.json) / trigger posting
(likely stays CLI/CI).
- R-UI-19 (MAY) View / edit the generated social set in the studio.
- R-UI-25 (SHOULD) Publish / social composition (per platform) — a panel
to generate and edit each platform's social elements (supporting text,
hashtags, link, title) for Instagram / YouTube / TikTok / LinkedIn, that
steers the user to each platform's conventions: live character count vs the
cap, hashtag-count guidance, link-handling hint (clickable vs "link in bio"),
and title-vs-description where relevant (the §4.4 constraints). Reads/writes
social.json (keryx social, R-CAP-3).
- R-UI-27 (SHOULD) Approve, schedule & post on demand — per platform the
Publish panel shows the status (draft → approved → posted), an Approve
toggle (gating posting, R-POST-2; refused if constraints fail), an optional
schedule date/time (scheduled_at, R-SOC-7), and a Post now action
enabled only when approved. On-demand posting from the UI is human-initiated;
unattended posting remains CI-only (§4.3) and MCP posting tools stay gated
(R-MCP-2). Posting from the UI calls the same path as keryx post.
4.2 UX
CLI UX principles (cross-cutting): human-readable by default with --json
for scripts; long operations stream progress; spend/destructive actions confirm
unless --yes; errors carry an actionable hint; everything is workspace-relative
so a session is resumable. The CLI is the complete interface — the UI never
exposes anything the CLI can't do (R-UI-13, MUST: UI↔CLI parity on the
workspace).
Visual language. Minimalist and modern: a neutral light base (white / near-white panels, soft slate text, hairline borders), with teal and amber as restrained accents only — active states, primary actions, links (teal), and highlights / accent words / key CTAs (amber). Not large blocks of brand colour; the palette earns its place in the reel, the tool chrome stays quiet. Dark mode is a nice-to-have, not required.
Responsive & mobile-first (R-UI-20, MUST). The studio MUST be usable on a
phone — the primary device in practice. The top bar collapses to
essentials on narrow screens (back · reel name · validation · Save), moving
secondary controls (project switcher, theme) into an overflow menu and
dropping the Edit/Publish segment in favour of the bottom tabs — it must not
overflow/scroll horizontally on a ~400px device. Layout adapts by width:
- Desktop (wide): three panes — card list · editor · chat — with the card
list and chat panes collapsible (R-UI-21, MUST) to give the editor room on
smaller displays; collapsed state persists.
- Mobile (narrow): a single column with bottom tab navigation between
Cards / Editor / Chat (and Source / Assets); the chat is a first-class
tab because on mobile it is the primary authoring surface (propose→accept is
the main interaction). Touch targets ≥44px; card reorder works by touch
(handle drag or up/down controls).
Three levels. (1) Reels library (the home — R-UI-24), (2) the reel
editor (storyboard authoring), (3) the Publish panel (per-platform social,
R-UI-25). The library is the entry; a reel drills into editor + publish (a
tab/segment within the reel).
Library (home) Reel editor (desktop, panes collapsible)
┌─────────────────────────┐ ┌──────────────────────────────────────────┐
│ Reels [+ new] │ │ ‹reels everyone-wants-rust ▾ [Edit|Publish]│
│ ┌─────────────────────┐ │ ├────────┬───────────────────────┬─────────┤
│ │ everyone-wants-rust │ │ │ CARDS◧ │ EDITOR │ CHAT ◨ │
│ │ ● approved · 4 posts│ │ │ 1 hook │ text / accent │ propose │
│ │ ↳ content/post/… │ │ │ 2 … │ mode ▸ overlay|block │ diff │
│ ├─────────────────────┤ │ │ 3 …◀ │ scene + image (ovl) │ [✓][✗] │
│ │ switched-it-off │ │ │ +add │ fg/accent · flags │ │
│ │ ○ draft │ │ ├────────┴───────────────────────┴─────────┤
│ │ open dup rename ⋮│ │ │ Associated: content/post/… · assets │
│ └─────────────────────┘ │ └──────────────────────────────────────────┘
└─────────────────────────┘ Publish panel: per-platform tabs (IG/YT/TT/LI),
Mobile: same list; tap a each: supporting text + live char-count vs cap,
reel → editor (bottom tabs hashtags, link (clickable | link-in-bio), title.
Cards/Editor/Chat/Publish).
- R-UI-14 (MUST) Reordering cards updates their order in storyboard.json.
A drag-and-drop handle is the primary method — a generously sized
(≥44px) grab handle per card, working with touch and pointer (the small up/down
arrows were poor touch targets on mobile). A keyboard-accessible move
(focus handle + arrow keys, or a menu action) is required for a11y so reorder
isn't drag-only.
- R-UI-15 (MUST) The chat proposes a storyboard patch (one or more ops:
edit-text, set-field, add/remove/reorder card) shown as a reviewable diff;
accept applies it to the working storyboard (still unsaved until Save),
reject discards — the human gate is explicit and visible. Multi-op patches are
accept/reject as a unit (per-op granularity MAY come later).
- R-UI-16 (SHOULD) The card list shows each card's mode + whether its
illustration/VO take is present/selected (asset status at a glance).
- R-UI-22 (SHOULD) Theme switch while editing re-applies the theme's card
treatment to previews and flags that existing card illustrations were generated
in the previous theme's style (offer to re-roll), rather than silently
mismatching.
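The accept-as-a-unit rule in R-UI-15 can be sketched as a function that applies ops to a copy and discards the copy on the first failure. This is a minimal sketch, not keryx's implementation: the Card and Op field layouts, the ErrPatch name, and the Apply signature are all hypothetical, and set-field is left out for brevity. Only the op names come from the requirement.

```go
package main

import (
	"errors"
	"fmt"
)

// Card is a hypothetical, pared-down card; the real storyboard schema is richer.
type Card struct {
	ID   string
	Text string
}

// Op is one patch operation. The op names follow R-UI-15; the field layout
// is an assumption made for this sketch.
type Op struct {
	Kind  string // "edit-text", "add", "remove" or "reorder"
	ID    string // card the op targets (or creates, for "add")
	Text  string // new text for "edit-text" and "add"
	Index int    // destination position for "add" and "reorder"
}

// ErrPatch marks a patch that does not apply to the working storyboard.
var ErrPatch = errors.New("patch does not apply")

func indexOf(cards []Card, id string) int {
	for i, c := range cards {
		if c.ID == id {
			return i
		}
	}
	return -1
}

func insertAt(cards []Card, at int, c Card) []Card {
	cards = append(cards, Card{})
	copy(cards[at+1:], cards[at:]) // shift the tail right by one
	cards[at] = c
	return cards
}

func removeAt(cards []Card, i int) []Card {
	return append(cards[:i], cards[i+1:]...)
}

// Apply runs every op against a copy of the working storyboard. If any op
// fails, the original slice is returned unchanged, so a multi-op patch is
// accepted or rejected as a unit.
func Apply(cards []Card, ops []Op) ([]Card, error) {
	work := append([]Card(nil), cards...)
	for n, op := range ops {
		i := indexOf(work, op.ID)
		switch {
		case op.Kind == "edit-text" && i >= 0:
			work[i].Text = op.Text
		case op.Kind == "add" && i < 0 && op.Index >= 0 && op.Index <= len(work):
			work = insertAt(work, op.Index, Card{ID: op.ID, Text: op.Text})
		case op.Kind == "remove" && i >= 0:
			work = removeAt(work, i)
		case op.Kind == "reorder" && i >= 0 && op.Index >= 0 && op.Index < len(work):
			c := work[i]
			work = insertAt(removeAt(work, i), op.Index, c)
		default:
			return cards, fmt.Errorf("op %d (%s %q): %w", n, op.Kind, op.ID, ErrPatch)
		}
	}
	return work, nil
}

func main() {
	board := []Card{{ID: "a", Text: "hook"}, {ID: "b", Text: "middle"}, {ID: "c", Text: "end"}}
	patch := []Op{
		{Kind: "edit-text", ID: "a", Text: "new hook"},
		{Kind: "reorder", ID: "c", Index: 0},
	}
	next, err := Apply(board, patch)
	fmt.Println(next, err)
}
```

Working on a copy keeps reject trivial: the caller simply keeps the slice it already had. The same property gives a stale or partly invalid patch no way to leave the working storyboard half-edited.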
4.3 HTTP API contract (illustrative)¶
The frontend talks to a thin keryx HTTP API over the workspace (served by GTB
pkg/http). Endpoints map 1:1 to CLI operations; both are tested against the
same workspace assertions.
| Method · path | CLI analogue | Notes |
|---|---|---|
| GET /api/projects · POST .../switch | (cwd; studio-only) | list/switch projects, incl. remote git (R-UI-28, R-GIT-2) |
| GET·PUT /api/config | edits .keryx.yaml | Settings panel; 422 on invalid; no secrets (R-UI-30, R-CFG-2..4) |
| GET /api/reels | reel list | project reels + status + association (library, R-UI-24) |
| POST /api/reels | reel new | create; {slug, theme, bundle?} |
| POST /api/reels/{slug}:duplicate · :rename | reel duplicate/rename | CRUD |
| DELETE /api/reels/{slug} | reel rm | guarded |
| GET /api/workspace/{slug} | (read workspace) | storyboard + asset/social status |
| PUT /api/workspace/{slug}/storyboard | save storyboard | validated; 422 on schema error |
| POST /api/workspace/{slug}/assets | (upload) | cover / ref image upload |
| PUT /api/workspace/{slug}/bundle | reel link | associate a content dir (R-UI-26) |
| PUT /api/workspace/{slug}/source | (hold source text) | pasted source feeding draft/chat |
| POST /api/workspace/{slug}/chat (SSE) | storyboard draft + edits | streams tokens + a proposed patch (ops) |
| GET·PUT /api/workspace/{slug}/social | social | per-platform social set (R-UI-25) |
| POST /api/workspace/{slug}/social/{platform}:approve | approve | set approved + schedule (R-UI-27) |
| POST /api/workspace/{slug}/social/{platform}:post | post | on-demand; requires approved (R-POST-2) |
| POST /api/workspace/{slug}/generate | voice/music/cards/reel build | async; {target,...}; v2 |
| GET /api/workspace/{slug}/preview/silent | reel --silent | v2 |
| GET /healthz | (GTB) | liveness |
- R-API-1 (MUST) PUT storyboard rejects schema-invalid bodies with 422 and field-level errors, and on success writes the same storyboard.json the CLI reads.
- R-API-2 (MUST) the chat endpoint never auto-writes the storyboard; it returns a proposed edit the client must confirm (mirrors R-UI-15).
- R-API-3 (MUST) the server binds localhost by default and is not exposed without explicit opt-in + auth (0001 §10.2).
- R-API-4 (SHOULD) the API is versioned (/api/v1/...).
- R-API-5 (MUST) the chat patch is a structured set of storyboard ops (edit-text / set-field / add / remove / reorder) the client validates and applies on accept; it is never applied server-side unconfirmed (mirrors R-UI-15).
- R-API-6 (MUST) a patch is generated against a storyboard revision; on accept the client rebases/validates it against the current working state and rejects (re-proposes) if it no longer applies cleanly. A stale proposal must not silently corrupt the storyboard.
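One way R-API-1 might look with only the Go standard library is sketched below. Everything beyond the 422 status and the idea of field-level errors is assumed for illustration: the FieldError shape, the two validation rules, the {"errors": [...]} envelope, the 204 on success, and the save callback standing in for the storyboard.json write. The real server sits on GTB pkg/http, which is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// FieldError is a hypothetical field-level error entry.
type FieldError struct {
	Field   string `json:"field"`
	Message string `json:"message"`
}

// storyboard is a hypothetical minimal shape, just enough to validate.
type storyboard struct {
	Cards []struct {
		Text string `json:"text"`
	} `json:"cards"`
}

// validate applies two assumed rules: at least one card, and no blank text.
func validate(body []byte) []FieldError {
	var sb storyboard
	if err := json.Unmarshal(body, &sb); err != nil {
		return []FieldError{{Field: "", Message: "body is not a JSON storyboard"}}
	}
	var errs []FieldError
	if len(sb.Cards) == 0 {
		errs = append(errs, FieldError{Field: "cards", Message: "at least one card is required"})
	}
	for i, c := range sb.Cards {
		if strings.TrimSpace(c.Text) == "" {
			errs = append(errs, FieldError{Field: fmt.Sprintf("cards[%d].text", i), Message: "must not be empty"})
		}
	}
	return errs
}

// putStoryboard builds the handler; save stands in for writing storyboard.json.
func putStoryboard(save func([]byte) error) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		if errs := validate(body); len(errs) > 0 {
			w.Header().Set("Content-Type", "application/json")
			w.WriteHeader(http.StatusUnprocessableEntity) // 422, nothing is written
			_ = json.NewEncoder(w).Encode(map[string][]FieldError{"errors": errs})
			return
		}
		if err := save(body); err != nil {
			http.Error(w, "cannot save storyboard", http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusNoContent)
	}
}

// try sends one PUT through the handler in-process and reports the status
// code and whether the save callback ran.
func try(body string) (status int, saved bool) {
	h := putStoryboard(func([]byte) error { saved = true; return nil })
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodPut, "/api/workspace/demo/storyboard", strings.NewReader(body)))
	return rec.Code, saved
}

func main() {
	fmt.Println(try(`{"cards":[{"text":"hook"}]}`)) // 204 true
	fmt.Println(try(`{"cards":[{"text":" "}]}`))    // 422 false
}
```

Keeping validate as a pure function means the same check can back both the API 422 path and the CLI's schema-failure exit, which is the parity the shared workspace assertions are meant to pin.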
4.4 Per-platform social constraints (steering reference)¶
The Publish panel (R-UI-25) and keryx social --platform (R-CAP-2) steer
the user to each platform's norms. The values below are guidance to encode +
verify against live docs, not gospel — they drive the UI's char-counts,
hashtag hints, and link handling. (mid-2026; confirm before relying on numerics.)
| Platform | Primary text | Hashtags (norm) | Links | Title |
|---|---|---|---|---|
| Instagram | caption ~2,200 chars | several–~10 (cap 30) | not clickable → "link in bio" | n/a |
| YouTube (Shorts) | description (long) | a few, in description | clickable in description | title required (~100 chars) |
| TikTok | caption (short, ~2,200 but front-load) | a few, trend-led | limited clickability | n/a |
| LinkedIn | post text (front-load ~140) | sparing (1–3) | clickable; first link unfurls | n/a |
- R-SOC-1 (MUST) the UI shows a live character count vs the platform cap and flags overflow before posting.
- R-SOC-2 (MUST) link handling is surfaced per platform (clickable vs "link in bio") so the caption is composed accordingly.
- R-SOC-3 (SHOULD) hashtag guidance reflects the platform norm (e.g. warn on LinkedIn hashtag spam, allow more on Instagram).
- R-SOC-4 (SHOULD) AI generation produces a platform-appropriate draft (tone + shape), not one caption copy-pasted everywhere.
- R-SOC-5 (SHOULD) constraints live in config (per-platform), so they can be tuned as platforms change without a rebuild, consistent with keryx's config-driven posture.
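A rough sketch of how R-SOC-1 to R-SOC-3 could be driven from a per-platform limits table follows. The Limits and Check types and the CheckCaption function are hypothetical names, the numbers are the unverified guidance values from the table above, and two counting choices are assumptions: length is measured in Unicode code points, and each hashtag counts as one space plus its text. Platforms may count differently, so this would need checking against live docs.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// Limits is a hypothetical per-platform config entry (R-SOC-5).
type Limits struct {
	TextCap    int  // hard cap on caption length; 0 means none encoded
	HashtagMax int  // soft norm used for a warning; 0 means no hint
	LinkInBio  bool // true when links in the caption are not clickable
}

// limits holds illustrative values taken from the steering table.
var limits = map[string]Limits{
	"instagram": {TextCap: 2200, HashtagMax: 10, LinkInBio: true},
	"linkedin":  {HashtagMax: 3},
}

// Check is what a live counter could render next to the caption field.
type Check struct {
	Count    int
	Cap      int
	Overflow bool
	Hints    []string
}

// CheckCaption counts the caption plus its hashtags and collects hints.
func CheckCaption(platform, text string, hashtags []string, link string) (Check, error) {
	lim, ok := limits[platform]
	if !ok {
		return Check{}, fmt.Errorf("no limits configured for %q", platform)
	}
	count := utf8.RuneCountInString(text)
	for _, h := range hashtags {
		count += 1 + utf8.RuneCountInString(h) // separating space + tag
	}
	c := Check{Count: count, Cap: lim.TextCap}
	c.Overflow = lim.TextCap > 0 && count > lim.TextCap
	if lim.HashtagMax > 0 && len(hashtags) > lim.HashtagMax {
		c.Hints = append(c.Hints,
			fmt.Sprintf("%d hashtags; the norm here is at most %d", len(hashtags), lim.HashtagMax))
	}
	if lim.LinkInBio && link != "" {
		c.Hints = append(c.Hints, "links are not clickable here; point readers to the link in bio")
	}
	return c, nil
}

func main() {
	c, err := CheckCaption("linkedin", "Why we moved to Rust", []string{"#rust", "#go", "#dev", "#code"}, "")
	fmt.Println(c.Count, c.Overflow, c.Hints, err)
}
```

Because the table is plain data, loading it from config instead of a package variable would satisfy R-SOC-5 without changing the check itself.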
Social record (social.json) — per platform:
{ "instagram": {
"text": "…", "hashtags": ["#rust"], "link": "https://…", "title": "",
"status": "draft", // draft → approved → posted (R-POST-2 gate)
"scheduled_at": null, // per-outlet date/time the scheduler consumes
"posted_at": null, "post_url": null // set on success; the idempotency record
}, "youtube": { … }, "tiktok": { … }, "linkedin": { … } }
R-SOC-6 (MUST) status is per platform (draft|approved|posted); only
approved may post (R-POST-2); posted carries posted_at+post_url and is
the idempotency record. R-SOC-7 (MUST) scheduled_at is per platform and is
what the unattended runner consumes (keryx post due, R-POST-12).
R-SOC-8 (SHOULD) editing an approved/posted platform's text drops it back
to draft (re-approval required) so a posted variant can't drift silently.
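The per-platform lifecycle in R-SOC-6 to R-SOC-8 reads naturally as a small state machine. The sketch below is an assumption-laden illustration, not the keryx model: the Record type, its method names, and the two error values are hypothetical; whether an approved platform may be re-approved to change its schedule, and what happens to posted_at/post_url when a posted platform is edited, are not specified in this document, so the sketch allows the former and leaves the latter fields untouched.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type Status string

const (
	Draft    Status = "draft"
	Approved Status = "approved"
	Posted   Status = "posted"
)

var (
	ErrNotApproved   = errors.New("only an approved platform may post")
	ErrAlreadyPosted = errors.New("already posted")
)

// Record mirrors one platform entry of social.json (field names assumed).
type Record struct {
	Text        string
	Status      Status
	ScheduledAt *time.Time
	PostedAt    *time.Time
	PostURL     string
}

// Approve sets approved plus the per-platform schedule (nil means unscheduled).
func (r *Record) Approve(when *time.Time) error {
	if r.Status == Posted {
		return ErrAlreadyPosted
	}
	r.Status, r.ScheduledAt = Approved, when
	return nil
}

// Edit changes the text and drops the platform back to draft (R-SOC-8).
func (r *Record) Edit(text string) {
	if text == r.Text {
		return
	}
	r.Text, r.Status = text, Draft
}

// Due reports whether an unattended runner should post this platform now.
func (r *Record) Due(now time.Time) bool {
	return r.Status == Approved && r.ScheduledAt != nil && !r.ScheduledAt.After(now)
}

// MarkPosted records success; a second call is refused, which is what makes
// the record usable for idempotent retries.
func (r *Record) MarkPosted(url string, now time.Time) error {
	switch r.Status {
	case Posted:
		return ErrAlreadyPosted
	case Approved:
		r.Status, r.PostedAt, r.PostURL = Posted, &now, url
		return nil
	}
	return ErrNotApproved
}

// at builds a time from Unix seconds, to keep the example deterministic.
func at(unix int64) time.Time { return time.Unix(unix, 0) }

func main() {
	rec := &Record{Text: "v1", Status: Draft}
	when := at(100)
	fmt.Println(rec.MarkPosted("https://example.test/p/1", at(50))) // refused: still a draft
	_ = rec.Approve(&when)
	fmt.Println(rec.Due(at(99)), rec.Due(at(100)))
	fmt.Println(rec.MarkPosted("https://example.test/p/1", at(101)), rec.Status)
}
```

A caller such as a due-runner could treat ErrAlreadyPosted as a skip rather than a failure, which keeps retries after a partial run harmless.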
5. MCP interface (GTB mcp feature)¶
keryx mcp runs an MCP server (GTB-provided) that exposes keryx's commands as
MCP tools so an AI assistant can drive keryx programmatically — drafting
storyboards, running takes/select, assembling, composing social. This is what
"enabling MCP aids the contract promises" means: the MCP tool schemas are
derived from the same command definitions as the CLI (§3), so there is one
contract enforced across all three surfaces — no second, drifting definition.
- R-MCP-1 (MUST) Each exposed command maps to one MCP tool whose input schema reflects that command's args/flags (§3); the tool's behaviour and the CLI's MUST be the same operation on the same workspace.
- R-MCP-2 (MUST) Tool exposure is tiered by side-effect. Read/author/generate tools (storyboard draft, reel new, voice/music/cards and their select, reel build, social, theme list/show) are exposed by default. Tools that publish or spend irreversibly or touch secrets (post, post all, post due, approve, auth, auth refresh) are gated (opt-in via config and off by default) so an assistant cannot post publicly or rotate tokens without explicit enablement.
- R-MCP-3 (MUST) Tools that spend credit or write default to a dry-run / confirm posture where the protocol allows, mirroring --dry-run (R-GLOBAL-5); destructive effects require explicit, non-default invocation.
- R-MCP-4 (SHOULD) Tool results are the structured (--json) form of the command output (R-GLOBAL-3), so an assistant gets machine-readable results.
- R-MCP-5 (MUST) The MCP server respects the same config, themes, providers, and workspace resolution as the CLI; it introduces no bypass of validation (R-GLOBAL-6) or the idempotency ledger (R-POST-3).
- R-MCP-6 (SHOULD) Transport is local (stdio) for an assistant on the same host; any networked exposure is opt-in and authenticated (cf. the studio, 0001 §10.2).
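The exposure tiers of R-MCP-2 amount to a small lookup, sketched here under stated assumptions. The tool names are plain strings copied from the requirement (the select variants are omitted because their exact names are not given), the optIn map stands in for whatever config key enables gated tools, and Callable is a hypothetical helper rather than anything GTB provides.

```go
package main

import "fmt"

// defaultTools are exposed without any configuration (read/author/generate).
var defaultTools = map[string]bool{
	"storyboard draft": true, "reel new": true, "voice": true, "music": true,
	"cards": true, "reel build": true, "social": true,
	"theme list": true, "theme show": true,
}

// gatedTools publish, spend irreversibly, or touch secrets; off by default.
var gatedTools = map[string]bool{
	"post": true, "post all": true, "post due": true,
	"approve": true, "auth": true, "auth refresh": true,
}

// Callable reports whether a tool may be invoked. optIn lists the gated
// tools the operator has explicitly enabled; a nil map enables none.
// Unknown names are never callable, even if they appear in optIn.
func Callable(tool string, optIn map[string]bool) bool {
	if defaultTools[tool] {
		return true
	}
	return gatedTools[tool] && optIn[tool]
}

func main() {
	fmt.Println(Callable("social", nil))                        // true
	fmt.Println(Callable("post", nil))                          // false
	fmt.Println(Callable("post", map[string]bool{"post": true})) // true
}
```

Checking membership in the gated set before consulting optIn means a typo or an invented tool name in config cannot widen the exposed surface.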
This makes the studio's chat (§4.1 R-UI-5) and an external assistant two
clients of the same MCP-able operations, and keeps the propose→confirm human
gate intact: an assistant may propose and generate, but publishing stays
gated and human-driven.
6. Traceability¶
Each R-* requirement maps to at least one godog scenario or unit test (0001
§8). The high-value contract scenarios to cover first: the authoring loop
(reel new → storyboard draft → reel build --silent → voice --takes/select →
cards/select → reel build → social), post/post due (success / partial /
idempotent retry / not-due-skip / --dry-run), auth refresh rotation + alert-on-failure, theme
add/edit/resolve, storyboard schema validation (CLI exit 2 and API 422), the
studio chat propose→confirm gate, and MCP tool exposure tiers (a gated tool
like post is not callable without opt-in, R-MCP-2).