0010: auth refresh (token rotation, expiry alerting)
Status: IN PROGRESS (Phase 1 built & unit-tested 2026-06-21: the Refresher seam, per-platform refreshers, pluggable WriteBack + Notifier backends, and keryx auth refresh [--dry-run]. Phase 2 — the gitlab write-back backend + scheduled pipeline — pending CI wiring.)
Date: 2026-06-21
Depends on: 0001 §4.2 (the infra wrinkle); 0002 §5 (R-AUTH-3/4); the four shipped
publishers (Instagram, YouTube, TikTok, LinkedIn) — this generalises their token
handling. Now that all four token shapes exist, auth refresh can cover every case.
1. Goal & scope
Keep unattended posting alive. A keryx auth refresh [platform|all] command +
per-platform refresh logic that, run on a schedule well inside each token's
window, refreshes/rotates tokens where possible and alerts where it can't —
because a silently stale token is the #1 unattended failure mode (0001 §4.2).
In scope: the refresh seam + per-platform refreshers, token expiry metadata,
a writable token store that also works in CI, alerting, the
auth refresh command (+ --dry-run status), and the scheduling contract. Out of
scope: changing the posting flow; new platforms.
2. The four shapes → what refresh does per platform
| Platform | Token shape | Refresh action | Failure / wall |
|---|---|---|---|
| Instagram | long-lived ~60d access | refresh-in-place (ig_refresh_token) when aging; store new 60d token | alert if refresh fails / token >60d (dead) |
| YouTube | durable refresh token | health-check: mint an access token; nothing to persist | alert if refresh token revoked/invalid |
| TikTok | rotating refresh token | rotate: exchange → new access + new refresh; persist the new refresh immediately | alert if refresh fails (old refresh dead) |
| LinkedIn | 60d access, no refresh (standard) | none possible → compute days-to-expiry; if MDP, refresh like YouTube/TikTok | alert: re-auth required by DATE |
So refresh is not one algorithm — it's a per-platform strategy behind a common seam, plus a uniform expiry/alert layer.
3. Architecture: the Refresher seam
Mirror the publish.Publisher registry with an auth.Refresher each platform
implements (in its existing internal/publish/<platform> package, which already
owns the token logic) and self-registers:
    // Status is the per-platform outcome of one refresh run.
    type Status struct {
        Platform  string
        Action    string    // "rotated" | "refreshed" | "healthy" | "reauth-required" | "skipped"
        ExpiresAt time.Time // access/usable token expiry (zero if unknown)
        ReauthBy  time.Time // when interactive re-auth is needed (non-refreshable)
        Alert     bool      // surfaced to the alert channel + non-zero exit
        Detail    string
    }

    // Refresher is implemented by each platform package and self-registered.
    type Refresher interface {
        Name() string
        Refresh(ctx context.Context, dryRun bool) (Status, error)
    }
auth refresh all fans out across enabled platforms (mirrors post all):
runs each independently, accumulates Status, reports per-platform, and exits
non-zero if any has Alert (so CI surfaces it). --dry-run computes + reports
status (valid-until, refreshable?, action it would take) and changes nothing.
4. Token metadata (expiry tracking)
Refresh/alert decisions need each token's expiry, so the adapters must persist it (some already do). Uniform per-platform config keys:
- platforms.<p>.access_token_expires_at: RFC3339 (LinkedIn already; add to Instagram from expires_in).
- platforms.<p>.refresh_token_expires_at: where the refresh token itself expires (TikTok refresh_expires_in ≈ 365d; LinkedIn-MDP).
- YouTube needs none (durable).

These are non-secret → config (not keychain).
Backfill: auth <platform> writes the relevant expiry at capture; auth refresh
updates it on each successful refresh. A token captured before this lands has no
expiry → treated as "unknown", refreshed proactively (idempotent) or flagged.
5. Writable store in CI (pluggable write-back backend)
Locally, refreshed tokens persist via the existing oauth.Store (keychain →
config). In CI there's no keychain and the job can't natively write back to the
GitLab CI variables it read from — yet TikTok rotation must persist or the
next run is locked out.
This is a pluggable backend, config-selected — same philosophy as keryx's
generation providers (0001 §3.4, providers.{image,voice,music,render}): a narrow
interface, the implementation chosen from user config at construction, adding one
is purely additive (implement + register a constructor, no call-site changes). So
each owning project picks the write-back that suits its infra.
Config: auth.writeback.backend: local | gitlab | <future> (+ backend sub-config).
Seeded backends:
- local (default): oauth.Store (keychain/config). For CLI use.
- gitlab: PUT /projects/:id/variables/:key via the GitLab API with a project access token (api scope), updating the same masked variables the adapters read (TIKTOK_REFRESH_TOKEN, …). For the scheduled job.
- (additive, no core change): an external secret manager (Vault / cloud), or any backend a user contributes; registered behind the same interface.
Phase 1 ships the interface + factory + local (testable now). The gitlab
backend is Phase 2 (can't be fully tested until CI is wired) but designed now.
6. Alerting (R-AUTH-4): pluggable notifier backend
Any Alert status (refresh failed, or a non-refreshable token nearing expiry)
must reach a human, since the scheduled job is unattended. Like write-back (§5),
the notifier is a config-selected pluggable backend — a narrow interface,
chosen from config, additive:
- Non-zero exit is the always-on baseline (the CI pipeline goes red) regardless of notifier.
- Config: auth.alerts.backend: none | webhook | <future> (+ sub-config). Seeded backends:
  - none (default): rely on the red pipeline only.
  - webhook: POST a short message ("LinkedIn token expires in 9 days, run keryx auth linkedin") to a Slack/Teams incoming webhook (auth.alerts.webhook.url / ALERT_WEBHOOK_URL).
  - (additive): email/SMTP, a GTB notification primitive, PagerDuty, etc., contributed behind the same interface. Reuse a GTB notifier if one fits.
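A webhook notifier along these lines would cover the seeded backend. It is a sketch under stated assumptions: the payload is the {"text": "..."} shape that Slack incoming webhooks accept (a Teams endpoint may expect a different body), and the function names are hypothetical. The httpDoer seam mirrors the one the tests fake.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
)

// webhookPayload encodes the alert as {"text": "..."}.
func webhookPayload(message string) ([]byte, error) {
	return json.Marshal(struct {
		Text string `json:"text"`
	}{Text: message})
}

// httpDoer is the narrow client seam; *http.Client satisfies it.
type httpDoer interface {
	Do(req *http.Request) (*http.Response, error)
}

// notifyWebhook POSTs the alert to endpoint and treats any non-2xx
// reply as a failure so the caller can still fall back to the
// non-zero exit code.
func notifyWebhook(ctx context.Context, client httpDoer, endpoint, message string) error {
	body, err := webhookPayload(message)
	if err != nil {
		return err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("alert webhook returned HTTP %d", resp.StatusCode)
	}
	return nil
}
```

A notifier failure should not mask the original alert: the process still exits non-zero, so the red pipeline remains the baseline even when the webhook is unreachable.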
7. Command surface
- auth refresh all: every enabled platform (the CI entrypoint).
- auth refresh tiktok: one platform.
- --dry-run: report token health + the action that would be taken; no writes.
- Scaffolded with gtb generate command --name refresh --parent auth (manifest-tracked). It is side-effecting (writes tokens) → gated off the MCP surface like post/approve/auth (setup.ExcludeFromMCP, build-time).
8. Scheduling (owning project)
keryx is stateless — the schedule lives in the blog repo's CI (0001 §3.2). A
scheduled pipeline runs keryx auth refresh all with the write-back backend set
to gitlab. Cadence: weekly is comfortable — well inside IG/LinkedIn's 60-day
windows and TikTok's 365-day refresh wall, keeps IG fresh, rotates TikTok, and
gives ~7 weeks of LinkedIn re-auth warning before expiry. Documented as a how-to;
keryx ships no schedule.
9. Contracts honoured
R-AUTH-3 (persist the rotated/refreshed token immediately — TikTok), R-AUTH-4
(refresh failure / impending expiry alerts), and the post-side resilience
(R-POST-*) that a fresh token underpins.
10. Testing
TDD: per-platform Refresh against a faked httpDoer (refresh success → token
persisted; rotation persisted; failure → Alert; near-expiry compute →
reauth-required). A faked WriteBack asserts the rotated secret is saved (no
live keychain). A faked Notifier asserts alerts fire. --dry-run writes
nothing. The GitLab write-back backend is env-gated integration-tested (Phase 2).
A godog scenario covers keryx auth refresh all --dry-run reporting each
platform's status.
11. Questions
Resolved in review (2026-06-21):
- ✅ Write-back is a pluggable, config-selected backend (WriteBack interface; auth.writeback.backend), not a single hardcoded mechanism, so each project chooses (local / gitlab / secret-manager / contributed). Mirrors the generation-provider pattern. See §5. (gitlab impl is Phase 2.)
- ✅ Alerting is a pluggable, config-selected backend (Notifier interface; auth.alerts.backend): none / webhook / future, plus the always-on non-zero exit. See §6.
Recommended defaults (confirm or adjust):
- Refresh trigger. Rotate TikTok every run (keeps it alive cheaply); IG refresh when >24h old; YouTube health-check every run; LinkedIn alert when <14 days to expiry. (Per-platform thresholds in the refreshers.)
- Command placement. auth refresh [platform|all] subcommand (discoverable under auth).
12. Phased plan
- (review this spec; resolve §11) →
- Phase 1 (testable now): the auth.Refresher seam + registry; per-platform Refresh (TikTok rotate, YouTube health-check, IG refresh-in-place, LinkedIn expiry-alert); expiry metadata + IG backfill; local write-back; the Notifier seam; keryx auth refresh [--dry-run]; MCP-gate it. Validate live against the existing TikTok/YouTube/IG tokens.
- Phase 2 (when wiring CI): the gitlab write-back backend; the scheduled pipeline how-to in the blog repo; live unattended dry-run then real run.