0010 — auth refresh (token rotation, expiry alerting)

Status: IN PROGRESS. Phase 1 was built and unit-tested on 2026-06-21: the Refresher seam, per-platform refreshers, pluggable WriteBack + Notifier backends, and keryx auth refresh [--dry-run]. Phase 2 (the gitlab write-back backend + scheduled pipeline) is pending CI wiring.

Date: 2026-06-21

Depends on: 0001 §4.2 (the infra wrinkle); 0002 §5 (R-AUTH-3/4); the four shipped publishers (Instagram, YouTube, TikTok, LinkedIn), whose token handling this generalises. Now that all four token shapes exist, auth refresh can cover every case.

1. Goal & scope

Keep unattended posting alive. A keryx auth refresh [platform|all] command + per-platform refresh logic that, run on a schedule well inside each token's window, refreshes/rotates tokens where possible and alerts where it can't — because a silently stale token is the #1 unattended failure mode (0001 §4.2).

In scope: the refresh seam + per-platform refreshers, token expiry metadata, a writable token store that also works in CI, alerting, the auth refresh command (+ --dry-run status), and the scheduling contract. Out of scope: changing the posting flow; new platforms.

2. The four shapes → what refresh does per platform

| Platform | Token shape | Refresh action | Failure / wall |
|---|---|---|---|
| Instagram | long-lived ~60d access | refresh-in-place (ig_refresh_token) when aging; store new 60d token | alert if refresh fails / token >60d (dead) |
| YouTube | durable refresh token | health-check: mint an access token; nothing to persist | alert if refresh token revoked/invalid |
| TikTok | rotating refresh token | rotate: exchange → new access + new refresh; persist the new refresh immediately | alert if refresh fails (old refresh dead) |
| LinkedIn | 60d access, no refresh (standard) | none possible → compute days-to-expiry; if MDP, refresh like YouTube/TikTok | alert: re-auth required by DATE |

So refresh is not one algorithm — it's a per-platform strategy behind a common seam, plus a uniform expiry/alert layer.

3. Architecture — the Refresher seam

Mirror the publish.Publisher registry with an auth.Refresher each platform implements (in its existing internal/publish/<platform> package, which already owns the token logic) and self-registers:

type Status struct {
    Platform      string
    Action        string    // "rotated" | "refreshed" | "healthy" | "reauth-required" | "skipped"
    ExpiresAt     time.Time // access/usable token expiry (zero if unknown)
    ReauthBy      time.Time // when interactive re-auth is needed (non-refreshable)
    Alert         bool      // surfaced to the alert channel + non-zero exit
    Detail        string
}

type Refresher interface {
    Name() string
    Refresh(ctx context.Context, dryRun bool) (Status, error)
}

auth refresh all fans out across enabled platforms (mirrors post all): runs each independently, accumulates Status, reports per-platform, and exits non-zero if any has Alert (so CI surfaces it). --dry-run computes + reports status (valid-until, refreshable?, action it would take) and changes nothing.
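The fan-out described above could look roughly like the following. This is a minimal sketch, not the shipped implementation: the package-level `registry`, `Register`, `RefreshAll`, the `stub` type, and `runDemo` are all hypothetical names, and it assumes that an error from one platform should be folded into an alerting Status so the other platforms still run.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
	"time"
)

// Status and Refresher repeat the definitions above so the sketch compiles on its own.
type Status struct {
	Platform  string
	Action    string
	ExpiresAt time.Time
	ReauthBy  time.Time
	Alert     bool
	Detail    string
}

type Refresher interface {
	Name() string
	Refresh(ctx context.Context, dryRun bool) (Status, error)
}

// registry is a hypothetical package-level map; adapters would call Register from init().
var registry = map[string]Refresher{}

func Register(r Refresher) { registry[r.Name()] = r }

// RefreshAll runs every registered refresher independently. An error from one
// platform becomes an alerting Status, so the remaining platforms still run.
func RefreshAll(ctx context.Context, dryRun bool) ([]Status, bool) {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic report order

	statuses := make([]Status, 0, len(names))
	anyAlert := false
	for _, name := range names {
		st, err := registry[name].Refresh(ctx, dryRun)
		st.Platform = name
		if err != nil {
			st.Alert = true
			st.Detail = err.Error()
		}
		anyAlert = anyAlert || st.Alert
		statuses = append(statuses, st)
	}
	return statuses, anyAlert
}

// stub is a canned Refresher used only for this demonstration.
type stub struct {
	name string
	st   Status
	err  error
}

func (s stub) Name() string                                  { return s.name }
func (s stub) Refresh(context.Context, bool) (Status, error) { return s.st, s.err }

// runDemo registers two stubs and fans out in dry-run mode.
func runDemo() ([]Status, bool) {
	Register(stub{name: "youtube", st: Status{Action: "healthy"}})
	Register(stub{name: "tiktok", err: errors.New("refresh token rejected")})
	return RefreshAll(context.Background(), true)
}

func main() {
	statuses, alert := runDemo()
	for _, st := range statuses {
		fmt.Printf("%-8s action=%q alert=%v %s\n", st.Platform, st.Action, st.Alert, st.Detail)
	}
	if alert {
		fmt.Println("exit status would be 1") // a real command would call os.Exit(1)
	}
}
```

Sorting the platform names keeps the per-platform report stable between runs, which makes CI logs easier to compare.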

4. Token metadata (expiry tracking)

Refresh/alert decisions need each token's expiry, so the adapters must persist it (some already do). Uniform per-platform config keys:

  • platforms.<p>.access_token_expires_at — RFC3339 (LinkedIn already; add to Instagram from expires_in).
  • platforms.<p>.refresh_token_expires_at — where the refresh token itself expires (TikTok refresh_expires_in ≈ 365d; LinkedIn-MDP).
  • YouTube needs none (durable). These are non-secret → config (not keychain).

Backfill: auth <platform> writes the relevant expiry at capture; auth refresh updates it on each successful refresh. A token captured before this lands has no expiry → treated as "unknown", refreshed proactively (idempotent) or flagged.
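As an illustration of the expiry bookkeeping, the helpers below turn an OAuth expires_in value into the RFC3339 string stored under access_token_expires_at, and read it back with an explicit "unknown" result for tokens captured before the key existed. The function names (`ExpiryFromExpiresIn`, `ParseExpiry`, `DaysLeft`, `demoExpiry`) are assumptions made for this sketch, as is the choice to store timestamps in UTC.

```go
package main

import (
	"fmt"
	"time"
)

// ExpiryFromExpiresIn converts an OAuth expires_in (seconds from now) into an
// absolute RFC3339 timestamp in UTC, suitable for a config value.
func ExpiryFromExpiresIn(now time.Time, expiresIn int64) string {
	return now.Add(time.Duration(expiresIn) * time.Second).UTC().Format(time.RFC3339)
}

// ParseExpiry reads a stored expiry. An empty value means the token predates
// expiry tracking: known is false and the caller treats the expiry as unknown.
func ParseExpiry(stored string) (expiresAt time.Time, known bool, err error) {
	if stored == "" {
		return time.Time{}, false, nil
	}
	expiresAt, err = time.Parse(time.RFC3339, stored)
	if err != nil {
		return time.Time{}, false, err
	}
	return expiresAt, true, nil
}

// DaysLeft returns whole days until expiry (negative once the token is expired).
func DaysLeft(now, expiresAt time.Time) int {
	return int(expiresAt.Sub(now) / (24 * time.Hour))
}

// demoExpiry captures a 60-day token on the spec date and reads it back.
func demoExpiry() (string, int) {
	now := time.Date(2026, time.June, 21, 0, 0, 0, 0, time.UTC)
	stored := ExpiryFromExpiresIn(now, 60*24*60*60)
	expiresAt, _, _ := ParseExpiry(stored)
	return stored, DaysLeft(now, expiresAt)
}

func main() {
	stored, days := demoExpiry()
	fmt.Println(stored, days) // prints "2026-08-20T00:00:00Z 60"
}
```

Returning a separate known flag, rather than relying on the zero time alone, keeps the "unknown" case distinct from a parse failure.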

5. Writable store in CI (pluggable write-back backend)

Locally, refreshed tokens persist via the existing oauth.Store (keychain → config). In CI there's no keychain and the job can't natively write back to the GitLab CI variables it read from — yet TikTok rotation must persist or the next run is locked out.

This is a pluggable backend, config-selected — same philosophy as keryx's generation providers (0001 §3.4, providers.{image,voice,music,render}): a narrow interface, the implementation chosen from user config at construction, adding one is purely additive (implement + register a constructor, no call-site changes). So each owning project picks the write-back that suits its infra.

type WriteBack interface { Save(ctx context.Context, key, secret string) error }

Config: auth.writeback.backend: local | gitlab | <future> (+ backend sub-config). Seeded backends:

  • local (default) — oauth.Store (keychain/config). For CLI use.
  • gitlab: PUT /projects/:id/variables/:key via the GitLab API with a project access token (api scope), updating the same masked variables the adapters read (TIKTOK_REFRESH_TOKEN, …). For the scheduled job.
  • (additive, no core change) — an external secret manager (Vault / cloud), or any backend a user contributes; registered behind the same interface.

Phase 1 ships the interface + factory + local (testable now). The gitlab backend is Phase 2 (can't be fully tested until CI is wired) but designed now.
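One way the interface + factory could fit together is sketched below. The `Constructor` type, `RegisterWriteBack`, `NewWriteBack`, and the in-memory `memoryStore` (standing in for oauth.Store, whose API is not shown in this spec) are hypothetical; the sub-config is assumed to be a simple string map.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
)

type WriteBack interface {
	Save(ctx context.Context, key, secret string) error
}

// Constructor builds a backend from its sub-config (assumed to be a string map here).
type Constructor func(cfg map[string]string) (WriteBack, error)

var constructors = map[string]Constructor{}

// RegisterWriteBack makes a backend selectable by name; adding one needs no call-site changes.
func RegisterWriteBack(name string, c Constructor) { constructors[name] = c }

// NewWriteBack resolves auth.writeback.backend. An empty name selects "local".
func NewWriteBack(name string, cfg map[string]string) (WriteBack, error) {
	if name == "" {
		name = "local"
	}
	c, ok := constructors[name]
	if !ok {
		known := make([]string, 0, len(constructors))
		for n := range constructors {
			known = append(known, n)
		}
		sort.Strings(known)
		return nil, fmt.Errorf("unknown auth.writeback.backend %q (registered: %v)", name, known)
	}
	return c(cfg)
}

// memoryStore is an in-memory stand-in for the real local store.
type memoryStore struct {
	mu      sync.Mutex
	secrets map[string]string
}

func (m *memoryStore) Save(_ context.Context, key, secret string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.secrets[key] = secret
	return nil
}

func init() {
	RegisterWriteBack("local", func(map[string]string) (WriteBack, error) {
		return &memoryStore{secrets: map[string]string{}}, nil
	})
}

func main() {
	wb, err := NewWriteBack("", nil)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if err := wb.Save(context.Background(), "TIKTOK_REFRESH_TOKEN", "example-value"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("saved via local backend")

	_, err = NewWriteBack("vault", nil) // not registered in this sketch
	fmt.Println(err)
}
```

A gitlab backend would register itself the same way, so selecting it stays a config change rather than a code change.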

6. Alerting (R-AUTH-4) — pluggable notifier backend

Any Alert status (refresh failed, or a non-refreshable token nearing expiry) must reach a human, since the scheduled job is unattended. Like write-back (§5), the notifier is a config-selected pluggable backend — a narrow interface, chosen from config, additive:

type Notifier interface { Notify(ctx context.Context, msg string) error }
  • Non-zero exit is the always-on baseline (the CI pipeline goes red) regardless of notifier.
  • Config: auth.alerts.backend: none | webhook | <future> (+ sub-config). Seeded backends:
  • none (default) — rely on the red pipeline only.
  • webhook — POST a short message ("LinkedIn token expires in 9 days — run keryx auth linkedin") to a Slack/Teams incoming webhook (auth.alerts.webhook.url / ALERT_WEBHOOK_URL).
  • (additive) — email/SMTP, a GTB notification primitive, PagerDuty, etc., contributed behind the same interface. Reuse a GTB notifier if one fits.
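A webhook notifier behind this interface might be as small as the sketch below. It assumes the receiving endpoint accepts a JSON body of the form {"text": "..."} (the shape Slack incoming webhooks take; other services may need a different payload). The `Doer` interface, `WebhookNotifier`, `captureDoer`, and `demoNotify` are hypothetical names, and the URL is a placeholder.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

type Notifier interface {
	Notify(ctx context.Context, msg string) error
}

// Doer is the narrow HTTP seam; *http.Client satisfies it, and so does a fake.
type Doer interface {
	Do(req *http.Request) (*http.Response, error)
}

// WebhookNotifier posts a short JSON message to a configured URL.
type WebhookNotifier struct {
	URL    string
	Client Doer
}

var _ Notifier = WebhookNotifier{} // compile-time interface check

func (w WebhookNotifier) Notify(ctx context.Context, msg string) error {
	payload, err := json.Marshal(map[string]string{"text": msg}) // assumed payload shape
	if err != nil {
		return err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, w.URL, bytes.NewReader(payload))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := w.Client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("webhook returned %s", resp.Status)
	}
	return nil
}

// captureDoer is a test double: it records the request body and answers with a canned status.
type captureDoer struct {
	status int
	body   string
}

func (c *captureDoer) Do(req *http.Request) (*http.Response, error) {
	b, err := io.ReadAll(req.Body)
	if err != nil {
		return nil, err
	}
	c.body = string(b)
	return &http.Response{
		StatusCode: c.status,
		Status:     fmt.Sprintf("%d %s", c.status, http.StatusText(c.status)),
		Body:       http.NoBody,
	}, nil
}

// demoNotify sends one alert through the fake and returns what was posted.
func demoNotify(status int) (string, error) {
	fake := &captureDoer{status: status}
	n := WebhookNotifier{URL: "https://hooks.example.invalid/placeholder", Client: fake}
	err := n.Notify(context.Background(), "LinkedIn token expires in 9 days")
	return fake.body, err
}

func main() {
	body, err := demoNotify(200)
	fmt.Println(body, err)
	_, err = demoNotify(500)
	fmt.Println(err)
}
```

Because the non-zero exit is the always-on baseline, a failed Notify call can be logged without changing the exit status the alert already forces.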

7. Command surface

keryx auth refresh [<platform>|all] [--dry-run]
  • auth refresh all — every enabled platform (the CI entrypoint).
  • auth refresh tiktok — one platform.
  • --dry-run — report token health + the action that would be taken; no writes.
  • Scaffolded with gtb generate command --name refresh --parent auth (manifest-tracked). It is side-effecting (writes tokens), so it is gated off the MCP surface like post/approve/auth (setup.ExcludeFromMCP, build-time).

8. Scheduling (owning project)

keryx is stateless — the schedule lives in the blog repo's CI (0001 §3.2). A scheduled pipeline runs keryx auth refresh all with the write-back backend set to gitlab. Cadence: weekly is comfortable — well inside IG/LinkedIn's 60-day windows and TikTok's 365-day refresh wall, keeps IG fresh, rotates TikTok, and gives ~7 weeks of LinkedIn re-auth warning before expiry. Documented as a how-to; keryx ships no schedule.

9. Contracts honoured

R-AUTH-3 (persist the rotated/refreshed token immediately — TikTok), R-AUTH-4 (refresh failure / impending expiry alerts), and the post-side resilience (R-POST-*) that a fresh token underpins.

10. Testing

TDD: per-platform Refresh against a faked httpDoer (refresh success → token persisted; rotation persisted; failure → Alert; near-expiry compute → reauth-required). A faked WriteBack asserts the rotated secret is saved (no live keychain). A faked Notifier asserts alerts fire. --dry-run writes nothing. The GitLab write-back backend is env-gated integration-tested (Phase 2). A godog scenario covers keryx auth refresh all --dry-run reporting each platform's status.
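The faked WriteBack and Notifier mentioned above can be very small. The sketch below shows the three core assertions (rotation persisted, failure alerts, dry-run writes nothing) against a toy `rotate` function. Everything here is illustrative: `rotate`, `exchangeFunc`, the fakes, and `runChecks` are hypothetical, the real refreshers are not shown in this spec, and a real suite would use the testing package rather than a main function.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

type WriteBack interface {
	Save(ctx context.Context, key, secret string) error
}

type Notifier interface {
	Notify(ctx context.Context, msg string) error
}

// fakeWriteBack records saves in memory, so no live keychain is touched.
type fakeWriteBack struct{ saved map[string]string }

func (f *fakeWriteBack) Save(_ context.Context, key, secret string) error {
	f.saved[key] = secret
	return nil
}

// fakeNotifier records every alert message it is asked to send.
type fakeNotifier struct{ msgs []string }

func (f *fakeNotifier) Notify(_ context.Context, msg string) error {
	f.msgs = append(f.msgs, msg)
	return nil
}

// exchangeFunc stands in for the token-endpoint call made through the faked httpDoer.
type exchangeFunc func(ctx context.Context) (newRefresh string, err error)

const key = "TIKTOK_REFRESH_TOKEN"

// rotate is a toy rotating-token refresher. It reports whether an alert was raised.
func rotate(ctx context.Context, exchange exchangeFunc, wb WriteBack, n Notifier, dryRun bool) bool {
	if dryRun {
		return false // report only: no exchange, no write
	}
	token, err := exchange(ctx)
	if err != nil {
		_ = n.Notify(ctx, "tiktok refresh failed: "+err.Error())
		return true
	}
	if err := wb.Save(ctx, key, token); err != nil {
		_ = n.Notify(ctx, "tiktok write-back failed: "+err.Error())
		return true
	}
	return false
}

func runChecks() error {
	ctx := context.Background()
	ok := func(context.Context) (string, error) { return "new-refresh", nil }
	bad := func(context.Context) (string, error) { return "", errors.New("invalid_grant") }

	// Success: the rotated secret is saved and no alert fires.
	wb, n := &fakeWriteBack{saved: map[string]string{}}, &fakeNotifier{}
	if rotate(ctx, ok, wb, n, false) || wb.saved[key] != "new-refresh" || len(n.msgs) != 0 {
		return errors.New("success case: rotation not persisted cleanly")
	}
	// Failure: exactly one alert, nothing written.
	wb, n = &fakeWriteBack{saved: map[string]string{}}, &fakeNotifier{}
	if !rotate(ctx, bad, wb, n, false) || len(n.msgs) != 1 || len(wb.saved) != 0 {
		return errors.New("failure case: expected one alert and no write")
	}
	// Dry run: nothing written, nothing sent.
	wb, n = &fakeWriteBack{saved: map[string]string{}}, &fakeNotifier{}
	if rotate(ctx, ok, wb, n, true) || len(wb.saved) != 0 || len(n.msgs) != 0 {
		return errors.New("dry-run case: expected no writes")
	}
	return nil
}

func main() {
	if err := runChecks(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```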

11. Questions

Resolved in review (2026-06-21):

  1. Write-back is a pluggable, config-selected backend (WriteBack interface; auth.writeback.backend) — not a single hardcoded mechanism — so each project chooses (local / gitlab / secret-manager / contributed). Mirrors the generation-provider pattern. See §5. (gitlab impl is Phase 2.)
  2. Alerting is a pluggable, config-selected backend (Notifier interface; auth.alerts.backend) — none / webhook / future — plus the always-on non-zero exit. See §6.

Recommended defaults (confirm or adjust):

  1. Refresh trigger. Rotate TikTok every run (keeps it alive cheaply); IG refresh when >24h old; YouTube health-check every run; LinkedIn alert when <14 days to expiry. (Per-platform thresholds in the refreshers.)
  2. Command placement. auth refresh [platform|all] subcommand (discoverable under auth).
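The recommended triggers in item 1 amount to a small decision function. The sketch below encodes them under a few assumptions: `Plan` and `demoPlans` are hypothetical names, the action labels are illustrative planning labels rather than the Status.Action values, and an Instagram token already past expiry is treated as needing re-auth. Unknown expiry is not handled here.

```go
package main

import (
	"fmt"
	"time"
)

const (
	instagramMinAge = 24 * time.Hour      // IG refresh when the token is >24h old
	linkedInWarning = 14 * 24 * time.Hour // LinkedIn alert when <14 days remain
)

// Plan returns the action a run would take for a platform and whether it alerts.
// tokenAge is the time since capture or last refresh; untilExpiry is the time left.
func Plan(platform string, tokenAge, untilExpiry time.Duration) (action string, alert bool) {
	switch platform {
	case "tiktok":
		return "rotate", false // every run
	case "youtube":
		return "health-check", false // every run
	case "instagram":
		if untilExpiry <= 0 {
			return "reauth-required", true // past the 60d wall: cannot refresh
		}
		if tokenAge > instagramMinAge {
			return "refresh", false
		}
		return "skip", false
	case "linkedin":
		if untilExpiry < linkedInWarning {
			return "reauth-required", true
		}
		return "healthy", false
	default:
		return "skip", false
	}
}

// demoPlans evaluates a few fixed cases and returns the planned actions in order.
func demoPlans() []string {
	day := 24 * time.Hour
	cases := []struct {
		platform string
		age      time.Duration
		left     time.Duration
	}{
		{"instagram", 12 * time.Hour, 59 * day},
		{"instagram", 7 * day, 53 * day},
		{"linkedin", 51 * day, 9 * day},
		{"linkedin", 7 * day, 53 * day},
		{"tiktok", 7 * day, 358 * day},
	}
	out := make([]string, 0, len(cases))
	for _, c := range cases {
		action, _ := Plan(c.platform, c.age, c.left)
		out = append(out, action)
	}
	return out
}

func main() {
	fmt.Println(demoPlans()) // prints "[skip refresh reauth-required healthy rotate]"
}
```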

12. Phased plan

  1. (review this spec; resolve §11) →
  2. Phase 1 (testable now): the auth.Refresher seam + registry; per-platform Refresh (TikTok rotate, YouTube health-check, IG refresh-in-place, LinkedIn expiry-alert); expiry metadata + IG backfill; local write-back; the Notifier seam; keryx auth refresh [--dry-run]; MCP-gate it. Validate live against the existing TikTok/YouTube/IG tokens.
  3. Phase 2 (when wiring CI): the gitlab write-back backend; the scheduled pipeline how-to in the blog repo; live unattended dry-run then real run.