audit-design-token-drift

A published Claude Skill for the Thios design system
Catches the design-system bug that compounds silently — a hex value diverging between the spec, the bridge, the code, and the demo. Empirically benchmarked. See the bench.
How it works at a glance
[Diagram: DESIGN.md, tokens.json, main.css, and design-system.html feed the audit-design-token-drift Skill, which writes DESIGN_TOKEN_AUDIT_YYYY-MM.md (dated, file:line citations, per-finding fix).]

Four canonical surfaces in. One dated drift report out.

The process in four steps
  1. extract
     Pull token values from each surface: DESIGN.md hex codes, tokens.json $values, main.css :root, design-system.html :root.

  2. diff
     Four comparisons in leverage order: intra-DESIGN.md → DESIGN.md vs tokens.json → tokens.json vs main.css → main.css vs design-system.html.

  3. verify final values (bench-driven addition)
     For every token DESIGN.md ascribes a hex to, resolve its tokens.json reference chain to a final hex and assert equality. This catches drift hidden inside primitive→semantic indirection, the class the four-diff pass alone missed.

  4. report
     Write a dated audit at _agents/DESIGN_TOKEN_AUDIT_YYYY-MM.md with severity-tagged findings, file:line citations, one concrete fix per finding, and Loop-step routing for each.

Benchmarked across 12 pre-registered trials

Tested against a one-line prompt, a vague baseline, and a lean variant of itself. Honest result published: the lean variant (with rhetorical sections removed) outperformed the full skill on recall, including catching a value-drift finding that the full version caught in 0 of 3 trials. Empirical commitment, not slogan.

Full methodology, per-trial outputs, scoring rubric → thios.co/design-system-bench.html

Notes

Follows the addyosmani/agent-skills anatomy. A Thios-specific instantiation of a general pattern — rule-layer ↔ implementation-layer drift detection — that other vanilla design systems could adapt.

Intentionally specific to four surfaces. If Thios adds a Figma Variables export or an npm package per Figma's “Bring your design system” workflow, those become canonical surfaces 5 and 6 — update this skill before they go live.

For agents & deep readers
Full skill source — the SKILL.md that Claude actually reads

When to use

Run this skill when:

  • A change to any color, spacing, typography, or shadow token is about to ship
  • tokens.json $metadata.cssSyncRequired has any entries
  • A new sphere brand color is introduced
  • Weekly, on a schedule (cron / make audit-tokens), as drift-prevention infrastructure
  • Before publishing a design-system blog post or case study — you cannot claim “0 drift in production” without re-running this

Do NOT use for: component-level visual review (use audit-component-consistency), CSS minification verification (covered by make check-css-sync), WCAG contrast audits (use a dedicated a11y skill).

The four canonical surfaces

| Surface | Path | Role |
| --- | --- | --- |
| DESIGN.md | /DESIGN.md | Human-readable spec; the rules layer |
| tokens.json | /lib/styles/tokens.json | W3C format; the bridge |
| main.css | /lib/styles/main.css | Running production code |
| design-system.html | /design-system.html | Live demo page |

If a fifth canonical surface is added (e.g. tokens.figma.json exported from Tokens Studio), update this skill before adding it to the audit.

Extract token values from each surface

  • DESIGN.md: grep for hex codes (#[0-9A-Fa-f]{3,6}) in table rows and prose. Tag each finding with its section number so cross-section drift inside the same document is caught.
  • tokens.json: parse JSON. For each $value, resolve {primitive.x.y.z} references to a final hex. Build a flat name → final hex map.
  • main.css: extract the :root { ... } block. Capture every --token-name: value; declaration. Ignore @media-scoped overrides — those are intentional surface variations, not drift.
  • design-system.html: extract the :root { ... } block from its inline <style>. Same shape as main.css extraction.
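The extraction rules above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names are hypothetical, the hex regex is the one quoted above, and the CSS handling is regex-based, so it assumes a :root block with no comments or nested braces.

```python
import re

# Same pattern the DESIGN.md bullet quotes; other names here are illustrative.
HEX_RE = re.compile(r"#[0-9A-Fa-f]{3,6}")
ROOT_RE = re.compile(r":root\s*\{([^}]*)\}")
DECL_RE = re.compile(r"(--[\w-]+)\s*:\s*([^;]+)")


def extract_markdown_hexes(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, current section heading, hex) for each hex code."""
    section = ""
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if re.match(r"#{1,6}\s", line):  # Markdown heading: remember the section
            section = line.lstrip("#").strip()
        for match in HEX_RE.finditer(line):
            found.append((lineno, section, match.group(0).lower()))
    return found


def strip_media_blocks(css: str) -> str:
    """Drop @media { ... } blocks so scoped overrides are not read as drift."""
    out = []
    i = 0
    while i < len(css):
        if css.startswith("@media", i):
            j = css.find("{", i)
            if j == -1:
                break  # malformed rule: drop the remainder
            depth = 0
            while j < len(css):
                if css[j] == "{":
                    depth += 1
                elif css[j] == "}":
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            i = j + 1
        else:
            out.append(css[i])
            i += 1
    return "".join(out)


def extract_root_tokens(css_or_html: str) -> dict[str, str]:
    """Map each --token-name declared in a top-level :root block to its value."""
    tokens = {}
    for block in ROOT_RE.findall(strip_media_blocks(css_or_html)):
        for name, value in DECL_RE.findall(block):
            tokens[name] = value.strip().lower()  # lowercase for comparison only
    return tokens
```

Because `ROOT_RE` searches the whole input, the same `extract_root_tokens` call can be pointed at design-system.html as long as its :root block sits in an inline `<style>` element.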

The four diffs (leverage-ordered)

  1. DESIGN.md ↔ DESIGN.md (intra-document): does Section 2 agree with Section 9 and any prose mentions? Drift here is the worst — the rules layer cannot disagree with itself.
  2. DESIGN.md ↔ tokens.json: every named token in DESIGN.md must exist in tokens.json with the same value. Sphere brand colors are the most common drift here.
  3. tokens.json ↔ main.css: every primitive→semantic chain in tokens.json must resolve to the same hex declared in main.css. $metadata.cssSyncRequired should be empty after this passes.
  4. main.css ↔ design-system.html: the live demo page must declare a superset of main.css tokens. Missing tokens here mean components rendered on design-system.html may silently fall back to defaults.
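Diffs 2 through 4 all reduce to comparing two flat name → value maps. A minimal sketch, assuming each surface has already been flattened to such a map with matching token names; `diff_token_maps` and the finding labels ("missing", "mismatch", "orphan") are illustrative names, not part of the skill:

```python
from typing import Optional

# (kind, token name, value on the left surface, value on the right surface)
Finding = tuple[str, str, Optional[str], Optional[str]]


def diff_token_maps(
    left: dict[str, str],
    right: dict[str, str],
    allow_extra: bool = False,
) -> list[Finding]:
    """Compare two surfaces; `left` is the more authoritative one.

    allow_extra=True models diff 4, where the right-hand surface only
    needs to declare a superset of the left-hand tokens.
    """
    findings: list[Finding] = []
    for name in sorted(left):
        if name not in right:
            findings.append(("missing", name, left[name], None))
        elif left[name].lower() != right[name].lower():
            findings.append(("mismatch", name, left[name], right[name]))
    if not allow_extra:
        # Tokens only the right-hand surface knows about.
        for name in sorted(set(right) - set(left)):
            findings.append(("orphan", name, None, right[name]))
    return findings
```

Running the same function down the chain (DESIGN.md against tokens.json, tokens.json against main.css, main.css against design-system.html with `allow_extra=True`) keeps the leverage order explicit. The intra-document diff would need a different shape, since one token name can carry several values from different sections.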

Step 3.5 — final-value verification

In practice, the four diffs confirm that each token exists by name but can miss value drift hidden inside primitive→semantic indirection. For every token DESIGN.md ascribes a hex value to:

  1. Look up the same name in tokens.json
  2. Resolve its {primitive.x.y.z} reference chain to a final hex
  3. Assert that final hex equals the value DESIGN.md states

Any mismatch is High-severity drift. Example (caught by audit-skill-bench as HIGH-1): DESIGN.md:79 says Auxosphere is #909090; tokens.json:192 says auxosphere → {primitive.color.gray.500} which resolves to #6c757d. Diff 2 passes; the value silently drifts. This step was added 2026-05-06 in response to the bench finding.
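The three steps can be sketched in Python using the Auxosphere example. The hex values and the `{primitive.color.gray.500}` reference come from the example above; the nesting of the sample tokens, the `semantic.sphere.auxosphere` path, and the function names are assumptions made for illustration.

```python
import re

REF_RE = re.compile(r"^\{([^{}]+)\}$")  # matches a whole-value alias like {a.b.c}


def resolve(tokens: dict, path: str, seen=None) -> str:
    """Follow a dotted token path through alias references to a final value."""
    seen = seen or set()
    if path in seen:
        raise ValueError(f"circular reference at {path}")
    seen.add(path)
    node = tokens
    for key in path.split("."):
        node = node[key]
    value = node["$value"]
    match = REF_RE.match(value) if isinstance(value, str) else None
    if match:
        return resolve(tokens, match.group(1), seen)
    return str(value).lower()


def verify_final_values(spec_hexes: dict[str, str], tokens: dict):
    """Return (path, hex stated in the spec, resolved hex) for each mismatch."""
    drift = []
    for path, stated in sorted(spec_hexes.items()):
        final = resolve(tokens, path)
        if final != stated.lower():
            drift.append((path, stated.lower(), final))
    return drift


# Hypothetical tokens.json shape; only the values mirror the example above.
tokens = {
    "primitive": {"color": {"gray": {"500": {"$value": "#6c757d"}}}},
    "semantic": {"sphere": {"auxosphere": {"$value": "{primitive.color.gray.500}"}}},
}
drift = verify_final_values({"semantic.sphere.auxosphere": "#909090"}, tokens)
# drift == [("semantic.sphere.auxosphere", "#909090", "#6c757d")]
```

A name-only check would pass here because the token exists on both surfaces; only resolving the alias to its final hex surfaces the mismatch.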

Output template

# Design Token Audit — YYYY-MM

## Summary
- Total drift findings: N
- Critical / High / Medium / Low: N each

## Findings
For each finding:
- Severity
- Location: file:line ↔ file:line
- Token name
- Values observed (with surface labels)
- Recommended fix (align value | add missing token | remove orphan)
- Loop step (per design-system-loop.html:
  1 Observe, 2 Design, 3 Tokenize, 4 Build, 5 Ship, 6 Document)

## Statistics
- Token counts per surface
- Coverage percentages

## Auditor / Date
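As a rough illustration of filling the Summary and Findings sections of this template from structured data, the sketch below uses a hypothetical `Finding` dataclass and `render_summary_and_findings` helper. The field names follow the template's bullets; the title, Statistics, and Auditor sections are left to the caller, and the per-severity summary line is one possible reading of "N each".

```python
from collections import Counter
from dataclasses import dataclass

SEVERITIES = ("Critical", "High", "Medium", "Low")


@dataclass
class Finding:
    severity: str            # one of SEVERITIES
    location: str            # "file:line ↔ file:line"
    token: str
    values: dict[str, str]   # surface label -> value observed there
    fix: str                 # align value | add missing token | remove orphan
    loop_step: str           # e.g. "3 Tokenize"


def render_summary_and_findings(findings: list[Finding]) -> str:
    """Render the Summary and Findings sections, most severe findings first."""
    counts = Counter(f.severity for f in findings)
    lines = ["## Summary", f"- Total drift findings: {len(findings)}"]
    lines.append("- " + " / ".join(f"{s}: {counts[s]}" for s in SEVERITIES))
    lines += ["", "## Findings"]
    order = {s: i for i, s in enumerate(SEVERITIES)}
    for f in sorted(findings, key=lambda item: order[item.severity]):
        observed = ", ".join(f"{surface} {value}" for surface, value in f.values.items())
        lines += [
            "",
            f"- Severity: {f.severity}",
            f"- Location: {f.location}",
            f"- Token name: {f.token}",
            f"- Values observed: {observed}",
            f"- Recommended fix: {f.fix}",
            f"- Loop step: {f.loop_step}",
        ]
    return "\n".join(lines) + "\n"
```

A severity outside `SEVERITIES` raises a `KeyError` during sorting, which is a deliberate choice in this sketch so that mistyped severities fail loudly instead of being silently miscounted.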

Commit, link, verify

  • Commit the audit file under _agents/
  • Link it from any PR that touches tokens or main.css
  • If any Critical findings exist: the change must not ship until they're resolved or explicitly accepted in writing

Verification checklist before considering the skill run complete:

  • Audit file exists at _agents/DESIGN_TOKEN_AUDIT_YYYY-MM.md
  • Every finding has both a file:line citation and a one-line fix
  • tokens.json $metadata.cssSyncRequired reflects current state
  • If Critical findings: commit/PR/issue referenced in the audit
  • Previous month's audit compared — net drift count should trend down or be explained
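Two of these checks (the audit file exists, and the drift count is compared with the previous month) lend themselves to automation. A minimal sketch, assuming the audit files follow the output template closely enough to contain a "- Total drift findings: N" line; the function names and the returned status strings are illustrative only.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional

TOTAL_RE = re.compile(r"^- Total drift findings:\s*(\d+)", re.MULTILINE)


def audit_path(root: Path, month: date) -> Path:
    """Path of the audit file for the given month under _agents/."""
    return root / "_agents" / f"DESIGN_TOKEN_AUDIT_{month:%Y-%m}.md"


def previous_month(month: date) -> date:
    """First day of the month before `month`."""
    if month.month == 1:
        return date(month.year - 1, 12, 1)
    return date(month.year, month.month - 1, 1)


def total_findings(path: Path) -> Optional[int]:
    """Read the total from an audit file, or None if it cannot be found."""
    if not path.exists():
        return None
    match = TOTAL_RE.search(path.read_text(encoding="utf-8"))
    return int(match.group(1)) if match else None


def check_trend(root: Path, month: date) -> str:
    """Return 'missing', 'no-baseline', 'ok', or 'explain' for this month."""
    current = total_findings(audit_path(root, month))
    previous = total_findings(audit_path(root, previous_month(month)))
    if current is None:
        return "missing"
    if previous is None:
        return "no-baseline"
    return "ok" if current <= previous else "explain"
```

An "explain" result is not a failure by itself; it only signals that the net drift count rose and the audit should say why.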
Scope reminder. Findings should be limited to drift across the four canonical surfaces. Configurator-surface concerns, dark-mode palette gaps, accessibility audits, component spec critique, and broader architectural recommendations are out of scope — flag in a separate audit.