Status: Building

Jonathan Hawkins / Builder of AI-native products

I turn frontier model capabilities into products people can actually use.

20+ years shipping games. Two-time Game of the Year winner on God of War and Eclipse: Edge of Light. Now solo-founding Aligned Tools at the edge of agentic AI.

01 Featured work

Four case studies.

Projects that show the through-line: take a frontier capability, ship a product around it, fast.

01 Case study

Aligned Tools

Your company's brain: listens to every meeting, remembers what your team decides, files the tickets they confirm.

Integrates with: Jira, Linear, Asana, Monday, Notion, GitHub, Zoom, Teams, Slack, Gmail, Calendar

Aligned is built to get smarter with age. It holds the patterns no one person can (recurring blockers, scattered expertise, decisions that keep resurfacing), so the longer a team uses it, the more their company remembers.

Problem
Engineering managers lose 30 to 60 minutes per meeting turning decisions into tickets, then spend the week reconciling the same task across Jira, Linear, Asana, Notion, and Monday. Action items slip, ownership is ambiguous, and the same decisions get re-litigated three sprints later because nobody remembers.
Constraint
Solo founder, B2B SaaS, in active enterprise sales. Must clear enterprise security review and slot into the customer's existing stack. Switching tools is off the table.
Move

Aligned listens across the surfaces work actually happens on (Zoom, Meet, Teams, Slack, Gmail, Calendar), then runs every transcript through an AI pipeline that extracts decisions, commitments, and ownership into a diff-style review UI where every human edit is captured.

The shape of the system:

  • Company Brain. A pgvector memory layer that reads across meetings, so Aligned can say “you decided this three sprints ago” before the debate starts over.
  • Agent frameworks in production. A CrewAI sprint planner for capacity-aware prioritization, rebuilt on a user-scoped Mem0 memory layer: CrewAI's default memory is process-local with no tenant partitioning, which leaks one customer's retrieval context into the next on shared FastAPI workers. A CI check greps every commit so nobody switches back to the default memory. Alongside it, a streaming email-triage pipeline and computer-use agents (OpenClaw + VNC) that drive full Ubuntu desktops: terminal, files, any GUI app.
  • Real-time voice agent. A LiveKit speech-to-speech agent (xAI Grok primary, OpenAI fallback) exposes ~60 backend tools, with an optional LemonSlice video-avatar integration. A fail-closed, multi-layer HMAC auth chain with nonce replay protection and a heartbeat watchdog (covering 75 of 86 voice routes) means a leaked service token alone can't impersonate a user.
  • Bidirectional ticket sync. Jira, Linear, Asana, Monday, Notion, and GitHub: per-provider mapping for project, status, priority, and issue-type, and a status that only reads “synced” when zero errors fired. Round-trip sync is 5× the work of write-only, but the customer keeps their tool of choice and Aligned becomes the connective tissue underneath.
  • SOC 2 CC7-style controls (not a certification). SSO-only, per-tenant isolation, AES-256-GCM versioned token encryption with key rotation, AWS Secrets Manager for per-tenant webhook secrets, hash-chain audit logs, fail-closed rate limiting, and dual TS+Python environment validators that block startup on missing or malformed config.
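The voice-agent bullet above describes an HMAC chain with nonce replay protection that fails closed. A minimal Python sketch of that idea follows, under stated assumptions: the signed fields, the 30-second freshness window, the in-memory nonce store, and every name here (`sign`, `RequestVerifier`, `MAX_SKEW_SECONDS`) are hypothetical illustrations, not Aligned's actual code. The point it shows is that a request is only accepted when it carries a signature made with a per-user secret, so a service token on its own is not enough.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 30  # assumed freshness window, not taken from the source


def sign(user_secret: bytes, user_id: str, timestamp: int, nonce: str, body: bytes) -> str:
    """Sign the request fields with a per-user secret (hypothetical scheme)."""
    message = b"\n".join([user_id.encode(), str(timestamp).encode(), nonce.encode(), body])
    return hmac.new(user_secret, message, hashlib.sha256).hexdigest()


class RequestVerifier:
    """Accepts a request only if every check passes; any error means deny."""

    def __init__(self, user_secrets: dict) -> None:
        self._user_secrets = user_secrets
        self._seen_nonces = set()  # in-memory for the sketch; a real store would expire entries

    def verify(self, user_id, timestamp, nonce, body, signature, now=None) -> bool:
        try:
            now = int(time.time()) if now is None else now
            secret = self._user_secrets[user_id]  # unknown user raises KeyError -> deny
            if abs(now - timestamp) > MAX_SKEW_SECONDS:
                return False  # stale or future-dated request
            if (user_id, nonce) in self._seen_nonces:
                return False  # replay of an already accepted request
            expected = sign(secret, user_id, timestamp, nonce, body)
            if not hmac.compare_digest(expected, signature):
                return False  # signature was not made with this user's secret
            # Record the nonce only after the signature checks out,
            # so unauthenticated callers cannot burn nonces.
            self._seen_nonces.add((user_id, nonce))
            return True
        except Exception:
            return False  # fail closed: malformed input is treated as a denial
```

Usage, again with made-up values: `RequestVerifier({"u1": b"secret"}).verify("u1", ts, "n1", body, sign(b"secret", "u1", ts, "n1", body))` accepts the first call and rejects an identical second call as a replay.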
Outcome
MSA and two SOWs in negotiation with a name-brand AAA game studio after passing CIO-level security review on the first pass. In live tenants, meetings become human-confirmed Jira issues (name-to-ID resolution, undo, rollback) that trace back to the transcript that produced them, and tasks sync both ways across six providers. The WebSpatial build of Aligned won Best Technical Excellence at TechWeek 2026.
Claude API, Multi-provider sync, Bidirectional OAuth, MCP server, Next.js, Postgres, B2B SaaS
02 Case study

VibeView

Spatial multi-agent orchestration for Claude Code

Problem
Run five Claude Code agents in parallel and the terminal turns to soup: output interleaves, work in progress is invisible until something breaks, and there's no shared view of who is doing what against the plan.
Constraint
28 hours, solo, at the SenseAI Hackademy. Cross-platform spatial UI from scratch.
Move
VibeView turns your editor into a room. Each Claude Code agent gets a floating glass window with live tmux output and parsed model/token/cost telemetry; a shared spatial kanban above them is fed by Claude Code's own ~/.claude/tasks/ files. Hold-to-talk voice routes commands to any agent, and when one transitions idle a Python bridge auto-summarizes its terminal with GPT-5-mini and speaks the summary back through a per-agent ElevenLabs voice. The hard call was WebSpatial. It ships to Apple Vision Pro, desktop browsers, and PICO 4 Ultra from one bundle, but the docs are sparse, so I had to reverse-engineer the renderer from runtime behavior.
Outcome
Shipped end-to-end in 28 hours, demoed live driving a 5-agent swarm against a real codebase. Won the WebSpatial track at SenseAI Hackademy. Open-sourced post-event.
WebSpatial, Claude Code SDK, Voice agent, Multi-agent, visionOS
03 Case study

SkillVault Desktop

Making Claude Code's invisible config visible

Problem
Claude Code's power lives in env vars, hooks, slash commands, and skills, but they're scattered across files, undocumented in-product, and impossible to compare across projects. New users hit a wall, and power users rebuild the same scaffolding for every project.
Constraint
Consumer-facing desktop app, must feel native on macOS, ship a marketplace from day one.
Move
SkillVault Desktop is a single pane that reads your Claude Code config across every project, surfaces what is actually wired up, and lets you install vetted skills/hooks/commands from a marketplace with one click. The part I sweated most was the install diff: when you install a skill, you see exactly which files change, with rollback. I built it on Tauri so it stays a 12 MB binary instead of a 200 MB Electron app, and the marketplace is just signed manifests in a public repo so anyone can publish without a backend.
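The install diff with rollback can be pictured as three steps: plan which files would change, apply them while snapshotting what was there, and restore the snapshot on rollback. The sketch below is in Python for brevity (the app itself is described as Tauri/Rust), and the function names and file layout are hypothetical, not SkillVault's actual implementation.

```python
from pathlib import Path
from typing import Dict, Optional


def plan_install(root: Path, skill_files: Dict[str, str]) -> Dict[str, str]:
    """Return {relative_path: "add" | "modify"} for files the install would change."""
    plan = {}
    for rel, content in skill_files.items():
        target = root / rel
        if not target.exists():
            plan[rel] = "add"
        elif target.read_text() != content:
            plan[rel] = "modify"
        # identical files are left out: the diff only shows real changes
    return plan


def apply_install(root: Path, skill_files: Dict[str, str],
                  plan: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Write the planned files and return the prior contents for rollback."""
    snapshot: Dict[str, Optional[str]] = {}
    for rel in plan:
        target = root / rel
        snapshot[rel] = target.read_text() if target.exists() else None
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(skill_files[rel])
    return snapshot


def rollback(root: Path, snapshot: Dict[str, Optional[str]]) -> None:
    """Restore modified files and delete files the install added."""
    for rel, previous in snapshot.items():
        target = root / rel
        if previous is None:
            target.unlink(missing_ok=True)  # file did not exist before the install
        else:
            target.write_text(previous)
```

Showing `plan_install` output before calling `apply_install` is what makes the change reviewable. This sketch leaves behind any directories it created and assumes text files; a fuller version would handle both.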
Outcome
Public beta launched with 40+ community-contributed skills in the first week. Average user activates 6 skills in their first session.
Tauri, Rust, React, Claude Code, Marketplace
04 Case study

Glassbox

A glass cockpit for a coding agent swarm, with scores it can't fake.

Problem
Agent swarms have two problems: you can't watch them work, and you can't trust what they tell you. Parallel output interleaves into noise, and when an agent claims its code is correct, usually nothing is checking.
Constraint
WeaveHacks 4: one weekend, solo, empty repo. One rule: no theater. The agents write the code themselves, every step shows on the board as it happens, and the score comes from building and running the thing. Nothing gated, nothing hardcoded.
Move

Glassbox puts a fixed crew (planner, coordinator, four workers, validator, improver) on a live tldraw board fed by a Redis event stream, so every decompose, handoff, and verify lands on screen the moment it happens. The same crew runs two ways: as live Claude Code or Codex sessions you supervise from the command center, or headless against a graded benchmark.

The shape of the system:

  • One engine, eight loop shapes. The engine never changes (decompose, dispatch, verify). A loop is that engine plus a stop condition. Four of the eight: Land stops when the goal verifies, Climb when a metric plateaus, Sweep when a backlog drains, Race when a judge picks a winner.
  • No fake progress. Workers author each edit with W&B Inference against the validator's failing cases and keep a change only if the built artifact scores higher. Agent Mail carries the handoffs, Beads tracks the dependency graph, and nothing advances on a timer.
  • It improves itself. The benchmark ports a BPE tokenizer to Rust, scored by an exact token-ID diff against tiktoken. Between versions, the improver reads the failing cases back from Weave and rewrites the planner skill, so accuracy climbs without anyone touching the swarm code.
  • Point it at your own repo. A task is just a goal, a workspace, and a checkable evaluator. Give Glassbox a repo and a test command and it clones into a disposable sandbox, finds the failing tests, and fixes them. Your source is never touched.
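The "engine plus a stop condition" idea from the list above can be sketched in a few lines of Python. This is an illustration under assumptions, not Glassbox's code: `run_loop`, `LoopState`, the `propose`/`evaluate` callables, the plateau `patience`, and the `max_rounds` safety cap are all hypothetical. Only the loop names Land and Climb and the keep-only-if-it-scores-higher rule come from the description.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class LoopState:
    artifact: Any = None
    best_score: float = float("-inf")
    history: List[float] = field(default_factory=list)  # best score after each round


StopCondition = Callable[[LoopState], bool]


def land(state: LoopState) -> bool:
    """Land: stop once the goal verifies (assumed here to mean a score of 1.0)."""
    return state.best_score >= 1.0


def climb(patience: int = 2) -> StopCondition:
    """Climb: stop when the best score has not improved for `patience` rounds."""
    def stop(state: LoopState) -> bool:
        h = state.history
        return len(h) > patience and h[-1] <= h[-1 - patience]
    return stop


def run_loop(propose: Callable[[Any], Any], evaluate: Callable[[Any], float],
             stop: StopCondition, max_rounds: int = 50) -> LoopState:
    """Same engine for every loop shape; only `stop` differs."""
    state = LoopState()
    for _ in range(max_rounds):  # safety cap for the sketch only
        candidate = propose(state.artifact)  # decompose + dispatch: author an edit
        score = evaluate(candidate)          # verify: build and score the artifact
        if score > state.best_score:         # keep a change only if it scores higher
            state.artifact, state.best_score = candidate, score
        state.history.append(state.best_score)
        if stop(state):
            break
    return state
```

With a toy `propose` that increments a counter and an `evaluate` of `min(n / 4, 1.0)`, `run_loop(propose, evaluate, land)` stops after four rounds; swapping in `climb()` changes when the loop ends without touching `run_loop`.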
Outcome
Built in a weekend from an empty repo. The graded Climb took the Rust tokenizer from 0.17 to a perfect 1.00 token-ID match against tiktoken, then a separate Python library task from 0.52 to 1.00, with zero swarm code changed between the two. On the live side, a Sweep drained a four-file backlog in about eight minutes and tore itself down, and a Climb cut tokenizer latency from 269 to 141 ms.
W&B Weave, Multi-agent, Claude Code, MCP, tldraw, Next.js, Rust

03 About

A little context.

20+ years in games. Started as an intern at Sony Santa Monica and rose to Lead Level Designer over a decade, shipping God of War 1, 2, and 3 along the way. Was one of four hand-picked leads on a new unannounced AAA IP, managing 15+ designers.

In 2014 I founded White Elk Studios and recruited a small team of God of War franchise veterans to build Eclipse: Edge of Light, a VR adventure that won three Mobile VR Game of the Year awards (UploadVR, Daydream District, VR Fest 2018) and shipped to 8 platforms including PS VR, Oculus Quest, Steam, and Nintendo Switch.

In 2011 I founded GameDevDrinkUp, a monthly mixer that grew virally from LA to 20+ cities worldwide, sponsored by Twitch.

Now solo-founding Aligned Tools. I take frontier model capabilities the week they ship and turn them into products non-engineers can use the same week.