gstack: Garry Tan Turns Claude Code Into a Full AI Engineering Team

gstack is Y Combinator CEO Garry Tan's open-source toolkit that turns Claude Code into a virtual team of 23 specialists. Slash commands for review, QA, deploy, and parallel sprints.

What gstack Is and Why It Matters

gstack is an open-source set of 23 slash commands for Claude Code, released by Garry Tan, CEO of Y Combinator. It turns a single AI agent into a virtual engineering team: CEO, designer, engineering manager, QA lead, security officer, and release engineer. The project is MIT licensed and crossed 56,000 GitHub stars within days of release.

The motivation is straightforward. Tan cites Andrej Karpathy: "I haven't typed a line of code since December." Peter Steinberger built OpenClaw (247K stars) essentially solo with AI agents. Tan took that concept and wrapped it in a repeatable, structured process.

The claimed numbers: 600,000+ lines of production code in 60 days (35% tests), written part-time while running Y Combinator full-time. In one week: 140,751 lines added across 362 commits. Take those numbers with appropriate skepticism, but the repository is public and auditable.
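
To put the claimed figures in perspective, the short calculation below derives the implied daily and per-commit rates. It uses only the numbers quoted above; the variable names are illustrative, and "35% tests" is assumed here to mean the share of lines that are test code.

```python
# Figures as quoted in the article; nothing here is independently verified.
total_lines = 600_000   # claimed lines of production code
total_days = 60         # claimed time span
test_share = 0.35       # assumption: "35% tests" means share of lines

week_lines = 140_751    # lines added in the cited week
week_commits = 362      # commits in the cited week

lines_per_day = total_lines / total_days
test_lines = round(total_lines * test_share)
week_lines_per_day = week_lines / 7
lines_per_commit = week_lines / week_commits

print(f"60-day average: {lines_per_day:,.0f} lines/day")
print(f"Implied test code: {test_lines:,} lines")
print(f"Cited week: {week_lines_per_day:,.0f} lines/day")
print(f"Cited week: {lines_per_commit:,.1f} lines/commit")
```

That works out to roughly 10,000 lines per day on average, and about 389 lines per commit in the cited week. Rates like these are a reason to audit what the commits contain rather than take the totals at face value.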

How It Works: The Sprint Cycle

gstack is not a random collection of prompts. It follows a structured sprint cycle: Think → Plan → Build → Review → Test → Ship → Reflect.

Each command feeds the next:

  • /office-hours: brainstorming session where the agent challenges project premises, extracts hidden requirements, and writes a design doc
  • /plan-ceo-review: 10-section strategic review of the generated design doc
  • /plan-eng-review: ASCII diagrams for data flow, state machines, test matrix, and error paths
  • /review: automated code review with auto-fixes and flagged items requiring approval
  • /qa: real browser QA via Playwright; finds bugs, fixes them, and generates regression tests
  • /ship: bootstraps test frameworks if missing, runs a coverage audit, creates a PR
  • /retro: retrospective with metrics on lines, commits, and net LOC

The standout is /qa. It's not a generic headless test runner: it opens a real browser, navigates the application, spots visual and functional issues, fixes them, and writes regression tests. Tan calls it "the command that let me go from 6 to 12 parallel workers."
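
The idea that each command feeds the next can be sketched as a simple ordered pipeline. This is a minimal illustration, not gstack's implementation: the mapping of commands to stages is one reading of the list above (the article names no dedicated command for Build), and `SprintState`, `run_step`, and `run_sprint` are hypothetical names.

```python
from dataclasses import dataclass, field

# Stage order follows the article's sprint cycle; the stage-to-command
# mapping is an illustrative assumption, not data taken from gstack.
SPRINT = [
    ("Think", "/office-hours"),
    ("Plan", "/plan-ceo-review"),
    ("Plan", "/plan-eng-review"),
    ("Review", "/review"),
    ("Test", "/qa"),
    ("Ship", "/ship"),
    ("Reflect", "/retro"),
]


@dataclass
class SprintState:
    """Artifacts produced so far, oldest first."""
    artifacts: list = field(default_factory=list)


def run_step(state, stage, command):
    # Placeholder for invoking the command: it only records which earlier
    # artifact the step would read and which artifact it would produce.
    reads = state.artifacts[-1] if state.artifacts else "initial idea"
    state.artifacts.append(f"{command} output")
    return f"{stage:<8} {command:<18} reads: {reads}"


def run_sprint():
    state = SprintState()
    return [run_step(state, stage, command) for stage, command in SPRINT]


for line in run_sprint():
    print(line)
```

The point of the sketch is the data dependency: every step after the first consumes what the previous step wrote, which is why running the commands in sequence matters.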

Parallel Sprints and Advanced Tools

gstack becomes powerful when paired with Conductor, an orchestrator that runs multiple Claude Code sessions in parallel. Tan reports regularly running 10 to 15 concurrent sprints: one on /office-hours, another on /review, a third on /qa, the rest spread across different branches.
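
As a rough mental model of parallel sprints, the sketch below fans several sessions out over a thread pool. It does not use Conductor's API or launch Claude Code; `run_session` is a hypothetical placeholder and the branch names are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_session(branch: str, command: str) -> str:
    # Placeholder: a real setup would start an agent session on this branch.
    return f"{branch}: {command} finished"


# Hypothetical branch names, one sprint per branch.
sessions = [
    ("feature/onboarding", "/office-hours"),
    ("feature/billing", "/review"),
    ("feature/search", "/qa"),
]


def run_parallel(sessions, max_workers=3):
    # Submit every session at once, then collect results in submission order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_session, b, c) for b, c in sessions]
        return [f.result() for f in futures]


results = run_parallel(sessions)
for line in results:
    print(line)
```

Keeping one sprint per branch, as in this sketch, is what lets the sessions proceed without stepping on each other's changes.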

Other notable commands:

  • /design-consultation: builds a design system from scratch, writes DESIGN.md
  • /design-shotgun: generates multiple visual variants for side-by-side comparison
  • /codex: gets an independent second opinion from OpenAI's Codex CLI, with cross-model analysis
  • /careful, /freeze, /guard: safety guardrails (warnings before destructive commands, directory locking)
  • /cso: OWASP + STRIDE security audit
  • $B connect: real browser mode with headed Chrome controlled by Playwright in real time

The stack is TypeScript (71%), Go Template (22%), Shell (5%). Requirements: Claude Code, Git, and Bun. It also works with Codex, Gemini CLI, and Cursor via the SKILL.md standard.
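
The guardrail commands listed above (/careful, /freeze, /guard) can be pictured as two checks: flag destructive shell commands, and restrict writes to one directory. The sketch below is a generic illustration of that idea, not gstack's code; the pattern list, the function names, and the reading of "directory locking" as a write restriction are all assumptions.

```python
import posixpath
import re

# Hypothetical, deliberately short list of patterns worth a warning.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-(rf|fr)\b",
    r"\bgit\s+push\b.*--force",
    r"\bgit\s+reset\s+--hard\b",
    r"\bdrop\s+table\b",
]


def is_destructive(command: str) -> bool:
    """Return True if the command matches any warning pattern."""
    return any(re.search(p, command, re.IGNORECASE) for p in DESTRUCTIVE_PATTERNS)


def is_write_allowed(path: str, locked_to: str) -> bool:
    """Return True if path stays inside the locked directory."""
    # Normalise first so that '..' segments cannot escape the lock.
    target = posixpath.normpath(path)
    root = posixpath.normpath(locked_to)
    return target == root or target.startswith(root + "/")


for cmd in ["git status", "rm -rf build/", "git push origin main --force"]:
    label = "WARN" if is_destructive(cmd) else "ok"
    print(f"{label:<4} {cmd}")
```

A pattern list like this is only a first line of defense; it warns on known-dangerous shapes and will miss anything it has no pattern for.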

Two observations. First: gstack is opinionated. It enforces a specific workflow, which is both its strength and its constraint. Second: claimed productivity numbers measure lines of code, a notoriously misleading metric, but the process structure is solid regardless of the numbers.

FAQ

Does gstack only work with Claude Code? No. It also supports Codex, Gemini CLI, Cursor, and Factory Droid through the SKILL.md standard.

Do I need a paid subscription? gstack itself is free and MIT licensed. However, you need an active Claude Code account, which has its own cost.

Can I use individual commands without adopting the full workflow? Yes, each slash command works independently. But the most value comes from using them in sequence, since each command reads the output of the previous one.
