From Copilot to Colleague · Workshop
Workshop in a box · ~90 minutes · 4–12 people
The AI-Native Team Workshop
A facilitated session for an engineering leadership team to locate itself honestly on the path to an AI-native operating model — and leave with the one structural decision worth making next. Print this page (it formats cleanly) and run it as-is.
00 · Warm-up (10 min)
Go around the room. Each person finishes the sentence: “We use AI, but we are not yet AI-native, because…” Capture the answers where everyone can see them. The patterns in this list are your real agenda.
01 · The five dimensions (10 min)
AI-native is a destination, not a toggle. Progress shows up across five dimensions. The facilitator reads each aloud; the team is only orienting here, not scoring yet.
- 01 Delegation Depth. How far the organization has moved from inline suggestion to genuinely delegated execution.
- 02 Eval Maturity. Whether output quality is governed by measurement loops rather than spot-checks and vibes.
- 03 Context Infrastructure. Whether retrieval, memory, and institutional knowledge are treated as first-class infrastructure.
- 04 Runtime Governance. Durable runtimes, scoped authority, audit trails, and a human control plane tuned to task stakes.
- 05 Org Design. Whether the operating model itself (review, alignment, ownership) has been redesigned around cheap execution.
02 · Dimension by dimension (~10 min each)
For each dimension: a quick gut-check (early / developing / advanced), then the three prompts. The goal is not consensus on a number — it is surfacing the one honest gap.
01
Delegation Depth
- 1. Name one workflow where we still assume work moves at human speed. What would it take to close that gap?
- 2. Where do we use AI as inline autocomplete when we could be delegating a whole unit of work?
- 3. Who outside engineering could start real work with agents if we let them?
Source: Ch 1 — The Shift: From Assistant to Delegate
02
Eval Maturity
- 1. What do our dashboards actually count: activity (PRs, lines) or outcomes (rework, unreverted ships)?
- 2. If an agent silently got worse this month, how would we find out, and how long would it take?
- 3. Which one high-stakes output deserves an automated eval on the path to production first?
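If the team struggles to picture what "finding out" would look like for prompt 2, a minimal sketch may help. It assumes a fixed eval set is rerun on a schedule and yields a pass rate per run; the function name `regressed`, the 0.05 tolerance, and all the scores are hypothetical and exist only to illustrate a baseline comparison.

```python
from statistics import mean


def regressed(baseline_scores, current_scores, tolerance=0.05):
    """Return True when the current mean score falls more than
    `tolerance` below the baseline mean."""
    return mean(current_scores) < mean(baseline_scores) - tolerance


# Hypothetical pass rates (0.0 to 1.0) from the same eval set, run weekly.
last_month = [0.92, 0.90, 0.91, 0.93]
this_month = [0.88, 0.84, 0.83, 0.81]

if regressed(last_month, this_month):
    print("Eval scores dropped beyond tolerance; investigate before shipping.")
```

A check like this only answers "how long would it take" if it runs automatically; the useful discussion is where in the delivery path it would sit and who gets paged.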
Source: Ch 4 — Evals Are the Control System
03
Context Infrastructure
- 1. What institutional knowledge still lives only in senior people’s heads?
- 2. Are we improving retrieval quality, or just giving agents more raw data?
- 3. Could we trace what context produced a given agent output if we had to?
Source: Ch 5 — Context Is Infrastructure
04
Runtime Governance
- 1. Do our agents run with scoped, time-limited credentials, or with standing human-inherited access?
- 2. For our riskiest agent action, where exactly does a human re-enter, and is that calibrated to the stakes?
- 3. What is the lightest governance that would earn us the trust to move this pilot to production?
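To make prompt 1 concrete, here is one way a scoped, time-limited credential could be modeled. This is a sketch under stated assumptions, not a reference to any real identity system: `AgentCredential`, `issue`, the scope strings, and the 30-minute default lifetime are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AgentCredential:
    agent_id: str
    scopes: frozenset      # the only actions this credential permits
    expires_at: datetime   # after this moment, nothing is permitted

    def allows(self, action, now=None):
        """Permit an action only if it is in scope and the credential is unexpired."""
        now = now or datetime.now(timezone.utc)
        return now < self.expires_at and action in self.scopes


def issue(agent_id, scopes, ttl_minutes=30, now=None):
    """Issue a credential limited to `scopes` that expires after `ttl_minutes`."""
    now = now or datetime.now(timezone.utc)
    return AgentCredential(agent_id, frozenset(scopes),
                           now + timedelta(minutes=ttl_minutes))


# Hypothetical usage: an agent that may read a repo but not write to it.
cred = issue("docs-agent", {"repo:read"})
print(cred.allows("repo:read"), cred.allows("repo:write"))
```

The contrast with standing human-inherited access is that both the scope set and the expiry are decided at issue time, per task, rather than inherited from whoever launched the agent.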
Source: Ch 6–7 — Runtimes & High-Stakes Trust
05
Org Design
- 1. Is review capacity resourced as the throughput limit of the org, or treated as a QA afterthought?
- 2. Where do two people’s agent work first meet: in a shared plan, or at the merge?
- 3. Who owns the judgment layer: deciding which of the many cheap artifacts is worth shipping?
Source: Ch 9 — The AI-Native Organization
03 · The one decision (15 min)
Phase transitions are governance events, not capability events — an org moves up a level because someone made a structural decision, not because the models got better. Pick the single weakest dimension and name one structural change you will own this quarter: a path hardened, a convention written down, a review system built, a credential scope designed. Assign it. That is the workshop’s only deliverable.
Want a per-person profile to anchor the discussion? Have everyone take the 10-minute assessment before the session.