Buenos Aires AI Safety Hub

Agentic Coding Workshop

Build projects with AI as if you had a full team: BMAD methodology, agents, processes, and tools ready to use.

Friday, October 3, 2025 · 4:00 PM · Rooms 1109 & 1110 — BAISH x Y-hat

This page brings together every material and resource from the workshop.

Join the "Agentic Coding" WhatsApp group inside the BAISH community to discuss ideas, share projects, and get help with BMAD.

Context

The problem with traditional LLM workflows

Two obstacles make "give the model everything" fail: long-context degradation and the absence of real memory over time.

Long-context degradation

The LoCoDiff benchmark (January 2025) shows that even the best available model, Sonnet 4.5, degrades sharply with long contexts.

What happens: accuracy drops from 96% with 2K–21K-token contexts to 64% beyond 60K tokens. When you hand the whole codebase to the model, it drowns in noise.
  • State-of-the-art models still rely on manageable context windows to keep high accuracy.
  • Feeding entire files or repositories produces noise, repetition, and inconsistent decisions.

Source: LoCoDiff Benchmark

No long-term memory

A METR study (July 2025) found that experienced developers working on their own projects were about 20% slower with AI than without it (the study's headline figure is 19%).

Why? The LLM restarts from scratch every time: it lacks the tacit context that humans accumulate. Forecasts pointed the other way, with expert predictions of up to 39% faster delivery.
  • Observed time per story: 1.67 h without AI vs. 2.26 h with AI assistance.
  • Forecasts predicted 24%–39% faster delivery, exposing the gap between predictions and reality.

Source: METR AI R&D Study (July 2025)

BMAD method

The solution: an agentic team that plans and executes

BMAD is an AI development method where eight specialized agents work like a real team. They plan the whole project before coding and then execute story by story with continuous validation.

Each agent is the same LLM with a specialized prompt and verified, sharded context. No magic memory—just structured documentation within reach.

Sharding

Large documents are split into focused shards. Instead of receiving a 10,000-token PRD, the Developer gets a 400–600-token shard with only what matters for the story.

Result: every agent operates in the 96% accuracy zone (<5K tokens) without extra noise.
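To make sharding concrete, here is a minimal Python sketch. It is not BMAD's actual shard-doc implementation. It assumes the document is Markdown, that each level-2 heading (`## `) starts a new shard, and that roughly four characters make one token. The names `estimate_tokens` and `shard_markdown` are hypothetical.

```python
import re


def estimate_tokens(text: str) -> int:
    """Rough estimate: assumes ~4 characters per token (a heuristic, not a tokenizer)."""
    return max(1, len(text) // 4)


def shard_markdown(document: str) -> dict[str, str]:
    """Split a Markdown document into shards, one per level-2 heading.

    Non-empty text before the first '## ' heading is stored under 'preamble'.
    Shards with the same heading overwrite each other (kept simple on purpose).
    """
    shards: dict[str, str] = {}
    title = "preamble"
    lines: list[str] = []

    def flush() -> None:
        body = "\n".join(lines).strip()
        if body:
            shards[title] = body

    for line in document.splitlines():
        match = re.match(r"^## +(.*\S)\s*$", line)
        if match:
            flush()                      # close the previous shard
            title = match.group(1)
            lines = [line]               # the heading stays inside its shard
        else:
            lines.append(line)
    flush()
    return shards


if __name__ == "__main__":
    prd = "# PRD\nintro\n## Auth\nlogin flow\n## Dashboard\ncharts"
    for name, body in shard_markdown(prd).items():
        print(name, estimate_tokens(body))
```

Each shard can then be saved as its own Markdown file and passed to an agent on its own, instead of the full document.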

Specialized agents

Eight agents with concrete roles: Product Manager, Architect, Developer, QA, and others. Each has tailored instructions and checklists and queries only the context it needs.

Result: surgically precise context for every decision without overloading the model.

Structured documentation

Rigorous planning before coding: complete PRD, defined architecture, and sequential stories that validate each other. That creates the "memory" LLMs don't have.

Result: agents work with professional, verifiable information always aligned with the project goal.

Why BMAD works

1. Optimized context

In the METR study, developers worked with huge codebases in Cursor and ended up about 20% slower. BMAD does the opposite: the PM handles about 2K tokens of requirements, the Architect about 3K tokens, and the Developer a single epic (about 1.5K tokens). High accuracy, no degradation.
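One way to stay inside these budgets is to estimate context size before invoking an agent. The sketch below is illustrative only. The budgets are the approximate figures quoted above, the four-characters-per-token rule is a rough assumption rather than a real tokenizer, and `ROLE_BUDGETS` and `over_budget` are hypothetical names that are not part of BMAD.

```python
# Approximate per-role context budgets quoted in the text, in tokens.
ROLE_BUDGETS = {"pm": 2000, "architect": 3000, "dev": 1500}


def estimate_tokens(text: str) -> int:
    """Rough estimate: assumes ~4 characters per token."""
    return max(1, len(text) // 4)


def over_budget(role: str, context: str) -> int:
    """Return by how many estimated tokens the context exceeds the role's budget (0 if it fits)."""
    return max(0, estimate_tokens(context) - ROLE_BUDGETS[role])


if __name__ == "__main__":
    story_context = "x" * 8000           # stands in for roughly 2,000 tokens of text
    print(over_budget("dev", story_context))
```

A non-zero result is a hint to shard further or pass fewer files.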

2. Validation and transparency

The Product Owner validates every story draft before development, QA reviews the Developer's work, and humans approve each milestone. Artifacts stay human-readable and versioned in the repo.

3. Advanced elicitation

Agents ask questions and iterate on documentation. It's not just generating code—they discover, clarify, and update requirements while keeping the person in the loop.

Agentic team

The eight BMAD agents

Each role is the same model with a specialized prompt and sharded context. The outcome: a multidisciplinary team that works like a human crew.

Business Analyst

Insightful Analyst & Strategic Ideation Partner

/analyst *create-project-brief

Market research, brainstorming, competitive analysis, project briefs, and brownfield documentation.

Artifacts

  • Project brief
  • Market research
  • Competitor analysis
  • Brainstorming output

Product Manager

Investigative Product Strategist & Market-Savvy PM

/pm *create-prd

Creates PRDs, defines product strategy, prioritizes features, and keeps stakeholders aligned. Mandatory step in the method.

Artifacts

  • PRD (Product Requirements Document)
  • Brownfield PRD

Architect

Holistic System Architect & Full-Stack Technical Leader

/architect *create-full-stack-architecture

System design, technical architecture, tech selection, API contracts, and infrastructure plans. Mandatory step in the method.

Artifacts

  • Full-stack architecture
  • Backend architecture
  • Frontend architecture
  • Brownfield architecture

UX Expert

User Experience Designer & UI Specialist

/ux-expert *create-front-end-spec

UI/UX design, wireframes, prototypes, front-end specifications, and ready-to-use prompts for v0 or Lovable.

Artifacts

  • Front-end spec
  • UI design prompts

Product Owner

Technical Product Owner & Process Steward

/po *shard-doc

Manages the backlog, refines stories, defines acceptance criteria, and keeps artifacts coherent. Critical: sharding prevents model degradation.

Artifacts

  • Sharded documents
  • Epic files
  • Story validations

Scrum Master

Technical Scrum Master & Story Preparation Specialist

/sm *draft

Prepares clear stories, manages epics, runs retros, and keeps the agile cadence. Produces drafts the Developer can implement without friction.

Artifacts

  • User story drafts
  • Sequential tasks
  • Acceptance criteria

Developer

Expert Senior Software Engineer & Implementation Specialist

/dev *develop-story

Implements production code, covers tests, refactors, and documents decisions. Works story by story with test coverage.

Artifacts

  • Production code
  • Unit tests
  • Integration tests
  • E2E tests
  • Code documentation

QA

Test Architect with Quality Advisory Authority

/qa *review

Defines testing strategies, profiles risks, ensures requirement traceability, and issues the final quality decision.

Artifacts

  • QA results & gate decisions
  • Risk profiles
  • Test plans & design
  • Requirements tracing
  • NFR assessments

Operational flow

How the commands work

Agents are invoked from the terminal or IDE with short commands. You always pass context via @filename.md.

Command syntax

Pick your tool and follow the same structure: command + action + relevant files.

Claude Code / OpenCode

/pm *create-prd here's some rough notes @notes.md
/architect *create-full-stack-architecture @prd.md
/dev *develop-story @docs/stories/1.1.md

Cursor / Windsurf

@PM Create PRD, here's some notes @notes.md
@architect Design the system architecture, here's the @prd.md
@dev *develop @docs/stories/1.1.md

Switch agents without noise

Each agent runs in its own context. To swap cleanly:

  1. Clear the context with /clear (Claude Code) or /new (OpenCode).
  2. Invoke the agent with its init command (e.g., /po *shard-doc).
  3. Pass only the relevant files using @filename.md.

Tip: Keep your documents in docs/ and use clear filenames (e.g., docs/epics/epic-2-dashboard.md) so you shard context precisely.
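The shared structure (command + action + relevant files) can be sketched as a small helper that composes the Claude Code / OpenCode form of a command. `build_command` is a hypothetical name for illustration. BMAD does not ship it, and you still type or paste the resulting line into your tool.

```python
def build_command(agent: str, action: str, *files: str, note: str = "") -> str:
    """Compose '/agent *action [note] @file ...' in the Claude Code / OpenCode style."""
    parts = [f"/{agent}", f"*{action}"]
    if note:
        parts.append(note)                          # optional free-text instruction
    parts.extend(f"@{path}" for path in files)      # context is always passed as @file
    return " ".join(parts)


if __name__ == "__main__":
    print(build_command("dev", "develop-story", "docs/stories/1.1.md"))
    # prints: /dev *develop-story @docs/stories/1.1.md
```

Keeping the file list explicit makes it easy to see exactly which context an agent receives.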

Phase 1

Planning (one-time upfront)

Complete design before writing code. Everything runs from the terminal with BMAD commands.

1. PM: PRD Creation

The Product Manager creates the Product Requirements Document, guiding the user section by section.

  • Interactive process: the PM asks questions and records answers on the fly.
  • Defines features and epics (only story titles at this stage).
  • Captures functional and non-functional requirements.
  • Prioritizes MVP vs. roadmap and logs key dependencies.
  • Each section is reviewed and approved before moving on.

/pm *create-prd

Important note: Stories in the PRD stay brief (titles only). Later the Scrum Master expands them with detailed tasks.

2. Architect: System Design

The Architect reads the PRD and designs the full technical architecture, always in an interactive flow.

  • Chooses tech stack for frontend, backend, and database.
  • Defines folder structure and code organization.
  • Designs APIs and contracts between components.
  • Covers scalability, security, observability, and performance.
  • Each section is validated with the user before advancing.

/architect *create-full-stack-architecture

3. PO: Master Checklist

The Product Owner ensures PRD and architecture are perfectly aligned before sharding.

  • Checks coherence between requirements and technical design.
  • Confirms the architecture covers every prioritized feature.
  • Identifies gaps, contradictions, or risks that must be resolved first.

/po *execute-checklist-po

If something doesn’t align: Documents keep iterating until everything makes sense. No moving forward without full validation.

4. PO: Sharding

The Product Owner shards the PRD and architecture into manageable epics and stories (<2K tokens).

  • Runs the terminal program that splits the documents automatically.
  • Produces small shards inside docs/epics and docs/stories.
  • Each file holds focused context and is saved as versionable Markdown.
  • Example: a large PRD becomes docs/epics/epic-1-auth.md, docs/epics/epic-2-dashboard.md, etc.

/po *shard-doc docs/prd.md
/po *shard-doc docs/architecture.md

Outcome: Prioritized backlog with sharded stories ready for iterative development.
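The file naming in the example above can be sketched in a few lines. This only illustrates the `docs/epics/epic-N-name.md` pattern. It is an assumption that names are derived from epic titles this way, and `epic_path` is a hypothetical helper, not the logic used by shard-doc.

```python
import re
from pathlib import PurePosixPath


def epic_path(number: int, title: str, root: str = "docs/epics") -> str:
    """Build a path like docs/epics/epic-1-auth.md from an epic number and title."""
    # Lowercase the title and collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return str(PurePosixPath(root) / f"epic-{number}-{slug}.md")


if __name__ == "__main__":
    print(epic_path(1, "Auth"))        # prints: docs/epics/epic-1-auth.md
    print(epic_path(2, "Dashboard"))   # prints: docs/epics/epic-2-dashboard.md
```

Predictable names make it straightforward to pass exactly one epic to an agent with @docs/epics/....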

Optional agents when needed

  • Business Analyst: greenfield research and initial project brief.
  • UX Expert: interface specs and prompts for tools like v0 or Lovable.
  • QA: profiles risks and quality criteria before development starts.

When planning ends you have:

  • ✓Full PRD, reviewed and approved.
  • ✓Architecture defined and aligned with the PRD.
  • ✓Sharded, prioritized story backlog.
  • ✓Structured, versioned context for the entire team.

You haven’t written a single line of code yet. That intentional approach prevents weeks of refactors later on.

Phase 2

Iterative development loop

Story-by-story implementation with built-in validation. Repeat this loop until the backlog is done.

1. SM: Review previous notes

The Scrum Master reviews notes from the last story to start with accumulated learning.

  • Identifies what worked and what didn’t.
  • Recovers important technical decisions.
  • Reviews feedback from Developer and QA.
  • Documents takeaways for the next story.

/sm *review-notes @docs/stories/1.0.md

Learning loop: Today’s notes feed tomorrow’s draft. The team improves story after story.

2. SM: Draft next story

The Scrum Master drafts the next story using only the relevant context.

  • Reads only the matching epic (not the whole PRD).
  • Reviews architecture and PRD just enough.
  • Produces very clear sequential tasks.
  • Defines specific acceptance criteria.
  • Includes enough context without drowning the model.

/sm *draft story 1.1

3. PO: Validate story draft

The Product Owner validates the draft against the PRD before coding begins.

  • Checks that the draft aligns with the original objectives.
  • Confirms every requirement is covered.
  • Flags gaps or contradictions for immediate correction.
  • Can add recommendations for the Scrum Master.

/po *validate story @docs/stories/1.1.md

Alignment check: Ensures the Developer implements exactly what was agreed during planning.

4. Dev: Implementation

The Developer implements the story following architecture, UX, and the testing checklist.

  • Ships production-ready code aligned with the architecture.
  • Writes unit, integration, and E2E tests as needed.
  • Covers error handling and edge cases.
  • Can invoke MCPs like Playwright to automate end-to-end tests.

/dev *develop-story @docs/stories/1.1.md

5. QA: Test story thoroughly

QA acts as the quality gatekeeper before the story advances.

  • Runs all tests (unit, integration, E2E).
  • Performs manual testing across user flows and edge cases.
  • Verifies acceptance criteria are met.
  • Profiles risks (security, performance, reliability).
  • Issues a verdict: PASS, CONCERNS, FAIL, or WAIVED.

/qa *review @docs/stories/1.1.md

Quality gates: PASS (all good) • CONCERNS (minor flags) • FAIL (critical issue) • WAIVED (explicitly accepted risk).
Critical check: QA ensures the code works and meets professional standards before moving forward.

6. Dev: Fix according to QA review

The Developer addresses the QA report and brings everything to PASS status.

  • Reviews the QA outcome (PASS/CONCERNS/FAIL).
  • Implements targeted fixes and improvements.
  • Resolves every CONCERN and FAIL logged.
  • Reruns tests to confirm nothing regresses.

/dev Fix the issues from QA review @docs/stories/1.1.md

Iterative: If QA issues a FAIL, the review repeats until the result is PASS or WAIVED.

7. Mark done & next story

Close the story and loop back to step one with the next priority.

  • Update the backlog and mark the story as done.
  • Pick the next story from the prioritized backlog.
  • Return to step one: SM reviews the notes that were just written.

Iterative loop: Story after story, commit after commit, until the project is complete.

Repeat the loop: go back to step one with the next story.

Process notes

  • Integrated validation: PO signs off before coding, QA tests afterward, Developer fixes and documents.
  • Quality flexibility: accept CONCERNS for an MVP or require PASS for critical software.
  • Continuous learning loop: Developer notes feed the SM in the next iteration.
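The quality-flexibility rule can be written down as a small decision function. This is a sketch of one possible policy, not BMAD's own code. It assumes that PASS and WAIVED always close a story, that FAIL never does, and that CONCERNS closes it only when the team has chosen to accept them (for example, for an MVP). `Gate` and `story_can_close` are hypothetical names.

```python
from enum import Enum


class Gate(Enum):
    """The four QA verdicts described in the workshop."""
    PASS = "PASS"
    CONCERNS = "CONCERNS"
    FAIL = "FAIL"
    WAIVED = "WAIVED"


def story_can_close(gate: Gate, accept_concerns: bool = False) -> bool:
    """Decide whether a story may be marked done under the chosen policy."""
    if gate in (Gate.PASS, Gate.WAIVED):
        return True                      # clean result or explicitly accepted risk
    if gate is Gate.CONCERNS:
        return accept_concerns           # MVP-style policy may accept minor flags
    return False                         # FAIL always goes back to the Developer


if __name__ == "__main__":
    print(story_can_close(Gate.CONCERNS, accept_concerns=True))   # prints: True
    print(story_can_close(Gate.CONCERNS))                         # prints: False
```

Making the policy an explicit parameter keeps the MVP-versus-critical decision visible instead of leaving it to each review.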

Benefits of the iterative loop

  • ✓You always finish each story with a functional, tested version.
  • ✓You can pivot quickly: every cycle delivers independent value.
  • ✓Learning accumulates through notes and short retros.
  • ✓The person stays in control, approving every key transition.

Resources

Everything to start today

Tools, templates, and guides to run BMAD without friction.

Which tool should you use?

Claude Code

Best option: Anthropic's official tool with Sonnet 4.5 (top of the benchmark).

  • ✓Highest quality
  • ✓Seamless integration
  • ✓$100/mo (Max Plan)

OpenCode

Most versatile: sign in with your provider and it uses that plan.

  • ✓Login with Anthropic, GitHub Copilot, Z.ai, etc.
  • ✓Uses your provider’s plan
  • ✓Open source and multi-model

Requires a workaround (details below).

Cursor / Windsurf

Popular IDEs that support BMAD with @mention syntax.

  • ✓Full-featured editors
  • ✓Large community
  • ✓@agent syntax

Gemini CLI

Free: Gemini 2.5 Pro with official BMAD support.

  • ✓Gemini 2.5 Pro
  • ✓Free included usage
  • ✓Official BMAD support

Droid CLI

Free: GPT-5-Codex via Factory AI.

  • ✓GPT-5-Codex model
  • ✓100% free
  • ✓Factory AI platform

Requires a workaround (details below).

Workshop repo — start now

Ready-to-go repository with BMAD preinstalled, setup scripts, and configured MCPs.

  • ✓BMAD preinstalled and configured
  • ✓Setup script for Droid CLI (free)
  • ✓Includes MCPs (Sequential Thinking, Playwright)
  • ✓BMAD commands preloaded
  • ✓Example workflows
github.com/baisharg/Workshop-Vibe-Coding

Quick start

git clone https://github.com/baisharg/Workshop-Vibe-Coding
cd Workshop-Vibe-Coding
./setup.sh

The script installs Droid CLI (free, GPT-5-Codex) and configures all BMAD commands automatically.

Repositories and learning

BMAD Repository

Complete repo with prompts, documentation, and examples.

  • ✓Prompts for all eight agents
  • ✓Automated setup
  • ✓Full documentation
  • ✓Project examples
github.com/bmad-code-org/bmad-method

BMAD Method Masterclass

Video tutorial that walks the method end to end.

  • ✓Step-by-step setup
  • ✓How to use each agent
  • ✓Full workflow
  • ✓Live examples
youtu.be/LorEJPrALcg

Recommended plans

Z.ai Coding Plan

Recommended: affordable choice for students and makers.

  • ✓Only $3/mo
  • ✓GLM 4.6 model
  • ✓Performance close to Sonnet 4
  • ✓Ideal for students
z.ai/subscribe

Claude Code + Max Plan

For enthusiasts: highest quality for agentic coding.

  • ✓Anthropic's official Terminal Agent
  • ✓Access to Sonnet 4.5
  • ✓Max Plan: $100/mo with higher limits
  • ✓Optimized agent experience
claude.com/product/claude-code

GitHub Student Pack

Students get Copilot Pro for free.

  • ✓Copilot Pro included
  • ✓Access to Sonnet 4.5
  • ✓Top model on the benchmark
  • ✓No cost for students
education.github.com/pack

CLI / Terminal tools

OpenCode

Open-source terminal program. Sign in with your provider and OpenCode uses that plan.

  • ✓Login with Anthropic, Copilot, Z.ai, etc.
  • ✓Runs in any IDE terminal
  • ✓MCP server compatible
  • ✓Multi-model and open source
opencode.ai

Installation:

curl -fsSL https://opencode.ai/install | bash

BMAD configuration:

  1. Install BMAD and pick the “Claude Code” option.
  2. Rename the .claude/ directory to .opencode/.
  3. Move the .md files to the top level (not inside agents/ or tasks/).

Needs a small workaround to reuse the prompts.

Droid CLI

Factory AI client that exposes GPT-5-Codex for free, fully compatible with BMAD.

  • ✓GPT-5-Codex model
  • ✓Completely free
  • ✓Agent-oriented terminal program
docs.factory.ai/cli

BMAD configuration:

  1. Install BMAD and pick the “Claude Code” option.
  2. Rename .claude/ to .factory/.
  3. Move the .md files to the top level of .factory/.

Same workaround as OpenCode to reuse the prompts.
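The rename-and-flatten workaround for both tools can be scripted. The following is a hedged sketch, not an official BMAD, OpenCode, or Factory utility. It assumes that "top level" means directly under the renamed directory and that no two prompt files share a filename. `adapt_prompts` is a hypothetical name. Try it on a copy of your project first.

```python
import shutil
from pathlib import Path


def adapt_prompts(project: Path, target: str) -> list[Path]:
    """Rename .claude/ to `target` and move every nested .md file to its top level.

    Returns the final paths of the Markdown files. Raises FileExistsError
    instead of overwriting when the target or a filename already exists.
    """
    source = project / ".claude"
    dest = project / target
    if dest.exists():
        raise FileExistsError(f"{dest} already exists")
    source.rename(dest)

    moved: list[Path] = []
    # sorted() materializes the listing before any file is moved.
    for md_file in sorted(dest.rglob("*.md")):
        new_path = dest / md_file.name
        if md_file.parent != dest:
            if new_path.exists():
                raise FileExistsError(f"{new_path} already exists")
            shutil.move(str(md_file), str(new_path))
        moved.append(new_path)
    return moved


if __name__ == "__main__":
    # Use ".opencode" for OpenCode or ".factory" for Droid CLI.
    for path in adapt_prompts(Path.cwd(), ".opencode"):
        print(path)
```

The now-empty agents/ and tasks/ subfolders are left in place. Remove them by hand if your tool complains about them.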

Installation & setup

Install BMAD

BMAD installs per project at the repo root, keeping everything versioned.

At your project root:

npx bmad-method install
  • ✓Creates the .bmad-core/ folder with agents and templates.
  • ✓Per-project install (not global).
  • ✓Everything stays under version control.

Recommended MCP tools

Tools that expand what agents can do.

  • ✓Playwright: browser automation for E2E testing.
  • ✓Sequential Thinking: structured reasoning.
  • ✓Explore more MCPs in the Smithery repository.
smithery.ai (MCP tools repository)

Configure MCPs in your IDE and agents will use them when needed.

Buenos Aires AI Safety Hub

© 2025 BAISH. All rights reserved.
