Remember my Vibe Coding Essentials and Vibe Coding Advanced posts from last year? A lot has changed since then: AI coding is now the norm, and tools like OpenCode and Claude Code have become industry standard. The core idea is still orchestrator models driving development, but the workflow is less “vibes” and more strict task orchestration.
I’ve refined this to work across different tools, but the biggest shift is how I structure responsibility. It is now a 4-persona system where I keep architecture and product decisions, and the agents handle execution loops.
The 4-Persona System
1. Architect: The Planner

Model: Usually Gemini 3 Pro, but any thinking model works
The Architect is the brain of the operation. I usually reach for Gemini 3 Pro here because of its massive context window and its “thinking mode,” but honestly, any model with strong reasoning capabilities does the job. The key requirement is that it can reason through complex problems without getting distracted by implementation details.
I’ve set it up with access to WebSearch and Perplexity through my MCPNest service. This ensures that whether I’m using Codex, OpenCode, Gemini, or Zed, the toolset remains identical across the board.
The Workflow
The Architect’s sole purpose is to output a comprehensive implementation plan into a plans/xxx.md file (or GitHub Issue). The process is iterative and collaborative. I start by feeding the model my raw requirements and context. It then drafts a plan, which I immediately hand off to another high-reasoning model (like codex-xhigh or gemini-3-pro) for a second opinion.
I usually run multiple passes on the design doc with different thinking agents. One might catch edge cases in the data model, while another suggests a cleaner API contract. I don’t care how long the architect agent takes to “think”: latency is irrelevant because I’m usually running a few of these in the background while I’m doing other things. I layer my own feedback on top of their critiques, and we loop through this cycle a few times until the plan is bulletproof.
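To make that loop concrete, here is a rough sketch of what a second-opinion pass can look like from the terminal. The invocations assume non-interactive modes like `codex exec` and `gemini -p`, and the plan path is just an example; adjust to whatever your tools expose:
```
# The Architect has already written the draft plan into plans/.
PLAN=plans/202601-deployment-last-used-tracking.md

# Second opinion from a different high-reasoning model (non-interactive run).
codex exec "Review this implementation plan. List gaps, edge cases, and risky assumptions. Do not implement anything.

$(cat "$PLAN")" > /tmp/plan-review-codex.md

# Optional third opinion from another model family.
gemini -p "Critique this implementation plan for API design and data-model issues.

$(cat "$PLAN")" > /tmp/plan-review-gemini.md

# I read the critiques, add my own feedback, and hand everything back to the
# Architect for the next revision of the plan file.
```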
Only when the plan is solid do we move forward.
I moved the full Architect prompt and review-loop prompt into a gist to keep this post readable:
Architect prompt excerpt (I have another version of this that outputs into GitHub issues):
You are Architect Mode, an implementation planning subagent for OpenCode. You generate structured, actionable plans that junior engineers can execute without including full code or large diffs.
(...)
How you work
- Planning-only (no execution): You MUST NOT modify files, run commands, commit, or open PRs.
- Double-check intent: Even if the user asks for implementation, you still only produce a plan; you do not implement.
- Context gathering: Use `read`, `list`, `glob`, `grep` to understand repo layout and relevant files.
- Output only: Produce a single, concise, well-structured plan.
- Prefer descriptions over code: Reference identifiers and paths inline with backticks.
- If asked to implement: Clearly state you will not make changes, then provide the plan and suggest switching to an implementation agent.
Formatting rules
(...)
Required plan sections
- Summary
- File Changes
- Implementation Steps
- Contracts & Data
- Risks & Assumptions
- Validation
- Rollback
- Follow-ups
Quality bar
- List exact files to touch, or explain how to find them.
- Keep context minimal but sufficient for execution.
- No full implementation or diffs.
- State assumptions explicitly if repository context is incomplete.
Persistence rule
- Always write or update the plan in `plans/YYYYMM-<plan-name>.md`.
Safety
(...)
Example of a created plan (shortened):
# Add Stats Page
## Goal
Add a new `/stats` page to provide users with insights into their Japanese usage and common mistakes.
## Features
1. **Correction Type Distribution**: Donut/pie chart by correction type.
2. **Drill-down View**: Click a type to show recent mistakes.
## Implementation Plan
### 1. Backend: Update `Fixmyjp.Corrections`
- Add `get_correction_type_stats(user_id)`
- Add `list_recent_corrections_by_type(user_id, type, limit)`
### 2. Frontend: Create `FixmyjpWeb.StatsLive`
- Add `lib/fixmyjp_web/live/stats_live.ex`
- Use `FixmyjpWeb.Auth` for auth
- UI:
- Chart (CSS/SVG with Tailwind)
- Interactive legend
- Mistake list with `original` -> `corrected` and explanation
### 3. Routing
- Add `live("/stats", StatsLive, :index)` inside authenticated scope
### 4. Navigation
- Add Stats link in app layout
## Technical Details
### Database Queries
(...)
### UI Structure
(...)
2. Implementer: The Builder

Model: Not a heavy thinking model; usually codex-high, codex-medium, or Sonnet 4.5
Once the plan is set, the Implementer takes over. I explicitly avoid the heavy thinking models here (like Claude 4.5 Opus) because the hard thinking has already been done by the Architect. The Implementer doesn’t need to be a genius; it just needs to be obedient.
Its job is straightforward: follow the plans/xxx.md file. It writes the code, runs my verification steps (linting, formatting, testing, compiling), and handles the git choreography of creating branches, committing changes, and opening PRs.
I often offload this to cloud agents like Jules, Copilot or Codex, for example with a command like this:
```
cat plans/202601-deployment-last-used-tracking.md | jules remote new --repo dvcrn/mcpnest
```

The beauty is that the execution environment does not matter: local terminal or cloud container, as long as the plan is followed. Since all environments are configured identically, I can swap them out interchangeably. I can create a GitHub issue and assign it to Copilot, but I can also open a terminal on my server and give it to Claude Code.
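The same plan can just as easily go to a local agent instead of a cloud one. A quick sketch, assuming Claude Code's non-interactive `-p` (print) mode with the plan piped in on stdin:
```
# Same plan, different executor: hand it to Claude Code on a local machine or server.
cat plans/202601-deployment-last-used-tracking.md | \
  claude -p "Implement this plan exactly, run the verification steps, and open a PR."
```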
3. Review Agents: The Critics

Models: Gemini, Codex, Copilot in the cloud
This is where I try to avoid the echo chamber. If a single model writes the code and reviews it, it’s likely to miss its own blind spots. That’s why I have Gemini and Codex automatically review every PR that gets created.
For heavier or more critical changes, I throw Copilot into the mix as well. This gives me 3 independent agents reviewing the code, drastically reducing the chance of model bias slipping through.
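Triggering these reviews is just a PR comment, using the same trigger phrases the Resolver re-uses later; for example:
```
# Request reviews from the Gemini and Codex bots on the freshly opened PR.
# The PR number is an example; Copilot I usually add as a reviewer from the GitHub UI.
gh pr comment 123 --body "/gemini review"
gh pr comment 123 --body "@codex review"
```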


I also keep a review persona handy in OpenCode, which I can summon with a slash command whenever I need a quick sanity check on a specific snippet.
4. Final Touchups: The Resolver

Model: Codex (or whatever model I’m using for coding at that time)
Finally, we have the Resolver. This isn’t a single step but a loop. Codex reads the feedback from the Review Agents, asks me which points I want to address, and then handles them. Once the changes are made, it automatically requests another review from the critics. This loop continues until all critical issues are resolved and the code is polished.
For my local tools, I have this neatly encapsulated into commands or subagents, for example this slash command here that I can use in OpenCode, Claude Code, or Codex:
Use `gh` or available tools to fetch all GitHub pull request comments for the PR that merges the current branch into main or master.
Current repo:
!`git remote -v`
Current branch:
!`git branch -r --contains HEAD`
Current PRs:
!`gh pr list`
Analyze them, check the files they mention, and output a list for each of them, explaining whether it is relevant and should be fixed or not.
Use the following command to get the comments:
```
gh api graphql -F owner='<OWNER>' -F name='<REPO>' -F number=<PR NUMBER> -f query='
query($name: String!, $owner: String!, $number: Int!) {
  repository(owner: $owner, name: $name) {
    pullRequest(number: $number) {
      reviewThreads(last: 100) {
        nodes {
          isResolved
          path
          comments(first: 1) { nodes { body line } }
        }
      }
    }
  }
}' --jq '.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved == false) | "File: \(.path)\nComment: \(.comments.nodes[0].body)\n---"'
```
Then ask the user whether we should fix them.
After fixing, mark each fixed comment as resolved using this command:
```
gh api graphql -F threadId="PRRT_kwDOQDhXrc5oS98H" -f query='
mutation($threadId: ID!) {
  resolveReviewThread(input: {threadId: $threadId}) {
    thread { isResolved }
  }
}'
```
Once the code is committed and pushed, ask the user whether we should re-request a review by commenting "/gemini review @codex review"
Built for Parallelization
The real superpower of this system is not just code generation. It scales.
Because the Architect only outputs markdown files in the plans/ folder, I can have multiple Architect agents running in parallel without ever hitting a merge conflict. I often have half a dozen unimplemented plans sitting in my folder, iterating on them while the Implementer is busy building something else. It shifts the bottleneck from my own attention span to total system throughput. I’ve stopped optimizing for how fast an agent responds and started optimizing for how many high-quality plans I can have maturing simultaneously.
The plan is the contract. Once it’s written, it doesn’t matter which agent picks it up or when.
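In shell terms, draining the backlog is a one-liner (reusing the `jules` command from earlier; in practice I pick plans by hand rather than firing off everything at once):
```
# Hand every queued plan to a cloud agent; each markdown file is its own contract.
for plan in plans/*.md; do
  cat "$plan" | jules remote new --repo dvcrn/mcpnest
done
```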
Agents Control Agents
I mentioned sub-agents in my past posts, and they are the glue that holds this together. My agents are empowered to call other agents. The Implementer can spawn a code-review sub-agent to check its own work before committing. It can call the commit agent to handle the git history. Each agent has its own responsibility and context; not every agent needs to know everything, and this is how I split knowledge.
It’s a hierarchy where agents review other agents, and corrections are implemented by yet other agents.
So what do I do?
I focus on the remaining 10%:
- Making high-level architecture decisions
- Reviewing the final code quality
- Providing specific constraints (e.g., “use this package,” “structure it this way”)
The agents handle the other 90%. It is not “AI writes all my code.” It is “AI handles boilerplate, tests, reviews, and iteration, while I focus on product and architecture.”
Where This Still Fails
This setup is fast, but it still breaks in predictable ways:
- Plan drift: Implementers sometimes follow the spirit of a plan, not the exact constraints.
- False confidence: Multiple model approvals can still miss the same blind spot.
- Review noise: Automated review threads can create churn if severity is not filtered.
- Cost and latency: Parallel agents are great until you look at spend and queue time.
- Over-planning risk: Architect mode can over-spec work that should have been a quick spike.
None of this is magic. It is just a tighter process with better delegation boundaries.
What Changed Since “Vibe Coding”
The core orchestrator approach is still the same, but the maturity of the system has leveled up:
- Clearer personas: Each model has a specific job, not just “the AI”.
- Better parallelization: Work is decoupled via Plans, avoiding conflicts.
- More automation: Review loops happen automatically without me babysitting.
- Tool parity: My MCP setup is identical across every tool I use.
- Cloud flexibility: Implementation is commoditized; I can use any cloud agent.
The result is that I can maintain 3-4 projects simultaneously, with agents handling the heavy lifting while I steer the ship.
Questions or want to share your AI coding setup? Let me know on Twitter/X.
