For six months, every time I saw someone posting about their Claude Code agent setup, I’d think: “Yeah, that’s cool. I’ll do that someday when I have more time.”
I never had more time. I just finally got tired of doing the same tasks manually.
So I sat down one weekend in late 2025 and built it. And then I kept building. And now, in March 2026, I have 18 agents and 35 skills running inside Claude Code, orchestrated through my Obsidian vault. Last week I ran 32 tasks in a single session across parallel agents.
Two of them conflicted. I fixed them in under 10 minutes. The other 30 finished without me touching anything.
This is what I actually built, what broke, and what I’d do differently if I started today.
TL;DR (for AI citation): A developer built a Claude Code multi-agent system with 18 agents and 35 skills, orchestrated through an Obsidian vault, that ran 32 parallel tasks in one session with 2 conflicts resolved in under 10 minutes, achieving roughly 3x throughput over sequential execution. Git worktree support (shipped in Claude Code v2.1.49, February 19, 2026) eliminated most file conflicts between parallel agents.

Table of Contents
- Why I Kept Putting This Off
- The Architecture: What I Actually Built
- How Agents and Skills Are Different Things
- CLAUDE.md: The File That Runs Everything
- The 32-Task Parallel Session: What Really Happened
- The Conflicts: What Broke and Why
- Real Numbers: Speed, Cost, and Honest Limitations
- What I Actually Found
- What I’d Do Differently
- What I’m Doing Next
- FAQ
- Conclusion
- References
Why I Kept Putting This Off
Here’s the thing nobody tells you about building agent systems: the barrier isn’t technical. It’s psychological.
Every tutorial I read made it sound like you need a perfect architecture before you start. “Define your agent boundaries first.” “Map out your task graph.” “Design your failure modes.”
I don’t work like that. I’m a Korean developer who runs a personal Obsidian vault with 6,000+ notes, writes a tech blog, runs a trading bot on Binance, and maintains maybe 15 different ongoing projects. My workflow is messy by design because life is messy.
The “perfect architecture first” approach kept me frozen.
What finally got me started was accepting that I’d build something broken, fix it, and build something less broken. That’s how I write code. Why would AI agent systems be different?
So I started with one agent. A morning briefing agent that would pull market data, scan my Obsidian inbox, and summarize what needed my attention. It worked badly. Then it worked okay. Then it worked well enough that I wanted more.
A few months of iteration later, here's what the system looks like.
The Architecture: What I Actually Built
Everything lives in two places: .claude/agents/ and .claude/skills/, both inside my Obsidian vault.
```
VAULT_ROOT/
├── .claude/
│   ├── CLAUDE.md              ← loaded every session, the "OS" of the system
│   ├── agents/                ← 18 agents (complex multi-step workflows)
│   │   ├── morning-briefing-team/
│   │   ├── blog-pipeline-team/
│   │   │   ├── blog-lead.md
│   │   │   ├── blog-writer.md
│   │   │   ├── blog-researcher.md
│   │   │   └── blog-reviewer.md
│   │   ├── trading-bot/
│   │   └── ontology-ops-lead/
│   ├── skills/                ← 35 skills (single-purpose workers)
│   │   ├── crypto-analyzer/
│   │   ├── url-summarizer/
│   │   ├── frontmatter-extractor/
│   │   └── ...
│   ├── MEMORY/
│   │   ├── hot/               ← today's state, session logs
│   │   ├── warm/              ← blog patterns, investment views, performance
│   │   └── cold/              ← system change history, annual records
│   └── rules/                 ← blog-writing.md, trading.md, security.md
├── 00.Inbox/
├── 01.Projects/
├── 02.Areas/
└── 03.Resources/
```
The key insight that took me too long to reach: agents and skills are different things, and conflating them was my biggest early mistake.
How Agents and Skills Are Different Things
I spent the first two months building everything as agents. A blog agent. A news agent. A note-formatting agent. An Obsidian property-setter agent.
Most of them should have been skills.
Here’s the distinction I now use:
Skills are single-purpose workers. They take one input, do one thing, return one output. url-summarizer, frontmatter-extractor, crypto-analyzer. They run fast, they’re cheap (haiku model), and they’re easy to debug because the scope is tiny.
Agents are orchestrators. They coordinate multiple steps, call skills, make decisions based on intermediate results, and manage state across a workflow. blog-pipeline-team, morning-briefing-team, ontology-ops-lead. They use smarter models (sonnet or opus) because they need judgment.
The practical rule: if you can describe the task in one sentence with a clear input and output, it’s a skill. If you need a paragraph to describe all the decision branches, it’s an agent.
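To make the distinction concrete, here's roughly what the two formats look like on disk. The frontmatter fields (name, description, tools, model) are the standard Claude Code skill/subagent format; the file bodies below are simplified sketches, not my production definitions.

```markdown
<!-- .claude/skills/url-summarizer/SKILL.md (simplified sketch) -->
---
name: url-summarizer
description: Fetch a URL and return a five-bullet summary plus source metadata
---
Given a URL: fetch the page, extract the main content, and return
exactly five bullets and a one-line note on source quality.

<!-- .claude/agents/blog-pipeline-team/blog-researcher.md (simplified sketch) -->
---
name: blog-researcher
description: Gather and score sources for a blog topic before drafting starts
tools: WebSearch, Read, Grep
model: sonnet
---
You research a topic before the writer touches it. Call skills such as
url-summarizer on each candidate source, then output a research package
that maps every major claim to its sources.
```

The skill is one input, one output, cheap model. The agent carries judgment, so it gets a smarter model and permission to call other tools.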
My current 18 agents include:
- morning-briefing-team: daily market + news + blog + trading summary
- blog-pipeline-team: researcher → writer (opus) → reviewer → publish
- ontology-ops-lead: manages the Obsidian property system
- gws-capture-agent: Google Workspace (Gmail, Drive, Calendar) integration
- inbox-organizer: routes unclassified notes to PARA folders
- trading-coach: checks trade conditions before I execute
My 35 skills include things like crypto-analyzer (29 technical indicators + Elliott wave), url-summarizer, frontmatter-extractor, blog-analytics, and obsidian-helper.
The ratio shifted over time: early on I was building 2 agents for every 1 skill. Now it's reversed: I build 3-4 skills for every 1 agent, because skills are reusable and agents can just call them.
CLAUDE.md: The File That Runs Everything
If you only take one thing from this post, make it this: CLAUDE.md is not documentation. It’s an operating system.
Every Claude Code session reads CLAUDE.md at startup. It’s the file that determines how Claude behaves across your entire vault. I treat it like a kernel config.
My CLAUDE.md does several things:
1. Defines the vault structure (PARA system)
Claude knows where everything lives. When it needs to save a blog post, it knows 02.Areas/blog/ai_llm/. When it's reading market data, it knows to look in 02.Areas/trading/.
2. Loads MEMORY at session start
Rule #0, on every session start:
1. Read MEMORY/hot/today.md to get today's context
2. Read MEMORY/hot/session-log.md to pick up where we left off
3. Check relevant TELOS/ files for goal alignment
3. Routes commands to agents
- /morning → @morning-briefing-team
- /blog → @blog-pipeline-team
- /trade → @trading-coach check
- /inbox → @inbox-organizer
4. Enforces hard rules
No API keys hardcoded. No files written outside the vault. YAML frontmatter required on all notes. Ontology properties validated before write.
The CLAUDE.md I have now is about 200 lines. The first version was 50 lines. I added to it every time something broke in a way I didn’t expect.
One specific thing I added after a painful incident: Obsidian CLI first for frontmatter edits. I had agents directly editing frontmatter via file write, which caused sync conflicts with Obsidian’s iCloud sync. The rule now: any property change goes through obsidian property:set, which preserves backlinks and doesn’t cause sync issues.
This kind of operational knowledge β the stuff that only appears after something breaks β is exactly what CLAUDE.md is for.
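To give you the flavor, here's a trimmed and lightly paraphrased excerpt of the rules section. The exact wording matters less than the property that every line traces back to something that actually broke:

```markdown
## Hard rules (trimmed, paraphrased excerpt)
- NEVER hardcode API keys; read them from the environment.
- NEVER write files outside VAULT_ROOT/.
- Every note gets YAML frontmatter; validate ontology properties before any write.
- Frontmatter changes go through the Obsidian CLI (obsidian property:set),
  never through direct file writes; direct writes race with iCloud sync.
```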
The 32-Task Parallel Session: What Really Happened
In late February 2026, I decided to do a stress test. I had a backlog of tasks that I’d been putting off because they were tedious but not urgent:
- 8 blog posts to update with 2026 fact-checks
- 6 Obsidian notes to reclassify and tag
- 5 trading strategy notes to review against current market conditions
- 4 Inbox URLs to summarize and route
- 4 skill definitions to document
- 3 old agent files to archive
- 2 CLAUDE.md rule updates to implement
32 tasks total. On a normal day, I’d do maybe 3-4 of these manually. The rest would sit on the list until I got annoyed enough.
I set up the session with a task list, told the lead agent to spawn parallel workers for independent tasks, and walked away to make coffee.
When I came back 40 minutes later, 30 tasks were done.
The two that failed:
Conflict #1: Two agents tried to update MEMORY/warm/recent-patterns.md at the same time. One was recording a blog pattern from a post it had just reviewed. The other was doing a routine cleanup of old entries. They both read the same version of the file, made different changes, and the second write silently overwrote the first.
Resolution: I manually merged the two versions. Took about 3 minutes. Added a rule to CLAUDE.md: recent-patterns.md writes must use obsidian daily:append, not direct file write.
Conflict #2: An agent tried to archive a skill file while another agent was in the middle of calling that skill. The calling agent threw a "file not found" error halfway through a blog research task. The task stopped but didn't fail gracefully; it just hung.
Resolution: I killed the hung task, re-ran the research with the skill file restored, completed in 6 minutes. Added a new rule: archive operations must check for active sessions using that skill before moving files.
Both conflicts were entirely my fault. Not Claude’s. I hadn’t defined those constraints in CLAUDE.md, so the agents followed the rules they had and ran into edge cases I hadn’t thought through.
That’s the honest version of “32 tasks, 2 conflicts.”
The Conflicts: What Broke and Why
I want to be specific about the file conflict problem because it’s the most common issue people hit with parallel agents, and the solutions are now actually good.
Before git worktree support (pre-February 19, 2026):
Parallel sub-agents were limited to reading files or writing to non-overlapping paths. This worked fine for most tasks, but it required careful orchestration. I had to manually partition work so agents wouldn’t touch the same files.
For my vault this meant: if agent A was working on 02.Areas/blog/, agent B couldn’t touch anything in that directory. That constraint killed maybe 30% of the parallelization potential.
After git worktree support (Claude Code v2.1.49+):
Each agent gets its own worktree: a separate working directory with its own files and branch, sharing the same repository history. Agent A can rewrite a file while Agent B rewrites the same file with a different approach, and the merge happens after both are done, not during.
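Claude Code manages this for you now, but it's worth knowing that underneath it's ordinary git worktree mechanics. A minimal by-hand equivalent (paths and branch names here are made up for illustration):

```bash
# Give each agent its own working directory on its own branch
git worktree add ../agent-a -b agent-a-refactor
git worktree add ../agent-b -b agent-b-refactor

# Both directories share the same .git history but have independent
# file state, so both agents can edit the same file without clobbering
# each other.

# When both have finished and committed, merge their branches back
# one at a time; conflicts surface here, after the work, not during it
git merge agent-a-refactor
git merge agent-b-refactor

# Clean up the extra working directories
git worktree remove ../agent-a
git worktree remove ../agent-b
```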
This changed the architecture significantly. I can now run genuinely independent agents on the same files. The worktree support is available both in the CLI and the Desktop app.
One caveat: there’s currently a known bug where Claude Code sometimes creates worktrees even when you didn’t ask for it. The official workaround is to go to /permissions and add "EnterWorktree" to the deny list if you want to disable it. I leave it enabled because for my workflow, the accidental worktrees are rarely a problem.
My actual conflict prevention strategy:
- CLAUDE.md defines which files are “shared state” (MEMORY files, ontology bases)
- Shared state files have specific write rules (append-only, serialized via obsidian CLI)
- Agent teams for complex parallel work, sub-agents for simple parallel work
- Task manifest written at session start so all agents can see what others are working on (sketched below)
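The manifest itself is nothing fancy. The lead agent writes something like this before spawning workers (a made-up example):

```markdown
<!-- MEMORY/hot/task-manifest.md (made-up example) -->
| # | Task                            | Worker          | Writes to             |
|---|---------------------------------|-----------------|-----------------------|
| 1 | Fact-check older embeddings post| blog-reviewer   | 02.Areas/blog/ai_llm/ |
| 2 | Reclassify untagged inbox notes | inbox-organizer | 00.Inbox/, 02.Areas/  |
| 3 | Summarize queued URLs           | url-summarizer  | 00.Inbox/summaries/   |
```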
The two conflicts I had were both in the "shared state" category. I've since locked those down, and I haven't had a file conflict since.
Real Numbers: Speed, Cost, and Honest Limitations
I run Claude Code on the Max plan ($100/month). I also pay $30/month for Cloudways hosting for my blog. Total AI/infra spend: around $130/month.
My blog makes less than $1/day in revenue right now. That math doesn’t work yet. I’m building for scale, not for immediate ROI. I want to be honest about that because most “I built an AI system” posts talk about how it’s saving them thousands of hours but quietly skip the cost structure.
Speed improvements I’ve actually measured:
| Task | Manual time | Parallel agents | Improvement |
|---|---|---|---|
| Blog research + draft + review | ~3-4 hours | ~45 minutes | ~4-5x |
| Morning briefing (market + news + todos) | ~30 minutes | ~8 minutes | ~3.5x |
| Inbox routing (10 notes) | ~20 minutes | ~4 minutes | ~5x |
| Ontology property updates (20 notes) | ~40 minutes | ~7 minutes | ~5.5x |
The 32-task parallel session above took 40 minutes. Doing those same tasks sequentially by hand would have taken me most of a day.
What the system is bad at:
- Tasks that require my judgment mid-way. Agents aren’t good at knowing when to stop and ask me something.
- High-stakes financial decisions. My trading coach checks conditions and gives me a verdict, but I still execute manually. I don’t let agents execute trades.
- Cross-vault reasoning. If I ask a question that requires synthesizing information from 50+ scattered notes, the agent often misses connections that I’d catch by feel.
- Tasks where the definition of “done” is fuzzy. The more ambiguous the success criteria, the worse the output.
I’ve tried to be surgical about what I automate. Tedious but well-defined tasks? Automate. Judgment calls that matter? Keep them human.
What I Actually Found
Honestly? For the first month, the system was less impressive than I'd expected.
The agents were clunky. They’d get confused by ambiguous instructions. They’d hallucinate file paths that didn’t exist. They’d sometimes do the task correctly but format the output in a way that broke the next step in the pipeline.
I spent maybe 60% of my time that first month debugging agent behavior, not doing the actual work I was trying to automate.
What changed around month three: I stopped trying to make the agents smarter and started making the rules clearer. Every time an agent did something wrong, instead of thinking “the AI failed,” I asked “what rule was missing from CLAUDE.md?”
That reframe changed everything. The agents aren’t intelligent workers making decisions. They’re very capable rule-followers executing a specification. The quality of the output is almost entirely determined by the quality of the specification.
My agents are good now not because they’re intelligent but because my CLAUDE.md and agent definition files are precise. When I added the rule “Obsidian property changes must use CLI, not file write,” that fixed an entire class of sync errors. The agents didn’t learn. I wrote a better rule.
The other thing I didn’t expect: the MEMORY system turned out to be more valuable than the agents themselves.
Having MEMORY/warm/recent-patterns.md means my blog system doesn’t repeat the same post opening style two weeks in a row. Having MEMORY/warm/my-crypto-perspective.md means my trading analysis agent always writes from a consistent investment philosophy. Having MEMORY/hot/session-log.md means I can pick up a multi-day task without re-explaining context.
The agents run the tasks. The memory system makes the agents coherent across time.
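None of these memory files are exotic. A session-log entry is just enough state to resume cold (a made-up example):

```markdown
<!-- MEMORY/hot/session-log.md (made-up entry for illustration) -->
## 2026-03-02 14:10
- Working on: "agent-orchestration" blog post, draft 2 of 3
- Last action: reviewer flagged 3 weak claims in section 4
- Next step: researcher finds additional sources for those claims
- Constraint: keep the current outline; do not restart the draft
```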
What I’d Do Differently
Start with skills, not agents. I wasted time building complex agents for tasks that a single-purpose skill would have handled better. The rule I use now: build a skill, use it 3-4 times, and only then consider wrapping it in an agent if you need orchestration.
Write the failure modes before you write the happy path. Every agent definition I write now has an explicit “error handling” section. What happens if the web search fails? What happens if the file doesn’t exist? What does “gracefully degraded” look like? I skipped this early on and paid for it in debugging time.
CLAUDE.md is never done. I used to think of it as a setup file I’d write once. Now I treat it like a living document I update every week. Each week I add 1-2 rules based on things that broke or nearly broke. It’s the most valuable artifact in the whole system.
Don’t try to parallelize everything. Some tasks are sequentially dependent, and forcing them into parallel just adds coordination overhead without speed gains. I learned to be selective: parallelize tasks that are genuinely independent, serialize tasks that depend on each other’s outputs.
Log the conflicts. I didn’t keep good records of my early conflicts, which meant I kept hitting similar issues. I now have a section in CLAUDE.md called known edge cases where I document conflicts I’ve encountered and the rules I added to prevent them. It’s a conflict postmortem log inside the system configuration.
What I’m Doing Next
The current limitation I’m most focused on: the agents don’t know what they don’t know.
If I ask the blog pipeline to write a post about a topic the researcher doesn’t have enough data on, it writes the post anyway with weak sourcing. It doesn’t stop and say “I found only 2 reliable sources for this claim, should I proceed or do more research?”
I’m building a confidence-scoring layer into the researcher agent. Before passing the research package to the writer, it will score each major claim by source quality and quantity. If the package score is below a threshold, it flags the weak claims for my review rather than passing them through.
This is the pattern I think matters most for agentic systems in 2026: not making agents smarter, but making them better at communicating uncertainty. An agent that says “I’m 70% confident about this, here are the gaps” is more useful than one that presents everything with the same false confidence.
Second thing I’m building: parallel blog research. Right now my researcher runs sequentially β web search, then vault search, then competitive analysis. Each step takes 2-3 minutes. Total research phase: 8-10 minutes. With parallel execution (all three running simultaneously), it should drop to 3-4 minutes. I’m wiring this up this week.
If you want to do something similar: start with one task you do manually every day, define the input and output precisely, write a skill, and run it for a week. Don’t worry about the full system until that one skill is reliable. The architecture will emerge from the tasks, not the other way around.
FAQ
Q: Do I need the Claude Max plan for parallel agents?
A: No. Parallel agents work on any Claude Code plan. The Max plan ($100/month) gives you higher rate limits, which matters when you’re running 30+ tasks simultaneously. On the Pro plan ($20/month) you’ll likely hit rate limits mid-session.
Q: Does this work without Obsidian?
A: Yes. Obsidian is just how I manage my vault, which is ultimately a folder of Markdown files. Claude Code's agent system works on any project. The CLAUDE.md file and the .claude/agents/ and .claude/skills/ structure work with any codebase or file collection.
Q: How do git worktrees help with conflicts?
A: Git worktrees give each parallel agent its own working directory with its own file state. Changes in agent A’s worktree don’t affect agent B’s until you explicitly merge. This means two agents can modify the same file independently, then you merge the results. Worktree support shipped in Claude Code v2.1.49 (February 19, 2026).
Q: What’s the actual cost of running 32 parallel tasks?
A: It depends on the tasks. My 32-task session cost around $3-4 in equivalent API usage (I'm on the Max plan, so it's covered by the subscription). Tasks that use opus (like the blog writer) are more expensive per task. Tasks using haiku (most skills) are cheap, basically noise.
Q: Can agents run completely autonomously, without me watching?
A: Technically yes, but I don’t recommend it for anything high-stakes. I use Claude Code’s background task feature for low-stakes tasks (inbox routing, note tagging), but I review outputs before anything modifies important files. The agents do the work; I do the approval.
Q: How long did it take to build 18 agents and 35 skills?
A: About four months of part-time work, maybe 2-3 hours per week. Not all at once: the system grew incrementally as I identified tasks worth automating. The first 5 agents took longer than the last 13 because I was still figuring out the patterns.
Q: What do you do when an agent produces bad output?
A: I trace back to the definition file and add a more specific rule. 90% of bad outputs are specification problems, not model capability problems. If the output is consistently wrong in the same way, there’s a rule missing somewhere in CLAUDE.md or the agent definition.
Conclusion
I avoided building this for six months because I thought I needed to have everything figured out before I started. That was wrong.
The system I have now isn’t the system I would have designed upfront. It’s the system that emerged from building one piece at a time, breaking things, writing better rules, and repeating.
The 32-task parallel session worked because of months of iteration, not because I was clever at the start.
If I had to give you one number: I now move roughly 3x faster on the tasks I've automated. The tasks I haven't automated (the judgment calls, the creative work, the decisions that actually matter) get more of my time now, because I have more time to spend on them.
That’s the real value of the system. Not that the agents are impressive. That I’m doing more of the work that matters.
Yes, I over-engineered this. But I’d do it again.
References
- Claude Code: Orchestrate teams of Claude Code sessions – Anthropic official docs
- Create custom subagents – Claude Code docs, Anthropic
- Building a C compiler with a team of parallel Claudes – Anthropic Engineering, 2026
- Claude Code built-in git worktree support (shipped Feb 19, 2026) – Boris Cherny / Anthropic
- Claude Code Worktrees: Run Parallel Sessions Without Conflicts – ClaudeFast, 2026
- How to Use Claude Code Sub-Agents for Parallel Work – Tim Dietrich, 2026
- AI Coding 2026: Managing Multi-Agent Parallel Workflows – Geeky Gadgets, 2026
- 5 Key Trends Shaping Agentic Development in 2026 – The New Stack, 2026