CLAUDE.md, Skills, AGENTS.md: The Three-Layer Architecture That Scales Across Tools
AGENTS.md has been the open standard since late 2025, and Claude Code is catching up. The three-layer architecture for DACH engineering teams that works today and scales tomorrow.
Key numbers at a glance
- 3 layers in the modern coding-agent architecture in 2026: CLAUDE.md (project truth), Skills (modular best practices), Custom Commands (repeatable workflows).
- AGENTS.md has been the open standard since December 2025, stewarded by the Linux Foundation initiative Agentic AI Foundation, with contributions from Sourcegraph, OpenAI, Google, Cursor and Factory.
- 29 percent of developers trust AI tools according to Stack Overflow 2025, down 11 percentage points year over year. The trust question is an architecture question.
- 53 percent of DACH companies using AI fail, according to Bitkom 2026, because the team lacks the skills. Skills are codified competence.
- 500 lines is the best-practice ceiling for a SKILL.md, beyond which performance drops.
- 1 tool with native CLAUDE.md support in 2026: Claude Code. With AGENTS.md support: Cursor, Copilot, Codex, Gemini CLI, Aider, Zed, Warp and others. Cross-tool strategy is the only future-proof option in 2026.
If you are an engineering lead or CTO in 2026 with coding agents in production and the output is still inconsistent, the cause is not the model and not the prompt. Model quality is so good in 2026 that the structure around the task matters more than the elegance of the task wording. That is exactly the trend Stack Overflow 2025 signalled with the 11-percentage-point trust drop: teams dump their coding standards into a system prompt and wonder why it does not scale.
At Sentient Dynamics we have rolled out a three-layer architecture in four DACH engagements in 2026 that addresses the trust problem at the root. The layers are: CLAUDE.md as the project truth, standalone Skills as modular best practices, Custom Commands as repeatable workflows. They are augmented by the AGENTS.md layer, standardised since late 2025, which makes the entire architecture cross-tool. This post delivers the architecture, the migration-path model and the five most common anti-patterns we see in the DACH mid-market.
Who this post is for and who it is not
This post is for engineering leads and CTOs at DACH mid-market companies with 50 to 2,000 developers who have coding agents in production (Cursor, Copilot, Claude Code, Codex or in-house) and are wrestling with output drift: a senior gets clean code, a junior gets hallucinated imports, no one knows why.
Not a fit for solo developers or greenfield teams without an established tool stack. For those a lean CLAUDE.md without a skill architecture is enough, because the complexity of the three-layer solution starts paying off from 5+ devs and 2+ production repos.
Why context architecture matters more than prompting in 2026
The most interesting phenomenon in AI coding in 2026 is the shift in the competence axis: prompting technique is still relevant, but it is no longer the decisive factor for consistent output. What matters is context architecture, the way you hand the agent project knowledge, conventions and workflows in a structured form.
Three observations from our 2026 engagements:
Single-prompt scaling fails beyond 200 lines. Teams that pack their coding standards into a system prompt see a performance erosion beyond about 200 prompt lines: the agent ignores earlier instructions because attention sits on fresh tokens. That is consistent with Anthropic's official best practice for SKILL.md (ceiling 500 lines) and the recommendation to keep CLAUDE.md lean.
Hooks are deterministic, CLAUDE.md is advisory. When an action must always happen with zero exceptions (e.g. tests before commit), it belongs in a hook, not in CLAUDE.md. CLAUDE.md is recommendation, hook is command. Teams that grasp this separation see significantly less compliance drift in code reviews.
Skills load on demand and offload tokens. Domain knowledge that is only sometimes relevant (migration patterns, test-generation templates, bug-reproduction workflows) belongs in standalone Skills in .claude/skills/, not in CLAUDE.md. The agent loads them automatically when the trigger description matches, or explicitly via /skill-name slash command.
The three layers in detail
Layer 1: CLAUDE.md as project truth. The CLAUDE.md is the always-loaded project constant. It carries three things: a stack overview (which languages, frameworks, build tools), the two to four most important conventions (e.g. "all DB calls through the repository layer", "no direct console.log"), and pointers to important Skills or Custom Commands. Best practice: for every line, ask "would Claude make mistakes if I removed this?" If the answer is no, drop it.
Layer 2: Skills as modular best practices. Skills live in the .claude/skills/ directory, each with a SKILL.md that stays under 500 lines. Examples from our engagements: a test-generation Skill that codifies pytest patterns for a specific domain; a doc-generation Skill that produces README stubs to team standard; a bug-reproduction Skill that translates an issue description into a reproducible test case. Skills carry a clear trigger description so the agent loads them autonomously, or get called explicitly via slash command.
Layer 3: Custom Commands as repeatable workflows. Custom Commands are defined in .claude/commands/ and encapsulate a multi-step workflow as a single slash command. Example: /release-prep runs tests automatically, builds changelog entries from recent commits, updates the version file, opens a release PR. Custom Commands are the operational expression of Skills: where Skills carry the knowledge, Commands carry the workflow.
The three layers work together without overlap. CLAUDE.md delivers context, Skills deliver knowledge, Commands deliver action. When one of the layers is missing, teams typically compensate with bloat in the other layers. Frequent anti-pattern: an 800-line CLAUDE.md that should really be Skills.
The AGENTS.md standard and what it means for DACH teams
In December 2025 the Linux Foundation initiative Agentic AI Foundation took stewardship of the AGENTS.md standard (in active use across vendors throughout 2025), with contributions from Sourcegraph, OpenAI, Google, Cursor and Factory. AGENTS.md is tool-agnostic and is natively supported by Cursor, Copilot, Codex, Gemini CLI, Windsurf, Aider, Zed, Warp, RooCode and a growing list of other agents.
As of April 2026 Claude Code does not natively support AGENTS.md. The pragmatic workaround we deploy in our engagements: write AGENTS.md as the single source of truth, then symlink to CLAUDE.md (ln -s AGENTS.md CLAUDE.md). This way the file works in Claude Code, Cursor and Copilot alike, without duplication.
What does that mean strategically for DACH engineering teams? Three points:
Avoid tool lock-in. Anyone in 2026 betting exclusively on CLAUDE.md is locking themselves into Claude Code. If Anthropic later changes pricing, alters EU hosting or pulls the tool from the market, the skill architecture is not portable. AGENTS.md turns a tool migration into a three-day task instead of a three-month migration.
Cover multi-tool engagements. In engagements with 100+ devs we routinely see multi-tool setups in 2026: senior devs on Claude Code, mainstream engineers on Copilot, greenfield teams on Cursor. A unified AGENTS.md ensures the conventions are the same regardless of the tool.
Account for vendor consolidation in the pipeline. The AI coding-agent landscape is consolidating. AGENTS.md is the insurance against the next market shake-out, because the convention is anchored in the open standard, not the vendor.
Which tool for which setup? Read our Cursor vs Copilot vs Claude Code comparison →
Sample architecture from a Sentient engagement
In Q1 2026 we rolled out the three-layer architecture at a German machinery manufacturer with 80 engineering FTEs. Before: a 1,200-line system prompt maintained twice across Cursor rules and CLAUDE.md, with 30 percent drift between the two files. After: an AGENTS.md with 280 lines, seven Skills in .claude/skills/, three Custom Commands for release, bugfix and refactoring.
The skill library in detail:
- test-generation-pytest: pytest test cases to the internal domain standard, with fixture patterns for the main modules.
- doc-generation-readme: README stubs in team standard, with sections for setup, usage, architecture, troubleshooting.
- bug-repro-from-issue: translates an issue description into a reproducible test case, including a mock-data template.
- migration-postgres-version: PostgreSQL schema-migration patterns including backwards-compatibility checks.
- legacy-cobol-bridge: wrapper patterns for a COBOL mainframe integration in the brownfield setup.
- compliance-audit-trail: logging patterns for AI-Act-compliant audit trails (see our AI Act post).
- frontend-component-test: component-test patterns with mock strategies for the React codebase.
The three Custom Commands:
- /release-prep: tests, changelog, version bump, release PR.
- /bug-fix: read the issue, pull the bug-repro Skill, create a fix PR, write tests.
- /refactor-module: module refactoring along the pattern library, with pre- and post-test verification.
Results after 90 days in production: 1.8x cycle-time speedup in the modules where the Skills fired, measured via DORA lead time. Pull-request compliance with team conventions rose from 62 percent (measured before rollout) to 91 percent (measured 90 days after rollout). Senior-dev scepticism, which started high, flipped on day 14 when the first two Custom Commands visibly worked down the refactoring backlog.
Five common anti-patterns we see in DACH 2026
Anti-pattern 1: The 1,200-line CLAUDE.md. Teams packing every convention, standard and process into a single CLAUDE.md. Result: performance erosion, the agent ignores older instructions, output goes inconsistent. Fix: migrate everything that is not relevant in every project touch into Skills.
Anti-pattern 2: Skills without a trigger description. Teams place Skills in .claude/skills/ but forget the descriptive header the agent needs for auto-loading. Result: Skills are only usable via slash command, the value never compounds. Fix: every SKILL.md starts with a clear "when to use" description in the frontmatter.
Anti-pattern 3: Hooks missing. Teams treat deterministic requirements (e.g. tests before commit, lint check before push) as CLAUDE.md recommendations. Result: drift in pull requests, the compliance consultant in the Q2 audit finds the gap. Fix: anything that must happen 100 percent of the time goes into hooks.
Anti-pattern 4: Tool-specific lock-in. Teams write CLAUDE.md without an AGENTS.md symlink and lock themselves into Claude Code. Result: a tool switch becomes a three-month migration. Fix: AGENTS.md as single source of truth, symlinked to CLAUDE.md.
Anti-pattern 5: Credentials inside skill files. API keys, tokens, secrets in the SKILL.md body, with the argument "the agent has to know it". Result: credentials in git history, compliance audit findings, six-figure rotation cost. Fix: reference credentials only via environment variables or secrets manager.
Pre-production checklist
Before you take a three-layer architecture into production, these five points should be ticked off in writing:
- CLAUDE.md / AGENTS.md under 300 lines, with a clear stack overview, 2 to 4 top conventions and pointers to Skills.
- Skills with trigger description in the frontmatter, each under 500 lines, clear scope boundary.
- Custom Commands with workflow doc, success and error case named, tests integrated.
- Hooks for deterministic requirements (tests, lint, type check) configured and tested.
- Credentials isolation through environment variables or secrets manager, not a single secret in the skill body.
If one of the five points is not ticked, the architecture goes back into rework. The inconsistency that follows costs more than two extra days of polish.
Frequently asked questions
Can we maintain AGENTS.md and CLAUDE.md in parallel? Technically yes, organisationally a drift trap. We recommend a single source of truth in AGENTS.md, symlinked to CLAUDE.md. That way the team maintains one file and both tools see the same convention.
How long does the first skill-library setup take for a 50-FTE team? In our engagements typically two to three weeks for the first five Skills, Custom Commands and hooks. The skill library grows iteratively per sprint after that, with refinement sessions every 14 days.
What about Cursor rules, do we still need them on top? As of April 2026: Cursor supports AGENTS.md natively, Cursor rules are maintained in parallel as a tool-specific extension. If you only use Cursor, .cursorrules plus AGENTS.md is enough. If you run multi-tool, AGENTS.md plus tool-specific extensions where needed.
How do we measure the success of the three-layer architecture? Three KPIs from our engagement practice: DORA lead time per size unit (see our KPI framework post), pull-request compliance rate with team conventions, and time-to-productive for new devs. The latter drops 30 to 50 percent in our experience once the skill library is in place.
Do we need subagents on top of Skills? Subagents in .claude/agents/ are a special pattern for isolated tasks that read many files or need their own context. Example: a "code-reviewer" subagent that reviews a PR in isolation without flooding the main context. We deploy subagents from team size 100+ devs, below that Skills are enough.
How does this fit with AI Act compliance? Skills are codified engineering competence verifiably documented in the AI Act audit. Compliance audit trails as a Skill plus an audit-logging hook produce a clean, auditable stack. Details in our AI Act 90-day plan.
What does the three-layer stack bring a 5-dev team? Limited value. With 5 devs a lean AGENTS.md with two or three Skills is enough. The complexity of the full architecture pays off from 15+ devs, multiple repos or multi-tool setup.
Sources
- Claude Code Best Practices (Anthropic)
- Skill authoring best practices (Anthropic)
- Writing a good CLAUDE.md (HumanLayer)
- .cursorrules vs CLAUDE.md vs AGENTS.md 2026 (The Prompt Shelf)
- AGENTS.md vs CLAUDE.md Cross-Tool Standard (Hivetrail)
- Complete Guide CLAUDE.md AGENTS.md 2026 (Data Science Collective)
- Claude Code Skills vs Cursor Rules vs Codex Skills (Agensi)
- 12 Patterns Agentic Engineers Use (Level Up Coding)
- Stack Overflow Developer Survey 2025: AI trust
- Bitkom AI Study 2026 (PDF)
About the author
Sebastian Lang is co-founder of Sentient Dynamics and leads the Agentic University programme. Before Sentient he was responsible for AI workforce programmes at SAP's Strategy Practice, with 15+ years of engineering leadership experience. Sentient Dynamics works on a success-based compensation model and is deployed across the SHD and Bregal portfolios.