Agent Skills

A senior engineer’s job is mostly the parts that don’t show up in the diff. Specs. Tests. Reviews. Scope discipline. Refusing to ship what can’t be verified. AI coding agents skip those parts by default. Agent Skills is my attempt to make them not optional.


The default behaviour of any AI coding agent is to take the shortest path to “done.” Ask for a feature and it writes the feature. It does not ask whether you have a spec, write a test before the implementation, consider whether the change crosses a trust boundary, or check what the PR will look like to a reviewer. It produces code, declares victory, and moves on.

This is the same failure mode every senior engineer has spent their career learning to avoid. The senior version of any task includes work that doesn’t show up in the diff: surfacing assumptions, writing the spec, breaking the work into reviewable chunks, choosing the boring design, leaving evidence that the result is correct, sizing the change so a human can actually review it. Those steps are most of what separates engineers who ship reliable software at scale from people who push code that breaks.

Agents skip those steps for the same reason any junior would. They’re invisible. The reward signal points at “task complete” not “task complete and the design doc exists.” So we have to bolt the senior-engineer scaffolding back on.

Agent Skills is my attempt at that scaffolding. It just crossed 26K stars, so apparently I’m not alone in wanting it. This post is the part the README doesn’t quite cover: why each design choice exists, how it maps onto standard SDLC and Google’s published engineering practices, and what you should steal from the project even if you never install a single skill.


What a “skill” actually is

The word “skill” is doing a lot of work in the Claude Code / Anthropic vocabulary, and it helps to be precise. A skill is a markdown file with frontmatter that gets injected into the agent’s context when the situation calls for it. Somewhere between a system-prompt fragment and a runbook.

A skill is not reference documentation. It is not “everything you should know about testing.” It is a workflow: a sequence of steps the agent follows, with checkpoints that produce evidence, ending in a defined exit criterion.

That distinction is the whole game. If you put a 2,000-word essay on testing best practices into the agent’s context, the agent reads it, generates plausible-looking text, and skips the actual testing. If you put a workflow there (write the failing test first, run it, watch it fail, write the minimum code to pass, watch it pass, refactor), the agent has something to do, and you have something to verify.

Process over prose. Workflows over reference. Steps with exit criteria over essays without them. That single distinction separates a useful skill from a pretty markdown file. It also explains why so many “AI rules” repos end up doing nothing in practice. The rules are essays.
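Concretely, a skill file might look something like this. This is a minimal sketch of the shape only; the field names, section headings, and table contents are illustrative, not the repo's exact schema:

```markdown
---
name: test-driven-development
description: Activate when the task involves writing or changing behaviour.
---

## Workflow

1. Write the failing test first. Run it. Watch it fail.
2. Write the minimum code to pass. Run it. Watch it pass.
3. Refactor with tests green.

## Anti-rationalization

| Excuse | Rebuttal |
| --- | --- |
| "I'll write tests later." | There is no later. Write the failing test first. |

## Exit criterion

Test output showing red → green is attached to the task.
```

The point is the structure: numbered steps the agent can execute, a table of pre-written rebuttals, and an exit criterion that produces evidence.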


The SDLC the skills encode

The twenty skills in the repo organise around six lifecycle phases, with seven slash commands sitting on top:

  • Define (/spec): decide what you’re actually building.
  • Plan (/plan): break the work down.
  • Build (/build): implement in vertical slices.
  • Verify (/test): prove it works.
  • Review (/review): catch what slipped through.
  • Ship (/ship): get it to users safely.

/code-simplify sits across the bottom of the whole thing.

This isn’t a coincidence. It’s the same SDLC every functioning engineering organisation runs, just in different vocabulary. Google calls it design doc → review → implementation → readability review → launch checklist. Amazon calls it the working-backwards memo and the bar raiser. Every healthy team has some version of this loop.

What’s new with AI coding agents is that most agents skip most of these phases by default. You ask for a feature, you get an implementation, and the spec, plan, tests, review, and launch checklist all just don’t happen. Skills push the agent through the same phases a senior engineer forces themselves through, because shipping the code without them is how you produce incidents.

A complex feature might activate eleven skills in sequence. A small bug fix might use three. The router (using-agent-skills) decides which apply. The point is that the workflow scales to the actual scope, not to the assumed scope.


Five principles that are doing the work

Five design decisions in the project are the load-bearing ones. The rest of the system follows from them.

1. Process over prose

Already covered. Workflows are agent-actionable; essays are not. The same is true for human teams. If your team handbook is 200 pages, no one reads it under time pressure. If it’s a small set of workflows with checkpoints, people actually run them.

2. Anti-rationalization tables

This is the most distinctive design decision in the project, and the one I most want other teams to steal.

Each skill includes a table of common excuses an agent (or a tired engineer) might use to skip the workflow, paired with a written rebuttal. A few examples close to the originals:

  • “This task is too simple to need a spec.” → Acceptance criteria still apply. Five lines is fine. Zero lines is not.
  • “I’ll write tests later.” → Later is the load-bearing word. There is no later. Write the failing test first.
  • “Tests pass, ship it.” → Passing tests are evidence, not proof. Did you check the runtime? Did you verify user-visible behaviour? Did a human read the diff?

The reason this works is that LLMs are excellent at rationalization. They will produce a plausible-sounding paragraph explaining why this particular task doesn’t need a spec, or why this particular change is fine to merge without review. Anti-rationalization tables are pre-written rebuttals to lies the agent hasn’t yet told.

The pattern is just as good for human teams. Most engineering decay isn’t anyone choosing to do bad work. It’s people accepting plausible-sounding justifications for skipping the parts they don’t feel like doing. A team that writes down its anti-rationalizations is a team that has fewer of them.

3. Verification is non-negotiable

Every skill terminates in concrete evidence. Tests pass. Build output is clean. The runtime trace shows the expected behaviour. A reviewer signs off. “Seems right” is never sufficient.

This is the same principle that makes Anthropic’s harness recover from failures, that makes Cursor’s planner/worker/judge split actually catch bugs, that makes any long-running agent recoverable. The agent is a generator. You need a separate signal that the work is done. Skills bake that signal into every workflow.
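As a sketch of what “terminates in concrete evidence” means mechanically: a task only counts as done when specific evidence kinds are attached. The evidence names here are hypothetical, not the project’s actual schema:

```python
# Hypothetical sketch: a task is "done" only when it carries evidence.
# The evidence kinds (tests_passed, diff_reviewed) are illustrative.
from dataclasses import dataclass, field

REQUIRED_EVIDENCE = {"tests_passed", "diff_reviewed"}

@dataclass
class TaskResult:
    summary: str
    evidence: set = field(default_factory=set)

    def is_done(self) -> bool:
        # "Seems right" never closes the loop: every required evidence
        # kind must be present before the task counts as done.
        return REQUIRED_EVIDENCE <= self.evidence

result = TaskResult(summary="Implemented feature X")
print(result.is_done())  # False: no evidence yet
result.evidence |= {"tests_passed", "diff_reviewed"}
print(result.is_done())  # True: evidence attached
```

The separation matters: the agent generates the summary, but something else, deterministic and outside the model, checks the evidence.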

4. Progressive disclosure

Do not load all twenty skills into context at session start. Activate them based on the phase. A small meta-skill (using-agent-skills) acts as a router that decides which skill applies to the current task.

This is the harness engineering lesson applied at skill granularity. Every token loaded into context degrades performance somewhere, so you load what’s relevant and leave the rest on disk. Progressive disclosure is how you get a twenty-skill library into a 5K-token slot without poisoning the well.
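A router can be very small. Here is a hypothetical sketch, with the skill names borrowed from the repo but the trigger keywords and matching logic entirely made up; real skill frontmatter is richer than a keyword list:

```python
# Hypothetical router sketch: pick which skill files to load into
# context based on the task description. Trigger keywords are invented
# for illustration; real frontmatter is richer than this.
SKILLS = {
    "test-driven-development": {"triggers": ["test", "tdd", "regression"]},
    "code-review-and-quality": {"triggers": ["review", "pull request"]},
    "api-and-interface-design": {"triggers": ["api", "endpoint", "interface"]},
}

def route(task: str) -> list[str]:
    """Return only the skills whose triggers match the task.

    Everything else stays on disk: that is the progressive
    disclosure, keeping the context window small."""
    task_lower = task.lower()
    return [
        name
        for name, meta in SKILLS.items()
        if any(trigger in task_lower for trigger in meta["triggers"])
    ]

print(route("Fix the login bug and add a test"))
# prints ['test-driven-development']
```

Only the matched skill’s markdown gets injected; the other nineteen files never cost a token.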

5. Scope discipline

The meta-skill encodes a non-negotiable I’d staple to every agent if I could: “touch only what you’re asked to touch.” Don’t refactor adjacent systems. Don’t remove code you don’t fully understand. Don’t brush against a TODO and decide to rewrite the file.

This sounds obvious until you watch an agent decide that fixing one bug requires modernizing three unrelated files. Scope discipline is the single biggest determinant of whether an agent’s PR is mergeable or has to be unwound. It’s also the principle that maps most cleanly onto Google’s code review norms, where reviewers will block a PR for doing more than one thing.


The Google DNA

The skills are saturated with practices from Software Engineering at Google and Google’s public engineering culture. This is intentional. Most of what makes Google-scale software work is documented and public, and it is exactly the part agents are most likely to skip.

A partial map of which skill encodes which practice:

  • Hyrum’s Law in api-and-interface-design. Every observable behaviour of your API will eventually be depended on by someone, so design with that in mind.
  • The test pyramid (~80/15/5) and the Beyoncé Rule in test-driven-development. “If you liked it, you should have put a test on it.” Infrastructure changes don’t catch bugs; tests do.
  • DAMP over DRY in tests. Google’s testing philosophy is explicit that test code should read like a specification even at the cost of some duplication. Over-abstracted tests are a known anti-pattern.
  • ~100-line PR sizing, with Critical / Nit / Optional / FYI severity labels in code-review-and-quality. Straight from Google’s code review norms. Big PRs don’t get reviewed; they get rubber-stamped.
  • Chesterton’s Fence in code-simplification. Don’t remove a thing until you understand why it was put there.
  • Trunk-based development and atomic commits in git-workflow-and-versioning.
  • Shift Left and feature flags in ci-cd-and-automation. Catch problems as early as possible, decouple deploy from release.
  • Code-as-liability in deprecation-and-migration. Every line you keep is one you have to maintain forever, so prefer the smaller surface.

None of these are new ideas. The point is that none of them are in the agent by default. A frontier model has read the phrase “Hyrum’s Law” in its training data, but it does not apply Hyrum’s Law when it’s designing your API at 3am. Skills are how you make sure it does.


How to actually use it

Three modes, in roughly increasing commitment.

Mode 1: install via marketplace. If you’re using Claude Code:

/plugin marketplace add addyosmani/agent-skills
/plugin install agent-skills@addy-agent-skills

You get the slash commands (/spec, /plan, /build, /test, /review, /ship, /code-simplify) and the agent activates the relevant skills automatically based on context. This is the path I’d recommend most people start on.

Mode 2: drop the markdown into your tool of choice. The skills are plain markdown with frontmatter. Cursor users put them in .cursor/rules/. Gemini CLI has its own install path. Codex, Aider, Windsurf, OpenCode, anything that accepts a system prompt can read them. The tooling matters less than the workflow underneath.

Mode 3: read them as a spec. Even if you never install anything, the skills are a documented description of what good engineering with AI agents looks like. Read code-review-and-quality.md and apply the five-axis framework to your team’s review process. Read test-driven-development.md and use it to settle the next “do we need to write the test first” argument with a junior. Read the meta-skill and steal the five non-negotiables for your own AGENTS.md.

This third mode is where I’d actually start. Pick the four or five skills closest to your current pain. Decide which workflows you want enforced. Then install the runtime, or roll your own, to do the enforcing.


What to steal even if you never install

A few patterns from the project I’d steal regardless of whether you use AI coding agents at all.

Anti-rationalization as a team practice. Write down the lies your team tells itself. “We’ll fix the tests after launch.” “This change is too small for a design doc.” “It’s fine, we have monitoring.” Pair each with the rebuttal. Put it in your AGENTS.md or your engineering wiki. It will save you arguments and it will catch the next tired Friday-afternoon shortcut.

Process over prose for anything you write internally. If you find yourself writing a 2,000-word doc titled “how we approach X” you’ve written reference material. Convert it to a workflow with checkpoints. The doc shrinks to 400 words and people actually run it. This applies as much to onboarding guides and runbooks as it does to agent skills.

Verification as a hard exit criterion. Make “produce evidence” the exit step of every task. For agents, for engineers, for yourself. Evidence is whatever proves the work is done: a green test run, a screenshot, a log, a review approval. Without it, the task is not done. “Seems right” never closes the loop.

Progressive disclosure for any rulebook. Do not write a 50-page handbook. Write a small router that points to the right small chapter for the situation. This is true for AGENTS.md, for runbooks, for incident playbooks, for anything anyone will read under time pressure.

Five non-negotiables, lifted from the meta-skill, that I’d put in any AGENTS.md tomorrow:

  1. Surface assumptions before building. Wrong assumptions held silently are the most common failure mode.
  2. Stop and ask when requirements conflict. Don’t guess.
  3. Push back when warranted. The agent (or engineer) is not a yes-machine.
  4. Prefer the boring, obvious solution. Cleverness is expensive.
  5. Touch only what you’re asked to touch.

That’s a worthwhile engineering culture in five lines, and you don’t need to install anything to adopt it.


Where this fits in the harness

In the broader picture, skills are one layer of agent harness engineering. The harness is the model plus everything you build around it; skills are the reusable workflow chunks that get progressively disclosed into the system prompt. They sit alongside AGENTS.md (the rolling rulebook), hooks (the deterministic enforcement layer), tools (the actions the agent can take), and the session log (the durable memory). Each layer has a specific job. Skills do the senior-engineer-process job.

Skills matter more for long-running agents than they do for chat-style ones, because long runs amplify every shortcut. An agent that skips the test in a 10-minute session produces one bug. An agent that skips the test in a 30-hour session produces a debugging archaeology project at the end of the run, when no one remembers what the original intent was. The longer the run, the more the senior-engineer scaffolding has to be enforced rather than suggested.

The portability of the skills format matters too. The same SKILL.md file works in Claude Code, Cursor (with rules), Gemini CLI, Codex, and any other harness that accepts system-prompt content. Write the workflow once, the runtime enforces it. That’s the thing the markdown-with-frontmatter format buys you that bespoke prompt engineering does not.


Closing

The thing I most want people to take from this project, more than the skills themselves, is the framing.

AI coding agents are extremely capable junior engineers with no instinct for the parts of the job that don’t show up in the diff. The senior-engineering work (surfacing assumptions, sizing changes, writing the spec, leaving evidence, refusing to merge what can’t be reviewed) is exactly what an agent will skip unless you make it impossible to skip. The job, increasingly, is to encode that discipline as something the agent cannot talk itself out of.

Skills are one shape of that. Anti-rationalization tables. Progressive disclosure. Process over prose. Verification as the load-bearing exit criterion. The Google practices that already work, made portable.

You can install my version. You can roll your own. The lesson stands either way: the senior-engineer parts of the job are no longer optional, even when the engineer is a model.


The repo is at github.com/addyosmani/agent-skills (MIT). For the broader scaffolding picture, see Agent Harness Engineering and Long-running Agents.
