AI
This is Why AI Could Replace Programmers
- What makes you better than another terminal window? You’re competing with RAM and GPUs if you don’t understand the code.
- The differentiating factor between you and another person with a terminal window is knowledge; AI is a speed-up, not a substitute for knowing the code.
- If you know the language and how it’s used, you’ll be in a much better place when using LLMs.
- Now is the most important time to learn.
- The tool is fast and efficient, but it's not smart, and it isn't capable or trustworthy enough to ship code on its own.
- LLMs are best at React, Node, Tailwind, Python.
- Work in small batches, don’t try to zero-shot things.
Everything We Got Wrong About Research-Plan-Implement
- Fair warning: this talk is not very cohesive; it should probably have been 60 minutes rather than 30.
- Don’t give AI too many instructions because it will not be able to faithfully honor them all.
- 10x speed doesn’t matter if it’s slop. Shoot for 2-3x.
- Avoid too many MCPs and tools; they fill up the context with instructions for stuff you may not use.
- RPI -> question, research, design, structure, plan, work tree, implement, PR. He calls this QRSPI (pronounced “crispy”).
- Alignment is the first section (question, research, design, structure, plan), and execution is the second one (work tree, implement, PR).
- There’s more focus on getting the planning correct before AI writes tons of code that isn’t aligned.
- The video was very hand-wavy and the diagrams were difficult to read (if readable at all). Based on this other writeup, here is what I think this means. Each step gets its own context window. Use HITL to course correct after each step.
- Question — give the ticket and codebase to AI so that it can come up with targeted questions it needs answers to before it can proceed (i.e., identify what the agent doesn’t know).
- Research — give it only the questions from the Q step (hiding the ticket) and have it produce a record of what the code does (no planning, no recommendations, no opinions). “Work back and forth with me, starting with your open questions and outline before writing the plan.”
- Design discussion — give AI the research and ask it to “brain dump” a ~200-line markdown file with current state, desired state, patterns to follow, and design decisions. This step answers “where we’re going.”
- Structure outline — define “how we get there” by breaking things down into tasks. This is not the exact code to write, but an overall order of the changes and how to test things along the way. Make it work in small vertical slices, because otherwise models seem to like making horizontal plans — e.g., do all the DB stuff, then all the service stuff.
- Plan — the full implementation details based on the structure outline
- Work tree — AI organizes tasks into a hierarchy where each branch maps to a testable unit of work
- Implement — AI agents write the code
- Pull request — a human reviews the code (not optional)
- Craft is about having high standards and being proud of your work
- Care whether others get value out of your work
- Avoid getting into the race of producing low-quality work the fastest
- Are you serving the user or yourself (or your company)?
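The QRSPI steps above could be sketched as a pipeline where each step runs in a fresh context window with a human-in-the-loop checkpoint in between. This is my own sketch of the idea, not code from the talk; `ask_llm` is a hypothetical stand-in for whatever agent call you actually use.

```python
# Hypothetical sketch of the QRSPI alignment/execution pipeline.
# ask_llm is a placeholder for a real agent call; each invocation
# represents a fresh context window, and review() is the HITL checkpoint.

def ask_llm(prompt: str, context: str) -> str:
    """Placeholder: call your LLM/agent of choice here."""
    return f"[llm output for: {prompt[:40]}]"

def review(artifact: str) -> str:
    """Human-in-the-loop checkpoint: inspect and course-correct."""
    return artifact  # in practice, a human edits/approves this

def qrspi(ticket: str, codebase: str) -> str:
    # Alignment phase: each step sees only the previous step's artifact.
    questions = review(ask_llm("List the questions you must answer before proceeding.", ticket + codebase))
    research = review(ask_llm("Record what the code does. No planning, no opinions.", questions + codebase))
    design = review(ask_llm("Brain-dump ~200 lines: current state, desired state, patterns, decisions.", research))
    structure = review(ask_llm("Break the design into small vertical slices, each testable.", design))
    plan = review(ask_llm("Write full implementation details for each slice.", structure))
    # Execution phase.
    code = ask_llm("Implement the plan.", plan)
    return review(code)  # pull request: human review is not optional
```

The point of the structure is that no single prompt carries the whole ticket end to end; each context window gets only what it needs from the previous step.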
The 7 Skills You Need to Build AI Agents
- Prompt engineering made sense two years ago — get the model to respond. Now agents can do things.
- The prompt is the recipe, but the agent is the chef.
- Skill 1: system design (LLM, tools, state management, subagents).
- Skill 2: tool and contract design (agents use tools, tools have contracts). Those contracts need to be precise, otherwise you get hallucinations.
- Skill 3: retrieval engineering (use RAG to get relevant docs, chunks, embedding models, re-ranking).
- Skill 4: reliability engineering because agents use tools that fail (retry logic, timeouts, fallback paths, circuit breakers)
- Skill 5: security and safety (prompt injections, least privilege (permission boundaries), input validation, output filters).
- Skill 6: evaluation and observability (tracing, test cases with known answers, metrics, automated tests to catch regressions)
- Skill 7: product thinking (agents serve humans, user experience design for unpredictable systems)
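Skills 2 and 4 combine naturally in a small sketch: a tool with a precise, typed contract, wrapped in retry logic with exponential backoff and a fallback path. All names here (`WeatherRequest`, `get_weather`, `call_with_retry`) are illustrative, not from the talk.

```python
import time
from dataclasses import dataclass

# Skill 2: a precise tool contract. The agent sees exactly what goes in
# and what comes out, leaving less room for hallucinated arguments.
@dataclass
class WeatherRequest:
    city: str          # required, non-empty
    units: str = "C"   # "C" or "F" only

@dataclass
class WeatherResponse:
    city: str
    temperature: float
    units: str

def get_weather(req: WeatherRequest) -> WeatherResponse:
    if not req.city:
        raise ValueError("city must be non-empty")
    if req.units not in ("C", "F"):
        raise ValueError("units must be 'C' or 'F'")
    # Stubbed lookup; a real tool would call an external API here.
    return WeatherResponse(city=req.city, temperature=21.0, units=req.units)

# Skill 4: reliability engineering around a tool that may fail.
def call_with_retry(fn, req, retries=3, backoff=0.1, fallback=None):
    for attempt in range(retries):
        try:
            return fn(req)
        except Exception:
            if attempt == retries - 1:
                return fallback  # fallback path instead of crashing the agent
            time.sleep(backoff * (2 ** attempt))  # exponential backoff
```

Validating inputs inside the tool (rather than trusting the agent) is also a piece of skill 5: the contract is the permission boundary.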
Testing
100% Test Coverage is a LIE, Here’s Why
- Automated functional tests give us feedback and confidence we can ship it.
- If tests are good and 0% coverage is bad, then why shouldn’t we aim for 100%?
- 1: Test coverage can struggle to take into account multiple Boolean predicates on a single line. (False negative.)
- 2: If a line is considered covered, it may not be covered for all circumstances. (False negative.)
- Test coverage is a metric, so you get what you measure. You can have a test suite with 100% coverage that fails to give you fast feedback (true negatives).
- E2E tests tend to cover lots of parts of the app, so this could bump up coverage. Cons: slower, harder to isolate where the real issue is if it fails.
- Test coverage isn’t entirely useless. Seeing it trend down could signal other scenarios (e.g., removing slow tests in areas that aren’t important (OK), commenting out flaky tests (not OK)).
- Another idea is focusing on coverage for unit tests.
- See also: mutation testing. Compute-expensive to run, though.
- See also: property testing (creating boundary tests based on the types of inputs). Maybe this is similar to combination testing?
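The multi-predicate problem from the first two points can be shown concretely: a single test executes the line and line coverage marks it 100% covered, yet one side of the Boolean expression is never evaluated. A minimal sketch (not from the talk):

```python
def can_ship(tests_pass: bool, approved: bool) -> bool:
    # Line coverage marks this line covered after a single call,
    # even if only one of the two predicates was actually evaluated.
    return tests_pass and approved

# One test: line coverage now reports 100% for can_ship...
assert can_ship(False, True) is False
# ...but because `and` short-circuits, `approved` was never evaluated,
# so a bug in its handling would slip through. Branch/condition coverage
# (or mutation testing) would flag the missing cases:
assert can_ship(True, False) is False
assert can_ship(True, True) is True
```

This is why tools like coverage.py offer branch coverage as a separate, stricter mode than line coverage.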