Static analysis, dynamic analysis, and stochastic analysis
For a long time programmers have had two types of program verification tools: static analysis (like a compiler's checks) and dynamic analysis (running a test suite). I find myself using LLMs to analyze newly written code more and more. Even when they spit out a lot of false positives, I still find them a massive help. My workflow is something like this:
- Commit my changes
- Ask Claude Opus "Find problems with my latest commit"
- Look through its list and skip over the false positives.
- Fix the true positives.
- `git add -A && git commit --amend --no-edit`
- Clear Claude's context
- Back to step 2.
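The loop above can be sketched as a couple of shell functions. This is a sketch, not a definitive setup: it assumes the Claude Code CLI's one-shot print mode (`claude -p`), which starts with a fresh context on each invocation and so stands in for the "clear context" step.

```shell
#!/usr/bin/env bash
# Sketch of the commit-review loop described above.

review_commit() {
  # Ask for problems with the latest commit; each one-shot invocation
  # starts with a clean context.
  claude -p "Find problems with my latest commit"
}

amend_fixes() {
  # Fold manual fixes into the commit under review, then loop again.
  git add -A && git commit --amend --no-edit
}

# Manual loop: run review_commit, fix the true positives by hand,
# run amend_fixes, repeat until everything raised is dismissible.
```

The triage step stays manual on purpose: the human decides which findings are true positives before anything is amended.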
I repeat this loop until every issue Claude raises is dismissible. I know there are a lot of startups building SaaS products for things like this (CodeRabbit is one I've seen before; I didn't like it much), but I feel the procedure above is plenty good enough, and it catches a lot of issues that would take far longer to surface through manual testing.
It's also been productive to ask for any problems in an entire repo. It will of course never be able to perform a completely thorough review of even a modestly sized application, but highlighting any problem at all is still useful.
Someone recently mentioned to me that they use vision-capable LLMs to perform "aesthetic tests" in their CI. The model takes screenshots of each page before and after a code change and throws an error if it thinks something is wrong.
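A minimal sketch of what such a check could send to the model, assuming the Anthropic Messages API's base64 image blocks (the model name, prompt wording, and PASS/FAIL convention are all illustrative; in CI you would post this payload with the SDK and fail the build on a FAIL reply):

```python
import base64


def build_aesthetic_check(before_png: bytes, after_png: bytes) -> dict:
    """Build a Messages API request asking a vision model to compare
    before/after screenshots of a page and flag visual regressions."""

    def image_block(png: bytes) -> dict:
        # Screenshots are sent as base64-encoded image content blocks.
        return {
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/png",
                "data": base64.b64encode(png).decode("ascii"),
            },
        }

    return {
        "model": "claude-opus-4",  # placeholder model name
        "max_tokens": 512,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": (
                    "The first image is the page before a code change, the "
                    "second is after. Reply FAIL plus a short reason if "
                    "anything looks visually broken, otherwise reply PASS."
                )},
                image_block(before_png),
                image_block(after_png),
            ],
        }],
    }
```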
Haha... it really is useful. Tip: have the model write to a file with notes about skipped/dismissed items so it doesn't re-surface them on the next run.
I want it to be completely fresh on each run. I would rather re-read the same complaint 3 times than have different runs poison each other's context.
I have some common patterns for instructing Claude to output results to a file. I both agree with you that it gets a little hinky without the clear context step, and I would push back a tiny bit.
There are many useful ways of retaining the context you want while effectively ensuring the LLM dismisses what you want, without draining all the context.
Your process of clearing context is useful and good for the loop you have.
My two cents would be to consider adding another loop/pattern.
Work with Claude, or start with your own effort, to describe and summarize your codebase into a ./repo/.claude/CLAUDE.md file. When you start up Claude in ./repo, Claude will automatically include that summary file in its context. You can then potentially save a few steps each time you clear the context and ask Claude to look for problems in your commit.
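For illustration, such a summary file might look like this (the project details are entirely hypothetical; the point is to front-load what the model would otherwise rediscover on every fresh run):

```markdown
# CLAUDE.md

## What this repo is
A JSON API service. Source lives in src/, tests in tests/.

## Conventions
- Run the test suite before claiming a fix works.
- Error handling: never swallow exceptions; log and re-raise.

## Review focus
When reviewing commits, prioritize correctness and security issues
over style nits.
```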
If you end up liking that new loop, ask Claude to output a simple new skill, referenced as /verify-commit or similar. Let it write the skill and show you how to use it. Ask it which options are worth exposing as arguments to the skill up front, versus prompting you for during the run.
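As a sketch, a custom command like that is just a markdown prompt file; Claude Code reads custom slash commands from files under .claude/commands/ (check the current docs for the exact layout). The contents here are hypothetical:

```markdown
<!-- .claude/commands/verify-commit.md -->
Review the latest commit (`git show HEAD`) for bugs, regressions,
and security issues. List each finding with file, line, and severity,
and skip pure style nits.
```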
Both of these are base patterns we use hundreds of times a day across a tiny team of engineers. Really good stuff.