I have been thinking about this a lot lately.
As AI tools become part of daily development, generating code keeps getting easier, but reviewing that code properly matters as much as ever.
I want to learn how other developers handle this.
When AI generates code for you, what review process do you follow before you keep it?
I am interested in questions like:
- Do you review everything line by line?
- Do you trust AI for boilerplate only, or also for business logic?
- What do you check first: correctness, security, performance, readability, or architecture?
- Do you use a checklist?
- How do you catch subtle bugs or bad assumptions?
My rough thinking is something like this:
- understand the code fully before keeping it
- verify logic against requirements
- test happy path and edge cases
- check security and performance concerns
- refactor to match project standards
- never merge code only because “it works”
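One item above, "test happy path and edge cases", can be sketched concretely. The `parse_discount` function below is a hypothetical stand-in for a piece of AI-generated code; the point is testing the input the prompt asked for *and* the inputs it never mentioned:

```python
# Hypothetical AI-generated function under review: parses a discount
# string like "15%" into a fraction. Names are illustrative only.
def parse_discount(text: str) -> float:
    value = float(text.rstrip("%"))
    if not 0 <= value <= 100:
        raise ValueError(f"discount out of range: {value}")
    return value / 100

# Happy path: the case the AI was asked to handle.
assert parse_discount("15%") == 0.15

# Edge cases: boundaries and inputs the prompt never mentioned.
assert parse_discount("0%") == 0.0
assert parse_discount("100%") == 1.0
try:
    parse_discount("150%")  # should be rejected, not silently accepted
    assert False, "expected ValueError"
except ValueError:
    pass
```

The edge cases are where "it works" and "it is correct" diverge, which is exactly why the last checklist item matters.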
I would really like to hear practical workflows from real developers and teams.
What is your process for reviewing AI-generated code?
Top comments (16)
I find that a multi-model review run in a cycle gives meaningfully better results. E.g. say Claude wrote the code: I have Gemini, GPT, and Claude all review it independently, then synthesize the results. I keep reviewing until the review output is no longer helpful or fixing issues. My human judgment is only needed to decide at what point the review output becomes counterproductive.
That is a smart approach. Cross-reviewing AI-generated code with multiple models seems like a practical way to reduce blind spots and catch different kinds of issues.
I also liked your point that human judgment is still essential, especially in deciding when more review stops adding value.
Would you be open to sharing your actual workflow or template for this? For example, how you structure the review cycle, what you ask each model to check, and what signals tell you it is time to stop iterating.
I think that would be really useful for people trying to turn this into a repeatable process instead of doing it ad hoc.
Pretty straightforward. Here is one cycle:
GitHub repo
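The linked repo presumably automates this; as a rough illustration only, one round of the cycle might look like the sketch below. `ask` is a hypothetical callable standing in for the real provider APIs (Claude, Gemini, GPT) and is injected so the loop itself can be dry-run without any API keys:

```python
# Sketch of one multi-model review cycle, under the assumption that
# `ask(model, prompt)` wraps a real chat API. Not the commenter's code.
REVIEWERS = ["claude", "gemini", "gpt"]

def review_cycle(code, ask, still_useful, max_rounds=5):
    """Independent reviews -> synthesis -> repeat, until a human
    (the still_useful callback) decides the output stopped helping."""
    for _ in range(max_rounds):
        # 1. Each model reviews the current code independently.
        reviews = [ask(m, f"Review this code:\n{code}") for m in REVIEWERS]
        # 2. One model synthesizes the reviews into a revised version.
        code = ask(REVIEWERS[0],
                   "Apply these reviews:\n" + "\n---\n".join(reviews)
                   + f"\n\nCode:\n{code}")
        # 3. Human judgment is the only gate: stop when reviews
        #    stop surfacing real issues.
        if not still_useful(reviews):
            break
    return code

# Dry run with a canned stub instead of real API calls:
fake_ask = lambda model, prompt: f"[{model}] looks fine"
result = review_cycle("def f(): pass", fake_ask, lambda reviews: False)
```

The design point is that the stopping condition is deliberately left to a human, matching the comment above.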
Thanks for sharing this.
I will check it out. This seems interesting.
I paste the AI code into the IDE and run it. If it works, I check whether all the requested features are implemented; if it doesn't work, I regenerate it. If it works but is missing a feature, I ask for it to be completed in another chat. Once it works, I ask for explanations of the code, for errors to be identified, and for suggested improvements, again in a new chat.
My approach is a little different: I use AI primarily to speed up the writing process.
I want the AI to write code the same way I would write it, using my patterns and my mindset.
This is all documented in my instruction files and system prompt.
For example, I require the AI to follow the BMAD protocol: Instruction file example
I understand that not all developers are experienced, and many people today just want to turn an idea into a project. But I think that if you really want to be a developer, you need to stay in control of your code. If you are interested, take a look at my blog post series about my 40 years in IT: Code is craft
I have a simple workflow. After every epoch with the AI, I ask it: "What shortcuts did you take? What quickfixes did you do? What did you defer? What decisions did you make without checking in with me first?"
So far, Claude has been responding really well with "Here's an honest accounting" reply outlining all those things. After it addresses those, then I ask that question again. And so on, until things are acceptable to me.
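That retrospective loop is simple enough to script. The sketch below is an assumption about how it could be automated, not the commenter's actual setup; `ask_agent` is a hypothetical wrapper around the chat session:

```python
# Sketch of the "honest accounting" loop described above.
# ask_agent(prompt) is a hypothetical stand-in for the real chat call.
RETRO_PROMPT = (
    "What shortcuts did you take? What quickfixes did you do? "
    "What did you defer? What decisions did you make without "
    "checking in with me first?"
)

def honest_accounting_loop(ask_agent, acceptable, max_rounds=10):
    """Ask the retrospective question, have the agent address its own
    findings, and repeat until the human finds the state acceptable."""
    accounting = ""
    for _ in range(max_rounds):
        accounting = ask_agent(RETRO_PROMPT)
        if acceptable(accounting):  # human judgment call
            return accounting
        ask_agent("Address everything in that accounting.")
    return accounting
```

The `acceptable` callback is the human in the loop: the script only repeats the question, it never decides when things are good enough.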
Great questions!
In my experience, some of this can be solved effectively with automated tooling, e.g. linting, enforcing coding and naming standards, running a unit test suite, etc.
Some of it can be addressed "upstream" with more detailed, less ambiguous specification. Personally, I like to use a behavior-driven development (BDD) approach with Gherkin's Given (pre-conditions) > When (system interaction) > Then (post-conditions) syntax and successful and unsuccessful Scenarios. You can see an example of this here.
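A Gherkin scenario translates almost mechanically into a test. A minimal sketch in plain Python, with a hypothetical `withdraw` function standing in for the system under test (the Given/When/Then mapping is the point, not the banking example):

```python
# Given/When/Then scenarios expressed as plain tests. The withdraw
# function and the amounts are hypothetical, for illustration only.
def withdraw(balance: int, amount: int) -> int:
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount

def test_successful_withdrawal():
    balance = 100                    # Given: an account with balance 100
    balance = withdraw(balance, 30)  # When: the user withdraws 30
    assert balance == 70             # Then: the remaining balance is 70

def test_unsuccessful_withdrawal():
    balance = 100                    # Given: an account with balance 100
    try:
        withdraw(balance, 130)       # When: the user withdraws 130
        assert False, "expected rejection"
    except ValueError:               # Then: the withdrawal is refused
        pass

test_successful_withdrawal()
test_unsuccessful_withdrawal()
```

Writing both the successful and unsuccessful scenario up front is what removes the ambiguity for both human and AI implementers.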
All of this is great because it plays into tech's strengths and reduces the ambiguity that can plague human and AI code creation.
But validating business intent (what I think you mean by "business logic") is still the domain of people. The good news is that automating as much of the code review "grunt work" should free up time to carefully consider whether the code actually supports the business intent.
My favorite is "never merge code only because 'it works'". So important.
Keep asking these important questions and please keep sharing what you learn. Thank you!
I use a model-driven kanban board with human verification (think QA). It already includes a "deslop" step, then refactors based on embeddings that codify ideas from Robert Martin, the Unix philosophy, and functional programming paradigms, and it does a damn fine job. I review code by hand once in a while to ensure my systems work.
State machines powered by CLI calls instead of letting the model do things. I turn the model into just a "voice in the head" and let my agent do the rest (a deterministic gate).
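A rough sketch of what such a deterministic gate could look like (my assumption, not the commenter's actual agent): the model only *nominates* a transition, and plain code, with no model in it, decides whether that transition is legal.

```python
# Deterministic gate around a model. The model proposes a transition;
# the state machine alone decides whether it is allowed.
ALLOWED = {
    "drafted":  {"reviewed"},
    "reviewed": {"verified", "drafted"},  # can bounce back for rework
    "verified": {"merged"},
    "merged":   set(),
}

def step(state: str, proposed: str) -> str:
    """Apply the model's proposed transition only if the graph allows it."""
    if proposed in ALLOWED[state]:
        return proposed   # legal move: advance
    return state          # illegal move: the gate holds, state unchanged

# The model might propose anything; the gate stays deterministic.
state = "drafted"
state = step(state, "merged")    # rejected: drafted -> merged not allowed
state = step(state, "reviewed")  # accepted
```

Because the transition table is plain data, the model cannot skip verification no matter what it outputs.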
I’m currently working on an early-stage project involving autonomous transaction flows between systems (Mindchain).
One thing that became clear is that AI-generated code needs to be treated as “untrusted by default”.
Even in a simple MVP, I found it useful to:
Curious to see how others are approaching this, especially as systems become more autonomous.
My process for reviewing AI-generated code starts with the critical sections, like business logic and data handling. I thoroughly test the code with both normal and edge cases and question any assumptions it makes. I don't trust AI blindly; I review complex and sensitive logic closely.
It looks like this; verification can be automated or require a human.