I had a refactor I'd been sitting on. The scope was clear enough to feel large, not clear enough to start. Today I tried something different.

I started by jotting down my ideas in a rough notes document, just stream of consciousness, no structure. Then I asked Claude to look over the codebase and use those notes to build a brainstorm.md with three sections: Current State, New Ideas, and Open Questions. The goal was to have something concrete I could share and get feedback on.

Then I uploaded that document to Gemini, ChatGPT, and Grok — one at a time — and asked each for their thoughts. After each response, I pasted the feedback back into the document before moving to the next model. By the end, brainstorm.md had accumulated a layer of commentary from three different sources.

Finally, I brought the whole thing back to Claude Opus and asked it to distill everything into a proper feature proposal.

The most interesting part was how different each model's feedback was. Gemini was surprisingly terse, less useful than I expected given it's usually pretty strong at this kind of analysis. ChatGPT was verbose, but the substance was there if you dug through it. Grok was the standout: direct, useful feedback, and it came back with questions, which pushed me to think harder about the parts I'd glossed over.

The workflow is rough. The prompt I used at each stage was pretty generic:

"I have uploaded a document which has brainstorming ideas for a refactor and feedback from other models. Can you offer your thoughts and critiques on the ideas in this document."

I've been using Vercel's AI SDK in another app I'm building, and I'm thinking about turning this process into a skill: the compounding benefit of each model building on the previous one's feedback, plus the variety of perspectives, might be worth pursuing.
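If I do build it, the core loop is simple enough to sketch now. This is a rough, hypothetical outline, not working code: `accumulateFeedback` and the `AskModel` type are names I'm making up here, and the model call is abstracted behind a function so the loop itself is the only thing being shown.

```typescript
// Hypothetical sketch of the round-robin feedback loop described above.
// AskModel stands in for a real model call; with the AI SDK each one
// could wrap generateText() with a different provider.
type AskModel = (prompt: string) => Promise<string>;

// The same generic prompt I used at each stage.
const REVIEW_PROMPT =
  "I have uploaded a document which has brainstorming ideas for a refactor " +
  "and feedback from other models. Can you offer your thoughts and critiques " +
  "on the ideas in this document.";

// Run the document past each model in turn, appending each reply before
// moving on, so later models see all of the accumulated commentary.
async function accumulateFeedback(
  doc: string,
  reviewers: { name: string; ask: AskModel }[]
): Promise<string> {
  let current = doc;
  for (const { name, ask } of reviewers) {
    const reply = await ask(`${REVIEW_PROMPT}\n\n${current}`);
    current += `\n\n## Feedback from ${name}\n\n${reply}`;
  }
  return current;
}
```

In the real version, each `ask` would presumably wrap something like the SDK's `generateText({ model, prompt })` with a different provider per reviewer, and a final call to Opus would do the distillation step.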