When Five AIs Agree on Something Impossible
I ran a multi-model consortium for revenue planning and every AI proposed the same play: crop insurance for cannabis. The problem? Cannabis is federally Schedule I. Crop insurance for cannabis does not exist.
I run a multi-model consortium. Five LLMs, same prompt, independent responses. The idea is simple: if multiple models converge on the same answer, you've got signal. If they diverge, you dig deeper.
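The mechanics are unglamorous: fan the same prompt out to every model, collect the answers, and read them side by side. Here's a minimal sketch of that loop, assuming a hypothetical call_model adapter and placeholder model names in place of my actual stack:

```python
# A minimal sketch of the fan-out, not my actual stack.
from concurrent.futures import ThreadPoolExecutor

MODELS = ["model-a", "model-b", "model-c", "model-d", "model-e"]  # placeholder names

def call_model(model: str, prompt: str) -> str:
    # Hypothetical adapter: replace this stub with a real call to each provider's SDK.
    return f"[{model}] stubbed response to: {prompt[:40]}..."

def run_consortium(prompt: str) -> dict[str, str]:
    # The same prompt goes to every model independently; no model sees another's answer.
    with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
        futures = {m: pool.submit(call_model, m, prompt) for m in MODELS}
        return {m: f.result() for m, f in futures.items()}

if __name__ == "__main__":
    answers = run_consortium(
        "Generate revenue expansion plays for a cannabis cultivation technology platform."
    )
    for model, answer in answers.items():
        print(f"--- {model} ---\n{answer}\n")
```

The convergence check at the end is deliberately manual: I read the five answers myself rather than scoring agreement automatically, for reasons that will become obvious.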
Last week I asked the consortium to generate revenue expansion plays for a cultivation technology company in cannabis. Standard strategic exercise. What new revenue streams could the platform unlock?
All five models, independently, proposed the same thing: insurance partnerships. Use sensor data to reduce crop insurance premiums. Partner with underwriters. Build an InsurTech vertical.
Sounds smart. Sounds like convergence. Sounds like signal.
There's one problem. Cannabis is federally classified as Schedule I in the United States. Crop insurance does not exist for cannabis. You cannot insure an illegal crop. Full stop.
Consensus is not correctness
Five models trained on the same internet arrived at the same wrong answer. That's not a coincidence. It's a feature of how these systems work.
LLMs are pattern matchers trained on text. The pattern "sensor data + agriculture + insurance" is well-represented in their training data because it's a real and profitable play in legal agriculture. The models don't know that cannabis occupies a unique regulatory position where this pattern breaks. They see the structural similarity and run with it.
The result feels authoritative. Five independent models agreeing creates a strong consensus signal. In most contexts, that kind of convergence would increase your confidence. Here, it should have decreased it, because the models are all drawing from the same well of training data, which means they share the same blind spots.
The domain expertise problem
This is the part that should concern anyone using AI for strategic planning.
The models didn't hedge. They didn't flag the regulatory complexity. They presented insurance partnerships as a viable, even obvious, revenue play. If I hadn't spent years operating in cannabis, if I hadn't personally dealt with the insurance gap that every cultivator faces, I would have taken this output and started building a business case around it.
That's the danger. AI is most confidently wrong about things that look structurally similar to things it knows but operate under different rules. Cannabis insurance looks like agricultural insurance. SaaS pricing in cannabis looks like SaaS pricing anywhere else, until you account for 280E tax treatment. Market expansion looks straightforward until you realize every state is a separate regulatory regime.
The model doesn't know what it doesn't know. And when five models don't know the same thing, the consensus feels like validation.
What I actually do about it
I recorded this as a lesson in Aianna, the memory system that sits underneath my AI stack. Every future prompt that touches cannabis revenue planning will have this lesson injected into context: "Cannabis cannot get crop insurance. Federal Schedule I status. Do not include insurance-related revenue plays."
That's the value of a memory layer. The LLM forgets. The system doesn't.
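In spirit, the mechanism is nothing exotic: store the lesson, match it against the topic of an incoming prompt, and prepend it as a hard constraint before any model sees the question. A rough sketch of that step follows; the keyword matching is a stand-in for whatever retrieval the real memory layer does, not Aianna's actual API:

```python
# A rough sketch of lesson injection, not Aianna's actual API.
LESSONS = [
    {
        "topics": {"cannabis", "insurance", "revenue"},
        "lesson": (
            "Cannabis cannot get crop insurance. Federal Schedule I status. "
            "Do not include insurance-related revenue plays."
        ),
    },
]

def inject_lessons(prompt: str) -> str:
    # If any recorded topic appears in the prompt, prepend the matching lessons as constraints.
    prompt_lower = prompt.lower()
    matched = [entry["lesson"] for entry in LESSONS
               if any(topic in prompt_lower for topic in entry["topics"])]
    if not matched:
        return prompt
    preamble = "Known constraints from recorded lessons:\n- " + "\n- ".join(matched)
    return f"{preamble}\n\n{prompt}"

print(inject_lessons("Generate revenue expansion plays for a cannabis cultivation platform."))
```

The point isn't the matching logic. It's that the constraint lives outside any single model's context window, so it survives the next session, the next model swap, and the next person who runs the same prompt.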
But more importantly, this experience reinforced something I already believed: AI is a force multiplier for domain expertise, not a replacement for it. The consortium is useful precisely because I can catch the errors. An operator who doesn't know the domain would take the consensus at face value and waste months chasing a play that can't exist.
The real takeaway
If you're using LLMs for strategic planning, multi-model consensus gives you one thing: a hypothesis. It does not give you a conclusion.
The models converge on patterns from training data. When your domain has regulatory, structural, or economic realities that diverge from common patterns, the consensus will be wrong. And it will be wrong confidently, unanimously, and convincingly.
Your job as the operator is to be the filter. The AI generates. You validate. When all five models agree, that's the moment to ask the hardest question: what do they all not know?