I’ve spent the last eight years in product operations and data analysis, often working with teams in the Belgrade startup ecosystem. If there is one thing I’ve learned, it’s that an investment thesis is rarely killed by bad data. It’s killed by the ego of the person who wrote it. You fall in love with the TAM, you gloss over the churn metrics, and you ignore the competitive moat—or lack thereof.
Most investors treat AI as an oracle. They ask GPT or Claude, "What do you think of this thesis?" and take the output as gospel. That is not decision intelligence; that is an echo chamber. To do high-stakes work, you need structured disagreement. You need to orchestrate multiple models to hunt for your blind spots.
The Fallacy of the Single-Model Analysis
When you use one model to validate your thesis, you get a reflection of that model's training bias. GPT-4o might favor structure and process-heavy arguments. Claude 3.5 Sonnet might lean into nuance and creative synthesis. If you only talk to one, you aren’t testing your thesis; you’re just training your confirmation bias.
Multi-model orchestration isn't just a buzzword. It’s the practice of running your thesis through two distinct "cognitive architectures" simultaneously. When those two models disagree, that is where the value lives. That tension is where you find the risk factors that could cost you your shirt.
The "Founded Date" Problem: When Data Obfuscation Kills Accuracy
Before you run your thesis through any model, you have to acknowledge a fundamental truth: AI tools do not have real-time access to the private data rooms you’re likely looking at. Worse, they often struggle with public data hygiene.
Take Crunchbase, for example. If you are scraping or asking an LLM to look at a URL, you will eventually hit the "Founded Date" wall. In many cases, the founding date is obfuscated or formatted inconsistently on the front-facing page. If you are relying on Crunchbase Pro data, you are likely looking at a refined, verified timestamp. But if your agent is just scraping the public-facing UI, it might be hallucinating the company’s age based on its first funding round or a social media profile.

Never assume the AI knows when a company started. If your thesis relies on "first-mover advantage," verify the founding date manually. If you don't, you’re building your risk assessment on a foundation of sand.
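One way to make that manual check routine is to write down the founding date each source reports and flag any disagreement before the thesis goes anywhere near a model. Here is a minimal sketch. The source labels, the example dates, and the one-year tolerance are my own illustrative assumptions, not fields from Crunchbase or any other tool.

```python
from datetime import date


def check_founded_dates(claims: dict[str, date], tolerance_days: int = 365) -> dict:
    """Compare founding dates reported by different sources.

    `claims` maps a hypothetical source label to the date that source reports.
    `tolerance_days` is an assumed threshold; tune it to your own thesis.
    """
    if len(claims) < 2:
        # A single source cannot be cross-checked against anything.
        return {"status": "unverified", "spread_days": None}
    earliest = min(claims.values())
    latest = max(claims.values())
    spread = (latest - earliest).days
    status = "consistent" if spread <= tolerance_days else "conflict"
    return {"status": status, "spread_days": spread}


# Illustrative values only: the scraped date may really be the first funding round.
claims = {
    "company_registry": date(2017, 3, 1),
    "scraped_profile": date(2019, 6, 15),
}
result = check_founded_dates(claims)
print(result)
```

A "conflict" result does not tell you which date is right. It only tells you that any first-mover argument needs a manual look first.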
The Framework: Structured Thesis Critique
To move from "AI as a toy" to "AI as an analyst," you need to stop asking "what do you think." Start using targeted thesis critique prompts, bear case prompts, and assumption stress tests.
I recommend using Suprmind or a similar orchestration layer to route these tasks to both GPT and Claude at the same time. Here is the structured approach I use to break my own logic.
1. The Assumption Stress Test
Break your thesis into its constituent parts. Don't ask the model to review the whole thing at once. Force it to decompose your claims.
Prompt Template:
"I am analyzing [Company Name]. My core thesis is: [Insert Thesis]. Break this thesis into five distinct underlying assumptions (e.g., market growth, retention, CAC/LTV). For each, identify a data point that, if proven wrong, would invalidate the entire thesis."
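If you reuse this template across deals, it helps to fill it programmatically so a blank field never reaches the model. A minimal sketch follows. The `build_prompt` helper and the example company are hypothetical names of my own, and the bracketed placeholders are rewritten as Python format fields.

```python
ASSUMPTION_STRESS_TEST = (
    "I am analyzing {company}. My core thesis is: {thesis}. "
    "Break this thesis into five distinct underlying assumptions "
    "(e.g., market growth, retention, CAC/LTV). For each, identify a data "
    "point that, if proven wrong, would invalidate the entire thesis."
)


def build_prompt(template: str, **fields: str) -> str:
    """Fill a prompt template, refusing empty or whitespace-only fields."""
    empty = [name for name, value in fields.items() if not value.strip()]
    if empty:
        raise ValueError(f"Empty prompt fields: {', '.join(empty)}")
    # str.format raises KeyError if the template needs a field that was not passed.
    return template.format(**fields)


# Hypothetical company and thesis, for illustration only.
prompt = build_prompt(
    ASSUMPTION_STRESS_TEST,
    company="ExampleCo",
    thesis="ExampleCo will win mid-market logistics through lower CAC",
)
print(prompt)
```

The same helper works for the other templates in this framework. Only the template string changes.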
2. The Bear Case Prompt
Most analysts are too polite. You need the model to be aggressive. You aren't looking for polite feedback; you are looking for the kill switch.
Prompt Template:
"Assume the role of a short-seller who wants this company to fail. Build a bear case based on current market trends and the company’s publicly available metrics from Crunchbase. Focus on why the current valuation is decoupled from reality. What specific risks are we ignoring?"
3. Disagreement Detection
This is where multi-model orchestration wins. By asking both models to critique the same thesis, you create a dataset of disagreements.
Prompt Template:

"Critique the following thesis. [Insert Thesis]. After you provide your critique, I will run the same prompt through a different model. Your goal is to identify the most significant vulnerability in this argument."
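If you are not using an orchestration layer, the fan-out itself is simple to sketch: send the identical prompt to each model in parallel and collect the replies side by side. The example below is an assumption-heavy illustration. The two `stub_*` functions stand in for real API clients and return canned text, so no vendor SDK calls are shown or implied.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# A "model" here is any function that takes a prompt and returns text.
ModelFn = Callable[[str], str]


def fan_out(prompt: str, models: dict[str, ModelFn]) -> dict[str, str]:
    """Send the identical prompt to every model in parallel; collect replies by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        futures = {name: pool.submit(ask, prompt) for name, ask in models.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for real clients; replace with your own wrappers.
def stub_gpt(prompt: str) -> str:
    return "Most significant vulnerability: CAC payback assumes flat ad costs."


def stub_claude(prompt: str) -> str:
    return "Most significant vulnerability: retention data covers a single cohort."


replies = fan_out(
    "Critique the following thesis. [Insert Thesis].",
    {"gpt": stub_gpt, "claude": stub_claude},
)
for name, text in replies.items():
    print(f"[{name}] {text}")
```

Keeping the prompt byte-for-byte identical across models matters. If the wording differs, you can no longer tell whether a disagreement comes from the models or from the prompts.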
Comparative Analysis: Orchestrating the Output
When you orchestrate these models, you need a way to visualize the output. I keep a simple matrix to map their responses. This forces me to look at the divergence between them.
| Model | Primary Focus | Risk Sensitivity | Hallucination Risk |
| --- | --- | --- | --- |
| GPT-4o | Structural Logic | High (on process) | Moderate (prefers data over facts) |
| Claude 3.5 Sonnet | Nuance/Context | High (on human factors) | Low (higher coherence) |

If GPT tells you the thesis is solid on unit economics, but Claude warns that the "founded date" data suggests the product hasn't survived a full market cycle, you have a signal. You don't have a definitive answer, but you have a path for due diligence.
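To keep that divergence from getting lost in two long responses, I find it useful to reduce each model's answer to a verdict per dimension and list only the dimensions where they differ. This is a minimal sketch. The dimension names and the verdict labels are my own hypothetical coding scheme, filled in by hand after reading each response.

```python
def find_divergences(verdicts: dict[str, dict[str, str]]) -> list[str]:
    """Return the dimensions on which the models' verdicts differ.

    `verdicts` maps model name -> {dimension: verdict}. A dimension that a
    model did not address is treated as "no opinion".
    """
    dimensions = sorted(set().union(*(v.keys() for v in verdicts.values())))
    diverging = []
    for dim in dimensions:
        answers = {v.get(dim, "no opinion") for v in verdicts.values()}
        if len(answers) > 1:
            diverging.append(dim)
    return diverging


# Hand-coded verdicts mirroring the scenario described above (illustrative only).
verdicts = {
    "gpt": {"unit_economics": "solid", "market_cycle": "solid"},
    "claude": {"unit_economics": "solid", "market_cycle": "risk"},
}
to_investigate = find_divergences(verdicts)
print(to_investigate)
```

The output is a due diligence to-do list, not a ruling on which model is right.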
Why You Must Account for "Unknowns"
There are things these models simply cannot see. They cannot see the internal team friction at the startup you’re looking at. They cannot see the email from a CTO saying they are burning out. They cannot see the true state of a pivot that isn't yet announced.
Stop asking the AI to tell you if a company is a good investment. It is a tool for logic testing, not a financial advisor. Use it to find where your logic is thin. Use it to find where your bear case is weak. Use it to surface questions you were too lazy to ask.
Final Thoughts: The "Ops" Mindset
In Belgrade, the startup culture is lean. We don't have the luxury of spending weeks on a deal that dies in the final hour. Using tools like Suprmind to orchestrate GPT and Claude allows me to perform 80% of the initial due diligence in minutes rather than hours.
However, the moment you stop questioning the model is the moment you become the product. Never accept the summary. Always ask for the source—and if the source is an obfuscated field on a public profile page, disregard the output entirely.
Go back to your thesis. Break it. If it survives a multi-model bear case, maybe—just maybe—you have something worth putting capital into.