Investing with an AI agent is one step, not the foundation
An AI agent like ChatGPT is one step of investing, and it is the chaotic one. It cannot tell true from imagined. The foundation underneath it has to be deterministic.
One step, not the whole process
Open ChatGPT, paste in a company name and ask it to walk you through the business. Within seconds you have a tidy summary, a list of risks and a view on valuation. For a lot of investors this has quietly become the first step of research. It feels like having a junior analyst on call at every hour, one that has read more than any person could and never tires of your questions.
That feeling is mostly earned, and it is also the trap. An AI agent is genuinely useful, but it is only one step of investing, and on its own it is the least reliable step. The value never comes from the agent alone. It comes from the foundation you build around it and the control you keep over it. Lose that control and the most fluent tool you have ever used becomes the most dangerous one.
The decision has two layers: the AI agent as one improvised step on top, and a deterministic foundation underneath it.
How these agents actually work
It helps to know what is happening under the surface, because the mechanism explains everything that follows. A system like ChatGPT is trained on an enormous amount of text. During training it learns one core skill: predict the next small piece of text (a token) given everything written so far. That is the whole engine. When you ask a question, the agent does not look the answer up in a database and read it back to you. It generates a reply one piece at a time, each piece chosen because it is statistically likely to follow what came before.
Two things follow from this, and both matter. First, there is no separate step where the system checks whether the result is true. Fluency is the product. Truth is a frequent by-product, not a guarantee. Second, the process is probabilistic, not deterministic. Ask the same question twice and you can get two different answers. A deterministic tool returns the same output every time you give it the same input. An AI agent does not, and that single difference is the reason it can never be the ground you stand on.
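A toy sketch can make both points concrete. It is not how ChatGPT is built. The word table and its weights are invented for illustration, and the names generate and growth_rate are hypothetical. It only shows the shape of the mechanism: text is assembled by sampling a likely next word, nothing in the loop checks whether the sentence is true, and repeated runs can disagree, while an ordinary calculation cannot.

```python
import random

# Hypothetical toy "model": for each word, the possible next words and their weights.
# A real system learns vastly more statistics; these values are made up for illustration.
NEXT_WORD = {
    "revenue": [("grew", 0.6), ("fell", 0.3), ("was", 0.1)],
    "grew": [("12%", 0.5), ("8%", 0.3), ("sharply", 0.2)],
    "fell": [("3%", 0.6), ("sharply", 0.4)],
    "was": [("flat", 1.0)],
}


def generate(start: str, rng: random.Random, max_words: int = 3) -> str:
    """Build a sentence one word at a time by sampling a statistically likely next word."""
    words = [start]
    while len(words) < max_words and words[-1] in NEXT_WORD:
        options, weights = zip(*NEXT_WORD[words[-1]])
        # Note what is missing here: no step asks whether the result is true.
        words.append(rng.choices(options, weights=weights, k=1)[0])
    return " ".join(words)


def growth_rate(current: float, previous: float) -> float:
    """A deterministic calculation: the same inputs give the same output every time."""
    return (current - previous) / previous


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so each run can differ
    for _ in range(3):
        print(generate("revenue", rng))  # same question, possibly different answers
    print(growth_rate(112.0, 100.0))     # same inputs, same answer
```

Every sentence the sampler produces is equally fluent, whichever number it happens to pick. Only the second function can serve as something to check against.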
What they do well
Used as an assistant, an AI agent is excellent at a real set of tasks.
- Explaining concepts. Ask it what a term means, how a metric is built or why two ideas differ, and you get a clear, patient explanation at whatever depth you want. It is a strong teacher for the parts of investing that are settled knowledge.
- Summarizing and structuring. Hand it a long document, a transcript or a messy set of notes and it will compress and organize them. It turns a wall of text into something you can scan.
- Drafting and brainstorming. It is fast at first drafts, checklists and lists of things you might have missed. A blank page becomes a starting point.
- Thinking out loud with you. It is a tireless sounding board. You can test a line of reasoning, ask for the other side of an argument and pressure-test your own view, at any hour, without judgment.
Notice the common thread. The agent is strongest when the task is about language, structure and widely known ideas, and when you remain the one who checks the output.
It cannot tell true from imagined
Now the hard part. The biggest problem with an AI agent is not that it is sometimes wrong. Every source is sometimes wrong. The problem is that the machine does not distinguish true from false, or real from imagined. To the agent, a real revenue number and an invented one are the same kind of thing, a plausible piece of text to place next. That is why it cannot warn you. It has nothing to warn you with.
So when it knows, it sounds calm and specific, and when it does not know, it also sounds calm and specific. The wording, the structure and the confidence are identical. A human expert usually signals doubt. They hedge, they slow down, they say "I would have to check that". An AI agent rarely does this on its own. It will hand you a precise-looking number, a clean quote or an exact date in the same steady voice it uses for things it has seen ten thousand times. The error arrives wearing the same face as the truth.
This is measurable, not a hunch. A 2024 Stanford study put more than 800,000 verifiable legal questions to general-purpose systems. The systems returned a made-up answer between 58% and 88% of the time and, just as telling, could not reliably sense when they were guessing. Law is only the most studied case. The traits that trip these systems up (dense terminology, frequent updates and narrow facts that barely appear in their training data) describe finance just as well. OpenAI, the maker of ChatGPT, points the same way in its own published testing. On open questions about real people and facts, its recent systems gave a wrong answer between 16% and 33% of the time.
In investing this lands exactly where it hurts most.
- Recent prices and events. Unless it is actively using a live tool, the agent's knowledge has a cutoff. It may confidently describe a "current" price or a "latest" quarter that is months or years stale.
- Specific numbers. Revenue, margins, ratios, dates. These are the easiest things to get subtly wrong and the hardest to catch, because a wrong number looks exactly like a right one.
- Niche and small securities. The less that has been written about a company, the more the agent has to invent, and the invention sounds just as fluent as the facts.
- Sources and quotes. It can produce a citation, a study or a management quote that looks completely real and simply does not exist.
Who controls the context
Here is the part most people miss. What you get out of an AI agent depends almost entirely on the context you put in, and most people do not know what to put in. They ask a vague question, leave out the information that matters and accept whatever framing the machine offers back. They are not controlling the context. The context is controlling them.
This is where professionalism comes in, and it is worth being precise about what professionalism means here. It is not experience or confidence or a finance degree. It is the discipline of staying in charge of the exchange. The professional decides what information goes in, what question actually gets asked, what counts as an acceptable answer and what has to be checked before it is believed. The amateur lets the machine set all of those terms without noticing that a choice was even made.
When the context controls you, the machine takes charge. And a machine in charge is dangerous for one specific reason. It does not know what it does not know. It will lead you somewhere with complete confidence and no sense of whether the ground under that path is solid or imagined. You followed it because it sounded sure. It always sounds sure.
The same Stanford study found a second trap that matters even more here. The systems tended to accept a wrong assumption baked into the question rather than correct it. Ask something built on a false premise and you often get a confident answer built on the same false premise. If you do not control what goes in, your mistake goes in with it and comes back wearing the machine's authority.
The habit that keeps you in charge
Staying in control is not a talent. It is a procedure, and the procedure is simple.
- Ask narrow questions, not broad ones. A broad question ("tell me about this company") hands the agent maximum room to fill gaps. A narrow question ("what revenue did this company report in its last annual filing") gives you something specific you can go and verify. Specific questions produce checkable answers.
- Go step by step. Break the work into stages and confirm each one before moving on. Gather the business description, then the numbers, then the risks, then the comparison. Small steps keep you in charge of the context and make it obvious where an answer went wrong.
- Separate gathering from judging. Use the agent first to gather and organize information. Then verify that information yourself against the primary source, the company filing, the exchange or a data provider you trust. Only once the facts are confirmed should you move to analysis. Never analyze on top of numbers you have not checked.
- Make it show its work. Ask how it knows, ask for the source and ask what would change its answer. Treat a vague or missing source as a warning, not a detail. If it cannot tell you where a number came from, assume it may have invented it.
- Analyze together, but you decide. Once the facts are verified, the agent becomes valuable again as a thinking partner. Talk through what the numbers mean, ask it for the opposing view and let it stress-test your reasoning. Keep the final judgment yours. A fluent paragraph is not a decision.
Every one of these habits does the same job. It keeps you, not the machine, in control of the context.
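The "separate gathering from judging" habit can be sketched as a small check that runs before any analysis. This is a minimal illustration under stated assumptions, not a finished tool. The Claim fields, the check_claims function, the labels and the 0.5% tolerance are all hypothetical choices, and the verified figures are ones you copy by hand from the primary source.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Claim:
    """One figure reported by the agent. Field names are hypothetical."""
    metric: str
    value: float
    source: Optional[str]  # where the agent says the number came from, if anywhere


def check_claims(claims: List[Claim], verified: Dict[str, float],
                 tolerance: float = 0.005) -> Dict[str, str]:
    """Label each claim by comparing it with figures you looked up yourself.

    `verified` maps a metric name to the value taken from the primary source.
    `tolerance` is a relative tolerance (0.005 = 0.5%), an arbitrary choice.
    """
    report = {}
    for claim in claims:
        if claim.metric not in verified:
            report[claim.metric] = "unverified"  # do not analyze on top of this yet
            continue
        truth = verified[claim.metric]
        scale = max(abs(truth), 1e-12)  # avoid dividing by zero
        if abs(claim.value - truth) / scale > tolerance:
            report[claim.metric] = "mismatch"
        elif not claim.source:
            report[claim.metric] = "matches, but no source given"
        else:
            report[claim.metric] = "confirmed"
    return report


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not real company data.
    agent_claims = [
        Claim("revenue", 1200.0, "annual report"),
        Claim("net_income", 150.0, None),
        Claim("employees", 5000.0, None),
    ]
    from_filing = {"revenue": 1200.0, "net_income": 95.0}
    for metric, label in check_claims(agent_claims, from_filing).items():
        print(metric, "->", label)
```

With these placeholder inputs, revenue is confirmed, net_income is a mismatch and employees stays unverified. Analysis would start only once nothing is left in the last two groups.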
The foundation has to be deterministic
Even done perfectly, all of that leaves the agent as what it is. One step, good for language and explanation and thinking out loud, and unfit to be the ground a decision rests on. Investing needs that ground, and the ground has to be deterministic. Deterministic means the same inputs give the same answer every time, and the numbers come from real data rather than fluent guesswork. That is what turns a pile of holdings into something you can actually reason about.
The evidence even points the way. When these systems are pinned to verified material you hand them and asked only to stay faithful to it, the best ones contradict that source less than 2% of the time, a small fraction of their error rate when they answer from memory. The chaos is not random. It shrinks the moment you control the input and ground the output in real data. Determinism is the same idea taken all the way.
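As a small illustration of what "deterministic" means in practice, here is one standard, checkable portfolio measure: concentration as the sum of squared weights (the Herfindahl-Hirschman index). This is a sketch chosen for its simplicity, not a description of how any particular product scores a portfolio. The function names and the holdings are hypothetical.

```python
import math
from typing import Mapping


def weights(holdings: Mapping[str, float]) -> dict:
    """Convert position values into portfolio weights that sum to 1."""
    total = math.fsum(holdings.values())
    if total <= 0:
        raise ValueError("holdings must have a positive total value")
    return {name: value / total for name, value in holdings.items()}


def concentration(holdings: Mapping[str, float]) -> float:
    """Sum of squared weights: 1/n for n equal positions, 1.0 for a single position.

    math.fsum returns a correctly rounded sum, so the result does not depend on
    the order in which the positions are listed.
    """
    return math.fsum(w * w for w in weights(holdings).values())


if __name__ == "__main__":
    # Hypothetical holdings (position values in any one currency).
    portfolio = {"AAA": 50.0, "BBB": 30.0, "CCC": 20.0}
    print(concentration(portfolio))  # about 0.38
    print(concentration(portfolio) == concentration(dict(portfolio)))  # always True
```

Ask it a thousand times and the answer does not move, and anyone can recompute it by hand from the same holdings. That is the property the foundation needs.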
This is where Souppe comes in. It is deterministic where the agent improvises, and grounded where the agent guesses. Give it the same portfolio and it returns the same reading every time, scored across the dimensions where a portfolio tends to fail, with each suggestion tied to the weakness it addresses and showing its expected effect before you commit. It is built to be efficient and genuinely useful, to bring order to the chaos rather than add to it. You can build a portfolio on that foundation and watch the reading respond as you go in Souppe Studio, our new visual builder.
That is the shape of using these tools well. The AI agent is one step, the improvised one, and you keep it on a short leash. The foundation underneath, the part that tells you where you actually stand, is deterministic, checkable and yours to control. Let the machine help you think. Do not let it decide where the ground is.