Research
Agentic behavior in financial domains.
We study how autonomous AI agents behave when deployed in financial markets. What works, what doesn't, and what to do about it.
How agents degrade over sustained operation. What looks like it's working but isn't.
The gap between what an agent writes and what it actually does. Format compliance is not performance.
Financial markets test agent behavior in real time. We use them as a proving ground.
Open-source benchmark for measuring decision quality in long-running autonomous agents. Agents look like they are working. Decision quality says otherwise. We measure the gap.
If you are working on autonomous agents, studying agent reliability, or deploying AI in adversarial environments, we should talk.
Get in touch