2 posts tagged with "Performance"
Efficiency, speed, and resource usage benchmarks for financial AI systems
·mike
JSONSchemaBench: Real-World Schema Complexity Breaks LLM Structured Output Guarantees
JSONSchemaBench tests 9,558 real-world JSON schemas against six constrained decoding frameworks and finds that coverage collapses from 86% on simple schemas to 3% on complex ones. XGrammar silently emits 38 non-compliant outputs, and no framework covers all 45 JSON Schema feature categories.
llm
ai
machine-learning
automation
·mike
Single-Agent LLMs Outperform Multi-Agent Systems on Multi-Hop Reasoning Under Equal Thinking Token Budgets
A 2026 Stanford preprint equalizes thinking-token budgets across five multi-agent architectures and finds that single-agent LLMs match or beat multi-agent systems on multi-hop reasoning. The result is grounded theoretically in the Data Processing Inequality and has implications for finance AI agent design.
ai
llm
machine-learning
automation