2 tagged with "Performance"

Efficiency, speed, and resource usage benchmarks for financial AI systems

July 8, 2026·mike

JSONSchemaBench: Real-World Schema Complexity Breaks LLM Structured Output Guarantees

JSONSchemaBench tests 9,558 real-world JSON schemas against six constrained decoding frameworks and finds that schema complexity causes coverage to collapse from 86% on simple schemas to 3% on complex ones, with XGrammar silently emitting 38 non-compliant outputs and no framework covering all 45 JSON Schema feature categories.

Single-Agent LLMs Outperform Multi-Agent Systems on Multi-Hop Reasoning Under Equal Thinking Token Budgets

A 2026 Stanford preprint equalizes thinking-token budgets across five multi-agent architectures and finds single-agent LLMs match or beat multi-agent systems on multi-hop reasoning — with theoretical grounding in the Data Processing Inequality and implications for finance AI agent design.

llm

machine-learning

automation