2 posts tagged with "Financial Literacy"

Research on financial knowledge representation and LLM competency

June 23, 2026·mike

LLMs Score 2.3% on Beancount DSL Generation: The LLMFinLiteracy Benchmark

The LLMFinLiteracy benchmark finds that five open-weight ~7B models generate fully correct Beancount transactions only 2.3% of the time. Failures are concentrated in accounting reasoning rather than syntax, pointing to compiler-in-the-loop feedback as the critical missing ingredient for reliable write-back agents.

llm

beancount

plain-text-accounting

April 18, 2026·mike

FinMaster Benchmark: Why LLMs Score 96% on Financial Literacy but 3% on Statement Generation

FinMaster (arXiv:2505.13533) benchmarks o3-mini, Claude 3.7 Sonnet, and DeepSeek-V3 across 183 financial tasks. Models score 96% on financial literacy but collapse to 3% on statement generation, and multi-step consulting tasks lose 21 accuracy points to error propagation.

llm

accounting

financial-statements