RAG techniques for grounding language model outputs in financial documents and ledger data
FLARE (EMNLP 2023) improves on standard RAG by triggering retrieval mid-generation whenever token probabilities fall below a confidence threshold, reaching 51.0 EM on 2WikiMultihopQA versus 39.4 for a single-retrieval baseline. Calibration failures in instruction-tuned chat models, however, limit its reliability for production finance agents.
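The confidence trigger can be illustrated with a minimal sketch. It assumes the model exposes a probability for each token of a tentatively generated sentence. The `Token` type, the function names, the threshold values, and the sample sentence are all hypothetical and chosen for illustration; they are not taken from the FLARE codebase.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    text: str
    prob: float  # model-assigned probability of this token (assumed available)


def needs_retrieval(tokens: List[Token], theta: float = 0.4) -> bool:
    """Trigger retrieval if any token in the tentative sentence falls below theta."""
    return any(t.prob < theta for t in tokens)


def masked_query(tokens: List[Token], beta: float = 0.4) -> str:
    """Build a search query from the tentative sentence, dropping low-confidence tokens."""
    return " ".join(t.text for t in tokens if t.prob >= beta)


# Hypothetical tentative sentence: the model is unsure about the figure and the period.
sentence = [
    Token("Net", 0.95),
    Token("revenue", 0.91),
    Token("was", 0.97),
    Token("$4.2M", 0.18),
    Token("in", 0.96),
    Token("Q3", 0.35),
]

if needs_retrieval(sentence):
    query = masked_query(sentence)
    # A real agent would now query its document or ledger index with `query`
    # and regenerate the sentence with the retrieved passages in context.
```

The sketch also shows why calibration matters: if a chat model assigns high probability to a wrong figure, `needs_retrieval` returns `False` and the unsupported number passes through without a lookup.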