2 tagged with "Multi-Agent"
Multi-agent LLM frameworks and architectures for collaborative financial automation
·mike
M3MAD-Bench: Are Multi-Agent Debates Really Effective Across Domains and Modalities?
M3MAD-Bench stress-tests Multi-Agent Debate across 9 models, 5 domains, and vision-language settings, finding that Collective Delusion causes 65% of failures, adversarial debate cuts accuracy by up to 12.8%, and Self-Consistency typically matches debate accuracy at lower token cost.
ai
llm
machine-learning
automation
·mike
AutoGen: Multi-Agent Conversation Frameworks for Finance AI
AutoGen (Wu et al., 2023) introduces a multi-agent conversation framework in which LLM-backed agents exchange messages to complete tasks. A two-agent setup lifts MATH benchmark accuracy from 55% to 69%, and a dedicated SafeGuard agent improves unsafe-code detection by up to 35 F1 points. Both findings apply directly to building safe, modular Beancount automation pipelines.
ai
llm
automation
beancount