Running multi-agent systems in production is a nightmare once you hit recursive...
https://tiny-wiki.win/index.php/Agent_Evaluation:_Beyond_the_Vanity_Metrics_of_%22Accuracy%22
Running multi-agent systems in production is a nightmare once you hit recursive tool-call loops. These agents act like unmonitored distributed systems. Stop treating LLMs like magic and start tracking state before you get paged at 3 AM for a stalled chain