SAAS: Teaching Search Agents When to Stop Searching
SAAS uses self-aware RL to cut a Qwen2.5-7B search agent's average queries from 2.19 to 0.97 per question, while keeping accuracy near the best baseline (48.7% vs 49.8%).
Institution
Xiamen University (XMU) is a national public research university in Xiamen, Fujian, China; its School of Informatics is active in NLP and machine learning research (the XMUDeepLIT / Deep Learning lab).