How to evaluate agents in practice | DailyDevLists