aimvik07/agent-eval
CLI toolkit for probing LLM agent failures, comparing models on cost vs accuracy, and catching regressions. Tested across classification, sentiment, and RAG agents.
0 Active
On the radar — signal detected
Stars: 1
Forks: 0
Contributors: 0
Language: Python
Score updated Jun 26, 2026