GIT_FEED

aimvik07/agent-eval

CLI toolkit for probing LLM agent failures, comparing models on cost vs accuracy, and catching regressions. Tested across classification, sentiment, and RAG agents.

0 · Active

On the radar — signal detected

Stars: 1
Forks: 0
Contributors: 0
Language: Python

Score updated Jun 26, 2026
