beyhangl/evalcraft
The pytest for AI agents — capture, replay, mock, and evaluate agent behavior. Agent Reliability Engineering.
0Active
On the radar — signal detected
Stars
3
Forks
0
Contributors
1
Language
Python
Score updated Mar 26, 2026
// SUBSCRIBE
The repos that moved this week, why they matter, and what to watch next. One email. No noise.