GIT_FEED

benmeryem-tech/llm-eval-kit

A lightweight, modular toolkit for evaluating and benchmarking large language models, with a focus on reasoning quality, consistency, and error detection.

0 Active

On the radar — signal detected

Stars: 3
Forks: 0
Contributors: 0
Language: Python

Score updated May 12, 2026
