GIT_FEED

haizelabs/tourno

TournO (Tournament Optimization) combines pointwise and pairwise LLM judges to produce reward signals for RLHF. It runs tournament-style comparisons (round-robin, Elo) to turn pairwise preferences into scalar rewards.
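As a rough illustration of how a round-robin tournament with Elo updates can turn pairwise preferences into scalar rewards, here is a minimal sketch. It is not TournO's actual API: `elo_rewards`, `toy_judge`, the K-factor of 32, and the base rating of 1000 are all assumptions made for this example, and the length-based judge merely stands in for a pairwise LLM judge. The pointwise judge mentioned in the description is not modeled.

```python
import itertools
from typing import Callable, List, Sequence


def elo_rewards(
    responses: Sequence[str],
    judge: Callable[[str, str], float],
    k: float = 32.0,            # assumed K-factor, not taken from the repo
    base_rating: float = 1000.0,  # assumed starting rating
) -> List[float]:
    """Run one round-robin pass and return an Elo rating per response.

    `judge(a, b)` is a hypothetical pairwise judge returning 1.0 if `a`
    wins, 0.0 if `b` wins, and 0.5 for a tie.
    """
    ratings = [base_rating] * len(responses)
    # Round-robin: every response meets every other response once.
    for i, j in itertools.combinations(range(len(responses)), 2):
        # Standard Elo expected score for i against j.
        expected_i = 1.0 / (1.0 + 10 ** ((ratings[j] - ratings[i]) / 400.0))
        score_i = judge(responses[i], responses[j])
        delta = k * (score_i - expected_i)
        # Zero-sum update: whatever i gains, j loses.
        ratings[i] += delta
        ratings[j] -= delta
    return ratings


def toy_judge(a: str, b: str) -> float:
    """Stand-in for an LLM judge: prefers the longer response."""
    if len(a) == len(b):
        return 0.5
    return 1.0 if len(a) > len(b) else 0.0


rewards = elo_rewards(["ok", "a longer answer", "medium one"], toy_judge)
print(rewards)
```

The resulting ratings can be used directly as scalar rewards or normalized first (for example, by subtracting the mean). Because Elo updates are order-dependent, a single pass gives only a rough ranking; repeated or shuffled passes would be one way to stabilize it.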

0 Active

On the radar — signal detected

Stars: 12
Forks: 0
Contributors: 3
Language: Python

Score updated Apr 7, 2026
