GIT_FEED

Mysticbirdie/image-cultural-accuracy-benchmark

Benchmark measuring the historical accuracy of AI-generated images. It covers 24 image pairs (3 characters × 8 scenes) set in Rome, 110 CE, comparing naive prompts with culturally grounded prompts. According to the description, a blinded A/B evaluation shows that structured knowledge injection produces 5x more historically accurate images. The repository includes the prompts, an evaluation rubric, and a reproducible pipeline.
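As a rough illustration of the design described above, the following minimal sketch builds a 3 × 8 grid of image pairs and randomizes which prompt condition appears in slot A or slot B, so that an evaluator cannot tell them apart by position. The character and scene labels, the condition names `naive` and `grounded`, and the function `build_blinded_pairs` are hypothetical placeholders, not taken from the repository, whose actual pipeline may work differently.

```python
import itertools
import random

# Hypothetical placeholder labels: the repository's real characters
# and scenes are not listed in this description.
CHARACTERS = [f"character_{i}" for i in range(1, 4)]  # 3 characters
SCENES = [f"scene_{i}" for i in range(1, 9)]          # 8 scenes


def build_blinded_pairs(seed=0):
    """Return one record per (character, scene) with the two prompt
    conditions assigned to slots A and B in random order."""
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    pairs = []
    for character, scene in itertools.product(CHARACTERS, SCENES):
        conditions = ["naive", "grounded"]  # assumed condition names
        rng.shuffle(conditions)             # hide the condition behind the slot
        pairs.append({
            "character": character,
            "scene": scene,
            "A": conditions[0],
            "B": conditions[1],
        })
    return pairs


pairs = build_blinded_pairs()
print(len(pairs))  # 24
```

The answer key mapping each slot back to its condition would be kept away from evaluators until scoring is complete. Seeding the generator is one way to keep the assignment reproducible.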

0 Active

On the radar — signal detected

Stars: 2
Forks: 0
Contributors: 1
Language: Python

Score updated Mar 26, 2026
