Show HN: Improving RAG with chess Elo scores

Description

Hello HN,

I'm Ghita, co-founder of ZeroEntropy (YC W25). We build high-accuracy search infrastructure for RAG and AI agents.

We just released two new state-of-the-art rerankers, zerank-1 and zerank-1-small. One of them is fully open source under Apache 2.0.

We trained these models using a novel Elo-inspired pipeline, which we describe in detail in the attached blog post. In a nutshell, the training steps are:

* Collect soft preferences between pairs of documents using an ensemble of LLMs.
* Fit an Elo-style rating system (Bradley-Terry) to turn pairwise comparisons into absolute per-document scores.
* Normalize relevance scores across queries using a bias correction step, modeled using cross-query comparisons and solved with MLE.
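To make the second step concrete, here is a minimal sketch of fitting a Bradley-Terry model to soft pairwise preferences for a single query. It is not our production pipeline: the function name `fit_bradley_terry`, the `(i, j, p)` input format, and the choice of the classic minorization-maximization (MM) update are all assumptions made for illustration, and the cross-query bias correction step is not shown.

```python
import math

def fit_bradley_terry(n_docs, prefs, iters=200, eps=1e-6):
    """Fit Bradley-Terry strengths from soft pairwise preferences.

    prefs is a list of (i, j, p) tuples, where p is the (soft) probability
    that document i is more relevant than document j. This input format is
    an assumption of this sketch. Returns one log-strength per document,
    which plays the role of an Elo-style score.
    """
    # Clip preferences away from 0 and 1 so every strength stays positive.
    prefs = [(i, j, min(max(p, eps), 1.0 - eps)) for i, j, p in prefs]

    # Soft "win counts": each comparison contributes p to i and 1 - p to j.
    wins = [0.0] * n_docs
    for i, j, p in prefs:
        wins[i] += p
        wins[j] += 1.0 - p

    gamma = [1.0] * n_docs  # strengths, initialized uniformly
    for _ in range(iters):
        # MM update: gamma_i = wins_i / sum over comparisons of 1 / (gamma_i + gamma_j)
        denom = [0.0] * n_docs
        for i, j, _p in prefs:
            d = 1.0 / (gamma[i] + gamma[j])
            denom[i] += d
            denom[j] += d
        # Documents that appear in no comparison keep their current strength.
        gamma = [w / d if d > 0 else g for w, d, g in zip(wins, denom, gamma)]

        # Strengths are only defined up to scale; fix the geometric mean to 1.
        log_mean = sum(math.log(g) for g in gamma) / n_docs
        scale = math.exp(log_mean)
        gamma = [g / scale for g in gamma]

    return [math.log(g) for g in gamma]


# Toy example: three documents, every pair compared once.
toy_prefs = [(0, 1, 0.9), (1, 2, 0.8), (0, 2, 0.95)]
scores = fit_bradley_terry(3, toy_prefs)
ranking = sorted(range(3), key=lambda d: scores[d], reverse=True)
```

In this toy example document 0 is preferred in both of its comparisons, so it ends up with the highest score, and the scores sum to zero because of the normalization. The fitted scores are only comparable within one query, which is why a separate cross-query correction is needed.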

You can try the models either through our API (https://docs.zeroentropy.dev/models), or via HuggingFace (https://huggingface.co/zeroentropy/zerank-1-small).

We would love this community's feedback on the models and the training approach. A full technical report will also be released soon.

Thank you!

Show HN: Bottlefire – Build single-executable microVMs from Docker images

Bypass PostgreSQL catalog overhead with direct partition hash calculations

Automatically Packaging a Haskell Library as a Swift Binary XCFramework

Show HN: Unlearning Comparator, a visual tool to compare machine unlearning

I built Unlearning Comparator, a visual analytics toolkit to help researchers and developers compare how different machine unlearning methods work. It provides a unified workflow to test for accuracy, efficiency, and privacy. You can check out the live demo linked in the post, and the source code is on GitHub: https://github.com/gnueaj/Machine-Unlearning-Comparator

Our accompanying paper is currently under review at IEEE TVCG. Happy to answer any questions and would love to hear your feedback!