Reading Group (+🧋): JudgmentBench: Comparing Rubric and Preference Evaluation for Quality Assessment
About This Event
Join the Snorkel AI Reading Group, a recurring forum to explore the latest frontier developments in AI while building meaningful connections within the community.
In this afternoon session, Russell Yang, an AI Engineering Fellow at Stanford Law School, will cover his recent paper: JudgmentBench: Comparing Rubric and Preference Evaluation for Quality Assessment.
Agenda:
3pm - doors open
3:30pm - talk begins
🧋🧋🧋 Boba tea and other refreshments will be provided! 🧋🧋🧋
Among other things, you'll learn:
What JudgmentBench is: 30 real-world legal tasks paired with 1,539 rubric scores and 1,530 pairwise preference judgments, all collected from practicing attorneys (including at major U.S. law firms).
Why it's the first public dataset in a high-expertise domain where both supervision signals are elicited from the same experts on the same items.
Why the choice between rubric scoring and comparative judgment is rarely justified, even though both dominate current benchmarking.
How comparative judgments recover the intended quality ordering far better than rubrics: a mean Spearman correlation of 0.908 vs. 0.150, while requiring less than half the annotation time.
Why that pattern holds for both human annotators and LLM autograders.
How the paired dataset opens a broader research agenda on how expert judgment should be elicited, aggregated, and used as supervision in domains without verifiable ground truth.
JudgmentBench is a collaboration between Stanford, Harvey AI, and Snorkel AI.
Location
101 Second Street, San Francisco, CA 94105, USA