MR3: Multilingual rubric-agnostic reward reasoning models

MR3 is a multilingual, rubric-agnostic reward reasoning model that achieves the broadest language coverage in reward modeling to date.

