Alignment-weighted DPO

A variant of Direct Preference Optimization (DPO) that assigns different preference weights to different parts of an output, so that training targets the most problematic parts.
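
The description above does not say how the weights are computed or where they enter the loss, so the following is only a minimal sketch under stated assumptions: weights are per token, they scale each token's policy-to-reference log-ratio, and the rest is the standard DPO objective. The function names, the `beta` default, and the weighting scheme are all hypothetical and are not taken from the source.

```python
import math
from typing import Sequence


def weighted_logratio(
    policy_logps: Sequence[float],
    ref_logps: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Weighted sum of per-token log-ratios: sum_t w_t * (log pi(t) - log pi_ref(t))."""
    if not (len(policy_logps) == len(ref_logps) == len(weights)):
        raise ValueError("log-prob and weight sequences must have equal length")
    return sum(w * (p - r) for p, r, w in zip(policy_logps, ref_logps, weights))


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_dpo_loss(
    chosen_policy: Sequence[float],
    chosen_ref: Sequence[float],
    rejected_policy: Sequence[float],
    rejected_ref: Sequence[float],
    chosen_weights: Sequence[float],
    rejected_weights: Sequence[float],
    beta: float = 0.1,  # assumed value; the source gives none
) -> float:
    """DPO loss for one preference pair with per-token weights (assumed scheme).

    With all weights equal to 1 this reduces to the standard DPO loss
    -log(sigmoid(beta * (chosen_logratio - rejected_logratio))).
    """
    margin = beta * (
        weighted_logratio(chosen_policy, chosen_ref, chosen_weights)
        - weighted_logratio(rejected_policy, rejected_ref, rejected_weights)
    )
    # -log(sigmoid(margin)) is the same as softplus(-margin).
    return softplus(-margin)


# Hypothetical pair: the second rejected token is the "problematic" one,
# and the policy currently rates it higher than the reference does.
chosen_policy, chosen_ref = [-1.0, -1.0], [-1.0, -1.0]
rejected_policy, rejected_ref = [-1.0, -0.5], [-1.0, -2.0]

uniform = weighted_dpo_loss(
    chosen_policy, chosen_ref, rejected_policy, rejected_ref,
    chosen_weights=[1.0, 1.0], rejected_weights=[1.0, 1.0],
)
focused = weighted_dpo_loss(
    chosen_policy, chosen_ref, rejected_policy, rejected_ref,
    chosen_weights=[1.0, 1.0], rejected_weights=[1.0, 3.0],
)
```

In this sketch, raising the weight on the problematic rejected token makes the loss for that pair larger, so the gradient pushes harder against that token than uniform weighting would. Other placements of the weights (for example, one weight per preference pair rather than per token) are equally compatible with the one-line description.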


Latest publications