Direct Preference Optimization: A Simpler Approach to Aligning Language Models - Rollup News