Policy vs. reference model gradient updates in ch07 on DPO #472

tt7533 started this conversation in General
Replies: 2 comments 3 replies

0 replies
3 replies
rasbt Jan 7, 2025
Maintainer

@tt7533
rasbt Jan 8, 2025
Maintainer

2 participants