Research
Sources, depth, and citation style
Three settings shape a run: which sources to search, how exhaustive the search is, and how references are formatted.
The settings
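Example: the Research panel takes a query such as "compare RLHF and DPO…", shows the source toggles (Web, Academic, News, Wikipedia) and the depth slider, and starts when you select Run research.
The resulting report carries numbered citations, for example "Both methods align language models, but DPO skips the reward model [1]", followed by the source list (here 12 sources, including arxiv.org · DPO paper and openai.com · RLHF).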
Toggle source types · Depth slider
Toggle Web, Academic, News, and Wikipedia on or off; at least one must stay on. The depth slider trades speed for breadth, ranging from a quick handful of sources to an exhaustive sweep. Choose a citation style for the references: MLA, APA, Chicago, Harvard, or IEEE.
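The three settings can be pictured as one small configuration object. The sketch below is illustrative only and is not the product's API: the names ResearchSettings, CitationStyle, and toggle, the 0.0 to 1.0 depth scale, and the default values are all assumptions made for the example. Only the four source types, the five citation styles, and the "keep at least one on" rule come from the text above.

```python
from dataclasses import dataclass, replace
from enum import Enum

# The four source types named on this page.
SOURCE_TYPES = ("Web", "Academic", "News", "Wikipedia")


class CitationStyle(Enum):
    # The five citation styles named on this page.
    MLA = "MLA"
    APA = "APA"
    CHICAGO = "Chicago"
    HARVARD = "Harvard"
    IEEE = "IEEE"


@dataclass(frozen=True)
class ResearchSettings:
    """Hypothetical container for the three settings (not a real API)."""

    # Assumption: every source type starts switched on.
    sources: frozenset = frozenset(SOURCE_TYPES)
    # Assumption: depth is a 0.0 (quick) to 1.0 (exhaustive) scale.
    depth: float = 0.5
    # Assumption: APA is an arbitrary default for the example.
    citation_style: CitationStyle = CitationStyle.APA

    def __post_init__(self):
        unknown = set(self.sources) - set(SOURCE_TYPES)
        if unknown:
            raise ValueError(f"unknown source types: {sorted(unknown)}")
        if not self.sources:
            raise ValueError("keep at least one source type on")
        if not 0.0 <= self.depth <= 1.0:
            raise ValueError("depth must be between 0.0 and 1.0")

    def toggle(self, source: str) -> "ResearchSettings":
        """Return a copy with one source type flipped on or off."""
        flipped = frozenset(self.sources) ^ {source}
        # replace() re-runs __post_init__, so the rules are checked again.
        return replace(self, sources=flipped)


if __name__ == "__main__":
    settings = ResearchSettings(depth=0.2, citation_style=CitationStyle.IEEE)
    settings = settings.toggle("News")  # switch News off
    print(sorted(settings.sources), settings.depth, settings.citation_style.value)
```

Making the object immutable and validating on every change means an invalid combination, such as switching off the last remaining source type, is rejected at the moment it is attempted rather than when the run starts.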