Sources, depth, and citation style

Three settings shape a run: which sources to search, how exhaustive to be, and how the references are formatted.

The settings

[Interactive demo: a Research panel with the sample query "compare RLHF and DPO…", toggles for Web, Academic, News, and Wikipedia, a depth slider, and a Run research button. The sample report reads "Both methods align language models, but DPO skips the reward model" (citation 1) and lists 12 sources, including arxiv.org (DPO paper) and openai.com (RLHF).]

Toggle Web, Academic, News, and Wikipedia (keep at least one on). The depth slider trades speed for breadth, from a quick handful of sources to an exhaustive sweep. Pick a citation style: MLA, APA, Chicago, Harvard, or IEEE.
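
As a rough illustration of how these three settings fit together, here is a minimal Python sketch. It is not the product's API: the `ResearchSettings` class, its field names, the 0.0 to 1.0 depth scale, and the default values are all assumptions made for the example. Only the source types, the citation styles, and the "keep at least one on" rule come from the text above.

```python
from dataclasses import dataclass, field

# Values taken from the text above; every name below is illustrative only.
SOURCE_TYPES = ("Web", "Academic", "News", "Wikipedia")
CITATION_STYLES = ("MLA", "APA", "Chicago", "Harvard", "IEEE")


@dataclass
class ResearchSettings:
    """Hypothetical settings object; not the product's actual API."""
    sources: set = field(default_factory=lambda: set(SOURCE_TYPES))
    depth: float = 0.5            # assumed scale: 0.0 = quick, 1.0 = exhaustive
    citation_style: str = "APA"   # assumed default

    def validate(self) -> None:
        """Raise ValueError if any of the three settings is out of range."""
        unknown = self.sources - set(SOURCE_TYPES)
        if unknown:
            raise ValueError(f"unknown source types: {sorted(unknown)}")
        if not self.sources:
            raise ValueError("keep at least one source type on")
        if not 0.0 <= self.depth <= 1.0:
            raise ValueError("depth must be between 0.0 and 1.0")
        if self.citation_style not in CITATION_STYLES:
            raise ValueError(f"unsupported citation style: {self.citation_style}")

    def toggle(self, source: str) -> None:
        """Flip one source type, refusing to switch off the last one."""
        if source not in SOURCE_TYPES:
            raise ValueError(f"unknown source type: {source}")
        if source in self.sources:
            if len(self.sources) == 1:
                raise ValueError("keep at least one source type on")
            self.sources.remove(source)
        else:
            self.sources.add(source)


# Example: an academic-only, fairly deep run with IEEE references.
settings = ResearchSettings(sources={"Academic", "Web"}, depth=0.8, citation_style="IEEE")
settings.toggle("Web")   # leaves Academic as the only source type
settings.validate()      # passes: one source on, depth in range, known style
```

Putting the "at least one source" check in both `toggle` and `validate` is a design choice for the sketch: the first catches the mistake at the moment of the click, the second catches settings that were constructed directly.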