Scientific Judge
paper, dataset
Introduced Reinforcement Learning from Community Feedback (RLCF) for aligning AI with scientific reasoning, accompanied by a dataset of 700,000 scientific preference signals.
Outputs 2
Scientific Judge
paper
Introduced Reinforcement Learning from Community Feedback (RLCF) for aligning AI with scientific reasoning.
Scientific Judge Dataset
dataset
Dataset of 700,000 scientific preference signals for alignment research.