Reinforcement learning from human feedback (RLHF) is a technique for tra...
Policies often fail due to distribution shift – changes in the state and...
To act in the world, robots rely on a representation of salient task asp...
As robots are increasingly deployed in real-world scenarios, a key quest...
AI's rapid growth has been felt acutely by scholarly venues, leading to...
Although systematic biases in decision-making are widely documented, the...
We propose incorporating human labelers in a model fine-tuning system th...