Understanding reinforcement learning from human, or AI, feedback.
US AI Safety Institute signs voluntary agreement with OpenAI and Anthropic for model testing. Effectiveness uncertain; challenges in transparency, expertise, and implementation.