In this episode we discuss the paper "Training language models to follow instructions with human feedback" by Ouyang et al (2022). We discuss the RLHF paradigm and how important RL is to tuning GPT.
AI Summary coming soon
Sign up to get notified when the full AI-powered summary is ready.
Free forever for up to 3 podcasts. No credit card required.
Free AI-powered recaps of Argmax and your other favorite podcasts, delivered to your inbox.
Free forever for up to 3 podcasts. No credit card required.