15: InstructGPT

March 28, 2023·57 min

Episode Description from the Publisher

In this episode we discuss the paper "Training language models to follow instructions with human feedback" by Ouyang et al. (2022). We cover the RLHF paradigm and how important RL is to tuning GPT.

More from Argmax

Mixture of Experts

October 8, 2024·54 min

LoRA

September 2, 2023·1h 2m

14: Whisper

March 17, 2023·49 min

13: AlphaTensor

March 11, 2023·49 min
