Mastering Finetuning LLMs with RLHF in 4 weeks

3 weeks

Cohort-based Course

A hands-on course on RLHF for finetuning LLMs, with a complete introduction to reinforcement learning, PPO, and the finetuning process.

Hosted by

Junling Hu

Founder | AI Expert | AI Technology leader | AI Book author

Course overview

Mastering RLHF for Finetuning LLMs

RLHF (Reinforcement Learning from Human Feedback) is the secret weapon behind ChatGPT and Llama. It is a crucial step in improving the performance of an LLM. By the end of this course, you will have a complete understanding of RLHF and the components behind it, such as PPO, the reward function, and supervised finetuning. You will come away knowing the full RLHF process and how to implement it.

Who is this course for

Technology managers who want to understand RLHF and the process of finetuning LLMs, and gain hands-on experience.

Software engineers and data scientists who want to grow in their careers, learn new skills, and gain hands-on experience.

Academics and students who want to learn about RLHF, the cutting-edge solution for LLMs.

What you’ll get out of this course

Deep understanding of LLM finetuning

You will gain a deep understanding of LLM finetuning and its end-to-end process.

Complete knowledge of RLHF

You will gain a complete understanding of RLHF and the components behind it, such as PPO, the reward function, and supervised finetuning.

Deep understanding of reinforcement learning and its use for LLMs

You will gain a thorough knowledge of reinforcement learning and how it is applied to LLMs.

Hands-on experience with RLHF

You will gain hands-on experience with RLHF by working with real-world data to solve a practical problem.

Course syllabus

4 live sessions • 10 lessons

Week 1

Jan 10–Jan 14

Session 1

Thu 1/11, 3:00 AM–5:00 AM (UTC)

Overview of LLM Finetuning

1 item

Supervised finetuning for LLMs

1 item

Week 2

Jan 15–Jan 21

Session 2

Thu 1/18, 3:00 AM–5:00 AM (UTC)

Reinforcement learning fundamentals

1 item

Deep Reinforcement Learning

1 item

Week 3

Jan 22–Jan 28

Session 3

Thu 1/25, 3:00 AM–5:00 AM (UTC)

Introduction to PPO

2 items

Learning the Reward Function

1 item

Week 4

Jan 29–Feb 1

Session 4

Thu 2/1, 3:00 AM–5:00 AM (UTC)

Training an LLM with PPO

1 item

Implement RLHF step by step

1 item

Tools for using RLHF

1 item

Meet your instructor

Junling Hu

Founder of Coach.ai

Junling Hu is the founder of Coach.ai, which provides an LLM-powered conversational AI platform. Before that, Junling worked as Director of AI at Samsung, Director of AI at LivePerson, and AI leader at PayPal. Junling is an expert in reinforcement learning and LLMs. She is the author of the book The Evolution of Artificial Intelligence. Junling received her Ph.D. in AI from the University of Michigan, Ann Arbor. You can find more information about her and her tutorial talks on YouTube: https://www.youtube.com/@aifmeetup

Course Schedule

2-5 hours per week

Tue, Wed, or Thu
7–9 pm or 12–2 pm PST
You can choose a cohort that meets at a different time of the week: Tuesday evening, Thursday evening, or Wednesday daytime. We meet over Zoom. Each meeting includes a live lecture and working through a live Python notebook.
Q&A and additional materials
3 hours per week
You can email the instructor with any questions during the week. In addition, there is optional homework that will help you go deeper into the class materials.

Learning is better with cohorts

Active hands-on learning

This course builds on live workshops and hands-on projects

Interactive and project-based

You’ll be interacting with other learners through breakout rooms and project teams

Learn with a cohort of peers

Join a community of like-minded people who want to learn and grow alongside you
