Mastering Finetuning LLMs with RLHF in 4 weeks

4 Weeks · Cohort-based Course

A hands-on course on RLHF for finetuning LLMs, with a complete introduction to reinforcement learning, PPO, and the finetuning process.

Course overview

Mastering RLHF for Finetuning LLMs

RLHF (Reinforcement Learning from Human Feedback) is the secret weapon behind ChatGPT and Llama. It is a crucial step in improving the performance of an LLM. By the end of this workshop, you will have a complete understanding of RLHF and the components behind it, such as PPO, reward functions, and supervised finetuning. You will gain the confidence of knowing the full RLHF process and how to implement it.

Who is this course for

01

Technology managers who want to understand RLHF and the process of finetuning LLMs, and gain hands-on experience.

02

Software engineers and data scientists who want to grow in their careers, learn new skills, and gain hands-on experience.

03

Academics and students who want to learn about RLHF, a cutting-edge technique for LLMs.

What you’ll get out of this course

Deep understanding of LLM finetuning

You will gain a deep understanding of LLM finetuning and its end-to-end process.

Complete knowledge of RLHF

You will gain a complete understanding of RLHF and the components behind it, such as PPO, reward functions, and supervised finetuning.

Deep understanding of reinforcement learning and its use for LLMs

You will gain a thorough knowledge of reinforcement learning and how it is applied to LLMs.

Hands-on experience with RLHF

You will gain hands-on experience with RLHF by working with real-world data to solve a problem.

Course syllabus

Week 1

Jan 10—Jan 14

    Session 1

    Thu 1/11, 3:00 AM - 5:00 AM (UTC)

    Overview of LLM Finetuning

    1 item

    Supervised finetuning for LLMs

    1 item

Week 2

Jan 15—Jan 21

    Session 2

    Thu 1/18, 3:00 AM - 5:00 AM (UTC)

    Reinforcement learning fundamentals

    1 item

    Deep Reinforcement Learning

    1 item

Week 3

Jan 22—Jan 28

    Session 3

    Thu 1/25, 3:00 AM - 5:00 AM (UTC)

    Introduction to PPO

    2 items

    Learning Reward Function

    1 item

Week 4

Jan 29—Feb 1

    Session 4

    Thu 2/1, 3:00 AM - 5:00 AM (UTC)

    Training LLM with PPO

    1 item

    Implement RLHF step by step

    1 item

    Tools for using RLHF

    1 item

Meet your instructor

Junling Hu

Founder of Coach.ai

Junling Hu is the founder of Coach.ai, which provides an LLM-powered conversational AI platform. Prior to that, Junling worked as Director of AI at Samsung, Director of AI at LivePerson, and AI leader at PayPal. Junling is an expert in reinforcement learning and LLMs. She is the author of the book The Evolution of Artificial Intelligence. Junling received her Ph.D. in AI from the University of Michigan, Ann Arbor. You can find more information about her and her tutorial talks on YouTube: https://www.youtube.com/@aifmeetup

Course Schedule

2-5 hours per week

  • Tue, Wed, or Thu

    7 - 9pm or 12 - 2pm PST

    You can choose a cohort that meets at a different time of the week: Tuesday evening, Thursday evening, or Wednesday daytime. We meet over Zoom. Each meeting includes a live lecture and working through a live Python notebook.

  • Q&A and Additional materials

    3 hours per week

    You can email the instructor with any question during the week. In addition, there is optional homework for practice, which will help you go deeper into the class materials.

Learning is better with cohorts

Active hands-on learning

This course builds on live workshops and hands-on projects

Interactive and project-based

You’ll be interacting with other learners through breakout rooms and project teams

Learn with a cohort of peers

Join a community of like-minded people who want to learn and grow alongside you
