4 Weeks
·Cohort-based Course
Transform yourself from a Python Developer to an NLP Data Scientist in 4 weeks with practical projects.
Course overview
You will gain the confidence to take ownership of NLP projects at work and deliver them end to end, using the product-based learning approach taught in the course.
01
Recent graduates, or students about to graduate, who want to start their career as an NLP Data Scientist
02
Seasoned working professionals who want to transition and establish themselves as NLP developers
03
AI enthusiasts who want to dabble with NLP and explore the potential for themselves
NLP product for Movie Production House with Streamlit
The course starts with a problem statement of building an NLP product for a movie production house that includes features like finding similar movies, characters, etc.
We will introduce NLP techniques progressively and build features with Streamlit for frontend visualization.
Vectorizing documents with TF-IDF
Intuitively derive how to score the importance of a word in a document, arriving at what is known as TF-IDF (Term Frequency & Inverse Document Frequency).
Applications using TF-IDF: Keyword extraction, summarization, and NLP recommendation systems.
Vectorizing documents with transformers
Convert words and sentences into vectors with cutting-edge algorithms like sentence transformers.
Applications using sentence transformers: Keyword extraction, Topic Modeling, and NLP recommendation systems.
See significant improvement when compared to TF-IDF.
GPT-3 and production deployment
We will look at how far NLP has progressed with language models like GPT-3 and explore various use-cases with it.
We will deploy a GPT-3-based application behind an API that anybody can use in their projects.
Practical Introduction to NLP
Lavanya Gupta
Olaifa Julius 'Tunde
Be the first to know about upcoming cohorts
Ramsri is a Lead Data Scientist with 8+ years of work experience at startups and large corporations across Silicon Valley, Singapore, and India.
Most recently he has been a co-founder and CTO of a VC-backed NLP startup, Aurora: AI-Assisted Assessments.
Ramsri is also very active on social media, where he shares his NLP knowledge. Join Ramsri's 45k+ followers across various platforms.
01
Gentle introduction to NLP
Introduction to the definition of Natural Language Processing and its various sub-topics with practical examples.
We will introduce a practical problem statement of building an NLP-powered application for a movie production house and look at various components of it.
We will learn new NLP techniques and add features to the Streamlit app as we go.
02
Build our own dataset
We will collect our dataset of Hollywood movie plots using the BeautifulSoup library.
This will show you that you don't always need to look for an existing dataset, and that building one from scratch is not all that hard.
We will also look at various options to collect datasets even without code, with scraping tools like Parsehub, etc.
03
Vectorizing documents with TF-IDF
We will build intuition for how to score the importance of a given word in a given movie plot, arriving at what is known as TF-IDF.
We will convert movie plots into TF-IDF vectors, then use those vectors to extract important keywords from a movie plot, summarize the plot, and find similar movie plots.
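As a rough illustration of the idea (not the course's own code), the sketch below computes TF-IDF weights with the Python standard library and compares plots by cosine similarity. The three plot sentences are made-up toy data, the function names `tfidf_vectors` and `cosine` are hypothetical, and the weighting is the plain `tf * log(N / df)` variant; libraries such as scikit-learn apply additional smoothing and normalization.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {word: tf-idf weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each word appear?
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency times inverse document frequency.
            word: (count / len(tokens)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy plots, invented for this example.
plots = [
    "a detective hunts a killer in the city",
    "a detective chases a thief in the city",
    "a robot falls in love on a distant planet",
]
vectors = tfidf_vectors(plots)

# Keywords: the highest-weighted words of a plot.
keywords = sorted(vectors[0], key=vectors[0].get, reverse=True)[:2]

# Similar plots: compare the first plot with the other two.
sim_detective = cosine(vectors[0], vectors[1])
sim_robot = cosine(vectors[0], vectors[2])
```

Words that occur in every plot (here "a" and "in") get a weight of zero, so they contribute nothing to keywords or similarity, while words unique to one plot ("hunts", "killer") score highest.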
04
Short student project 1: Calculate diversity using N-grams
We will use a pre-trained paraphraser that generates multiple alternative paraphrases of a given sentence.
The goal is to find the paraphrase that differs most from the original sentence.
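One possible way to score diversity with N-grams is sketched below. The outline does not say which metric the project uses, so the Jaccard distance over word bigram sets is an assumption, as are the helper names `ngrams`, `diversity`, and `most_diverse` and the example sentences; the paraphrases are hard-coded here instead of coming from a paraphraser model.

```python
def ngrams(text, n):
    """Return the set of word n-grams in a sentence."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def diversity(original, candidate, n=2):
    """1 minus the Jaccard overlap of n-gram sets; higher means more different."""
    a, b = ngrams(original, n), ngrams(candidate, n)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def most_diverse(original, paraphrases, n=2):
    """Pick the paraphrase that shares the fewest n-grams with the original."""
    return max(paraphrases, key=lambda p: diversity(original, p, n))


# Invented example; a real run would take these from a paraphraser.
original = "the movie was a huge success at the box office"
paraphrases = [
    "the movie was a big success at the box office",
    "the film did extremely well in theaters",
]
best = most_diverse(original, paraphrases)
```

A score of 0.0 means the candidate has exactly the same bigrams as the original, and 1.0 means they share none.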
05
Visualize movie plots
We will get introduced to sentence transformers that can convert a word, sentence, or a full document into a vector.
We will create 2-dimensional vectors from movie plot vectors and visualize them using the Plotly library.
We will explore the power of visualizations by plotting descriptions of directors and movie plots on the same chart.
06
Practical Project: Adapt a movie plot into another country/locale.
We will introduce word vectors and develop a practical project to adapt Hollywood movie scripts to a different locale/region.
Many times popular Hollywood movies are adapted and remade in countries like China and India.
We will see how we can use named entity recognition and word vectors to localize the names and locations to a different country.
07
Vectorizing documents using sentence transformers
We will use sentence transformers to convert movie plots into vectors.
We will revisit keyword extraction and similar movie plot retrieval from the TF-IDF lecture, perform both with sentence transformers, and see a significant improvement in the results compared to TF-IDF.
08
Short student project 2: Evaluate short answers using NLP
Given a student's short, open-ended descriptive answer to a question, we will see how to evaluate it using traditional word matching as well as semantic matching with sentence transformers.
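The word-matching half of this comparison could look roughly like the sketch below, which measures what fraction of the reference answer's content words appear in the student's answer. The stopword list, the function names, and the sample question are assumptions made for illustration; the semantic half would need a sentence-transformer model and is not shown.

```python
import re

# Small illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "into",
             "and", "it", "that", "by"}


def content_words(text):
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def keyword_coverage(reference, answer):
    """Fraction of the reference's content words found in the answer."""
    ref = content_words(reference)
    if not ref:
        return 0.0
    return len(ref & content_words(answer)) / len(ref)


# Invented reference and student answers.
reference = "Photosynthesis converts sunlight into chemical energy in plants"
good = "Plants use photosynthesis to turn sunlight into chemical energy"
weak = "It is how animals breathe"

good_score = keyword_coverage(reference, good)
weak_score = keyword_coverage(reference, weak)
```

The good answer misses full marks only because it says "turn" where the reference says "converts". Exact word matching cannot see that the two mean the same thing, which is the gap semantic matching is meant to close.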
09
Identify the genre from a movie plot
We will use topic modeling to identify different topics (genres) from the movie plots using sentence transformers.
We will take any new movie plot and associate it with one of the topics we created using topic modeling.
10
GPT-3 and production deployment
We will get introduced to advanced language models like GPT-3 and understand how SaaS businesses are built with GPT-3.
We will deploy an API with GPT-3 and use it with our Streamlit app.
Snippet from the 2 hr weekend Masterclass
Active learning, not passive watching
This course focuses on live workshops and hands-on projects
Learn with a cohort of peers
You’ll be learning in public through breakout rooms and an engaged community
Be part of a life-long community
You will be part of a Slack channel with like-minded people in the NLP space, where you can ask questions, get career guidance, and grow with a peer group.