CMPUT 651: Topics in Artificial Intelligence
Deep Learning for NLP
Instructor: Lili Mou
This course introduces deep learning (DL) techniques for natural language processing (NLP).
Unlike other DL4NLP courses, we will take a whirlwind tour of
architectures (e.g., CNNs, RNNs, attention) in a few lectures. Then, we will devote
significant effort to learning structured prediction with Bayesian and Markov networks,
with applications to sequence labeling, syntactic parsing, and sentence generation. In this
process, we will also see how such traditional methods can be combined with, and improve,
a plain neural network.
No DL or NLP background is required; the course will be self-contained.
- Basic math (including algebra, calculus, probability theory, and statistics)
- Background machine learning knowledge (e.g., logistic regression, softmax classification)
- Coding skills. Students should be able to write code in at least one programming
language, although the course project will be in Python by default. Students should
also be able to implement algorithms by themselves, as well as make use of existing
packages (such as TensorFlow and PyTorch).
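As a gauge of the expected coding level, a minimal sketch (not course material; names are illustrative) of implementing a softmax classifier's forward pass from scratch with NumPy:

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max for numerical stability before exponentiating.
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

# A 3-class linear classifier: scores = x @ W + b
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))      # weights: 4 input features, 3 classes
b = np.zeros(3)                  # biases
x = rng.normal(size=(2, 4))      # a batch of 2 feature vectors
probs = softmax(x @ W + b)       # (2, 3): each row is a distribution over classes
```

Students comfortable writing this kind of code by hand, and the equivalent in TensorFlow or PyTorch, should be well prepared.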
- Neural network basics
- Classification tasks and classifiers
- Naive Bayes, logistic regression, softmax, etc.
- Deep Neural Networks
- Forward and backward propagation
- Embeddings: Representing Discrete Words
- Representing Structured Input
- Bayesian Networks
- HMM for sequential labeling
- Markov Networks & Conditional Random Fields
- Discrete Latent Space
- Reinforcement Learning in NLP
- Neural Relaxation for RL
- Sentence Generation
- Variational Autoencoder
- Sampling and Stochastic Searching
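To give a flavor of the structured-prediction topics above, here is a minimal sketch (illustrative only, not course code) of Viterbi decoding for HMM sequence labeling: given initial, transition, and emission log-probabilities, it recovers the most likely hidden state sequence by dynamic programming.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most likely hidden state sequence under an HMM.

    log_pi: (S,)   initial state log-probabilities
    log_A:  (S, S) transition log-probabilities, log_A[i, j] = log p(j | i)
    log_B:  (S, V) emission log-probabilities, log_B[s, v] = log p(v | s)
    obs:    list of observation indices
    """
    S, T = len(log_pi), len(obs)
    delta = np.empty((T, S))              # best log-prob of a path ending in state s at time t
    back = np.zeros((T, S), dtype=int)    # back-pointers for path recovery
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # (prev_state, cur_state)
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    # Follow back-pointers from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

For example, with two "sticky" states that each strongly prefer one of two symbols, the observation sequence [0, 0, 1, 1] decodes to the state sequence [0, 0, 1, 1].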
01. NLP Tasks and Linear Classification [slides]
02. Deep Neural Network [slides]
03. Word Embeddings and Language Modeling [slides]
04. CNNs, RNNs, etc. [slides]
05. Seq2Seq Models and Attention Mechanism [slides]
06. HMM [slides]
07. EM for HMM [slides]
08. MRF & CRF [slides]
09. Discrete Latent Variables [slides]
10. Sentence Generation [slides]
Note: Lectures for Part II were delivered on the whiteboard.