CMPUT 651: Topics in Artificial Intelligence

Deep Learning for NLP

Fall 2019

Instructor: Lili Mou

Course Description

This course introduces deep learning (DL) techniques for natural language processing (NLP).
Unlike other DL4NLP courses, we will take a whirlwind tour of all neural
architectures (e.g., CNNs, RNNs, attention) in a few lectures. Then, we will devote
significant effort to structured prediction using Bayesian and Markov networks,
with applications to sequential labeling, syntactic parsing, and sentence generation. In this
process, we will also see how such traditional methods can be combined with, and improve,
a plain neural network.

Prerequisites

Basic math (including algebra, calculus, probability theory, and statistics)
Background knowledge of machine learning (e.g., logistic regression, softmax classification)
Coding skills. Students should be able to write code in at least one programming
language, although the course project will be in Python by default. Students should
also be able to implement algorithms themselves, as well as make use of existing
packages (such as TensorFlow and PyTorch).

No DL or NLP background is required. The course covers both in a self-contained manner.

Syllabus

Neural network basics

Classification tasks and classifiers

Naive Bayes, logistic regression, softmax, etc.

Deep Neural Networks

Forward and backward propagation

Embeddings: Representing Discrete Words
Representing Structured Input

CNNs, RNNs, attention

Structured Prediction

Bayesian Networks

HMM for sequential labeling

Markov Networks & Conditional Random Fields

Advanced Topics

Discrete Latent Space

Reinforcement Learning in NLP
Neural Relaxation for RL

Sentence Generation

Variational Autoencoder
Sampling and Stochastic Searching

Lectures

01. NLP Tasks and Linear Classification [slides]
02. Deep Neural Network [slides]
03. Word Embeddings and Language Modeling [slides]
04. CNNs, RNNs, etc. [slides]
05. Seq2Seq Models and Attention Mechanism [slides]
06. hmm [slides]
07. em...hmm [slides]
08. MRF & CRF [slides]
09. Discrete Latent Variables [slides]
10. Sentence Generation [slides]

Note: Lectures for Part II were derived on the whiteboard.