TTIC 31230: Fundamentals of Deep Learning, Winter 2019

Pretraining

Slides

Problems

Supervised Pretraining for Vision:

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, Girshick et al., 2013

Exploring the Limits of Weakly Supervised Pretraining, Mahajan et al., May 2018

Rethinking ImageNet Pretraining, He et al., Nov. 2018

Unsupervised Pretraining for Vision:

Context Encoders: Feature Learning by Inpainting, Pathak et al., April 2016

Learning Representations for Automatic Colorization, Larsson et al., March 2016

Representation Learning with Contrastive Predictive Coding (CPC), van den Oord et al., July 2018

Learning Deep Representations by Mutual Information Estimation and Maximization, Hjelm et al., Aug. 2018

Difference of Entropies Mutual Information Pretraining:

Information Theoretic Co-Training, McAllester, Feb. 2018

Formal Limitations on the Measurement of Mutual Information, McAllester and Stratos, Nov. 2018

Unsupervised Pretraining for Language:

Efficient Estimation of Word Representations in Vector Space (Word2Vec), Mikolov et al., 2013

Advances in Pre-Training Distributed Word Representations (FastText, BPE), Mikolov et al., 2017

Attention is All You Need (The Transformer), Vaswani et al., June 2017

Deep contextualized word representations (ELMo), Peters et al., Feb. 2018

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin et al., Oct. 2018

Language Models are Unsupervised Multitask Learners (GPT-2), Radford et al., Feb. 2019

Unsupervised Machine Translation (UMT):

Unsupervised Neural Machine Translation, Artetxe et al., Oct. 2017

Unsupervised Machine Translation Using Monolingual Corpora Only, Lample et al., Oct. 2017

An Effective Approach to Unsupervised Machine Translation, Artetxe et al., Feb. 2019