A Guide to Deep Learning

马嘉勋

2023-12-01

Deep learning is a fast-changing field at the intersection of computer science and mathematics. It is a relatively new branch of a wider field called machine learning. The goal of machine learning is to teach computers to perform various tasks based on the given data. This guide is for those who know some math and at least one programming language, and now want to dive deep into deep learning.

This is not a guide to:
• general machine learning
• big data processing
• data science
• deep reinforcement learning

Prerequisites

You must know standard university-level math. You can review those concepts in the first chapters of the book Deep learning.

You must know programming to develop and test deep learning models. We suggest using Python for machine learning. NumPy/SciPy libraries for scientific computing are required.

Justin Johnson's Python / NumPy / SciPy / Matplotlib tutorial for Stanford's CS231n ★
Scipy lecture notes - cover commonly used libraries in more detail and introduce more advanced topics ★★

When you are comfortable with the prerequisites, we suggest four options for studying deep learning. Choose any of them or any combination of them. The number of stars indicates the difficulty.

Hugo Larochelle's video course on YouTube. The videos were recorded in 2013 but most of the content is still fresh. The mathematics behind neural networks is explained in detail. Slides and related materials are available. ★★
Stanford's CS231n (Convolutional Neural Networks for Visual Recognition) by Fei-Fei Li, Andrej Karpathy and Justin Johnson. The course is focused on image processing, but covers most of the important concepts in deep learning. Videos (2016) and lecture notes are available. ★★
Michael Nielsen's online book Neural networks and deep learning is the easiest way to study neural networks. It doesn't cover all important topics, but contains intuitive explanations and code for the basic concepts. ★
Deep learning, a book by Ian Goodfellow, Yoshua Bengio and Aaron Courville, is the most comprehensive resource for studying deep learning. It covers a lot more than all the other courses combined. ★★★

There are many software frameworks that provide necessary functions, classes and modules for machine learning and for deep learning in particular. We suggest you not use these frameworks at the early stages of studying; instead, implement the basic algorithms from scratch. Most of the courses describe the maths behind the algorithms in enough detail that they can be easily implemented.

Jupyter notebooks are a convenient way to play with Python code. They are nicely integrated with matplotlib, a popular tool for visualizations. We suggest you implement algorithms in such environments. ★

Machine learning basics

Machine learning is the art and science of teaching computers based on data. It is a relatively established field at the intersection of computer science and mathematics, while deep learning is just a small subfield of it. The concepts and tools of machine learning are important for understanding deep learning.

Visual introduction to machine learning - decision trees ★
Andrew Ng's course on machine learning, the most popular course on Coursera ★★
Larochelle's course doesn't have separate introductory lectures for general machine learning, but all required concepts are defined and explained whenever needed.
1. Training and testing the models (kNN) ★★
2. Linear classification (SVM) ★★
3. Optimization (stochastic gradient descent) ★★
5. Machine learning basics ★★★
Principal Component Analysis explained visually ★
How to Use t-SNE Effectively ★★

Most of the popular machine learning algorithms are implemented in the Scikit-learn Python library. Implementing some of them from scratch helps with understanding how machine learning works.

Practical Machine Learning Tutorial with Python covers linear regression, k-nearest-neighbors and support vector machines. First it shows how to use them from scikit-learn, then implements the algorithms from scratch. ★
Andrew Ng's course on Coursera has many assignments in Octave language. The same algorithms can be implemented in Python. ★★
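
As an illustration of what "from scratch" can look like, here is a minimal sketch of a k-nearest-neighbors classifier in NumPy. It is not taken from any of the tutorials above; the function name `knn_predict` and the toy data are hypothetical, and the sketch assumes class labels are non-negative integers.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Label each query row by majority vote among its k nearest training points."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    diffs = X_query[:, None, :] - X_train[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Indices of the k closest training points for every query row.
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Majority vote over the neighbours' labels (assumes non-negative int labels).
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Hypothetical toy data: two well-separated clusters.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_query = np.array([[0.05, 0.05], [0.95, 1.0]])

predictions = knn_predict(X_train, y_train, X_query, k=3)
print(predictions)  # prints [0 1]
```

The brute-force distance matrix keeps the code short; library implementations typically use smarter data structures for large datasets.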

Neural networks basics

Neural networks are powerful machine learning algorithms. They form the basis of deep learning.

A Visual and Interactive Guide to the Basics of Neural Networks - shows how simple neural networks can do linear regression ★
1. Feedforward neural network ★★
2. Training neural networks (up to 2.7) ★★
4. Backpropagation ★★
5. Architecture of neural networks ★★
1. Using neural nets to recognize handwritten digits ★
2. How the backpropagation algorithm works ★
4. A visual proof that neural nets can compute any function ★
6. Deep feedforward networks ★★★
Yes you should understand backprop explains why it is important to implement backpropagation once from scratch ★★
Calculus on computational graphs: backpropagation ★★
Play with neural networks! ★

Try to implement a single layer neural network from scratch, including the training procedure.

Implementing softmax classifier and a simple neural network in pure Python/NumPy - Jupyter notebook available ★
Andrej Karpathy implements backpropagation in Javascript in his Hacker's guide to Neural Networks. ★
Implementing a neural network from scratch in Python ★
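
For orientation, a single-layer network with a softmax output and its training loop might be sketched as follows. This is one possible layout under simple assumptions (full-batch gradient descent, cross-entropy loss, zero initialization), not the approach of any particular resource above; `train_softmax` and the synthetic data are hypothetical.

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max so the exponentials cannot overflow.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_softmax(X, y, n_classes, lr=0.5, epochs=200):
    """Full-batch gradient descent on the mean cross-entropy loss."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]            # one-hot targets
    losses = []
    for _ in range(epochs):
        P = softmax(X @ W + b)          # forward pass: class probabilities
        losses.append(-np.mean(np.log(P[np.arange(n), y] + 1e-12)))
        G = (P - Y) / n                 # gradient of the loss w.r.t. the logits
        W -= lr * (X.T @ G)             # backward pass for the weights
        b -= lr * G.sum(axis=0)         # backward pass for the biases
    return W, b, losses

# Hypothetical data: two Gaussian blobs in 2D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.5, size=(50, 2)),
               rng.normal(1.0, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

W, b, losses = train_softmax(X, y, n_classes=2)
accuracy = np.mean(np.argmax(X @ W + b, axis=1) == y)
print(f"loss {losses[0]:.3f} -> {losses[-1]:.3f}, accuracy {accuracy:.2f}")
```

The key line is `G = (P - Y) / n`: for softmax combined with cross-entropy, the gradient with respect to the logits is simply the predicted probabilities minus the one-hot targets.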

Improving the way neural networks learn

It's not very easy to train neural networks. Sometimes they don't learn at all (underfitting), sometimes they learn exactly what you give them and their "knowledge" does not generalize to new, unseen data (overfitting). There are many ways to handle these problems.

2.8-2.11. Regularization, parameter initialization etc. ★★
7.5. Dropout ★★
6 (first half). Setting up the data and loss ★★
3. Improving the way neural networks learn ★
5. Why are deep neural networks hard to train? ★
7. Regularization for deep learning ★★★
8. Optimization for training deep models ★★★
11. Practical methodology ★★★
ConvNetJS Trainer demo on MNIST - visualizes the performance of different optimization algorithms ★
An overview of gradient descent optimization algorithms ★★★
Neural Networks, Manifolds, and Topology ★★★
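
Dropout, one of the regularization techniques listed above, is short enough to sketch directly. The version below is the common "inverted dropout" variant, in which activations are rescaled at training time so nothing changes at test time; the function names are hypothetical and the sketch assumes the caller stores the mask for the backward pass.

```python
import numpy as np

def dropout_forward(activations, keep_prob, rng, training=True):
    """Randomly zero units during training and rescale the survivors."""
    if not training:
        # At test time the layer is the identity.
        return activations, None
    # Each unit is kept with probability keep_prob; dividing by keep_prob
    # keeps the expected activation unchanged.
    mask = (rng.random(activations.shape) < keep_prob) / keep_prob
    return activations * mask, mask

def dropout_backward(grad_output, mask):
    """Gradients flow only through the units that were kept."""
    return grad_output * mask

rng = np.random.default_rng(0)
h = np.ones((1000, 100))                       # hypothetical hidden activations
h_train, mask = dropout_forward(h, keep_prob=0.8, rng=rng)
h_test, _ = dropout_forward(h, keep_prob=0.8, rng=rng, training=False)
print(h_train.mean(), h_test.mean())
```

Because of the rescaling, the mean activation during training stays close to the test-time mean, which is why no correction is needed at inference.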

There are many frameworks that provide the standard algorithms and are optimised for good performance on modern hardware. Most of these frameworks have interfaces for Python with the notable exception of Torch, which requires Lua. Once you know how basic learning algorithms are implemented under the hood, it's time to choose a framework to build on.

Theano provides low-level primitives for constructing all kinds of neural networks. It is maintained by a machine learning group at the University of Montreal. See also: Speeding up your neural network with Theano and the GPU - Jupyter notebook available ★
TensorFlow is another low-level framework. Its architecture is similar to Theano. It is maintained by the Google Brain team.
Torch is a popular framework that uses the Lua language. The main disadvantage is that Lua's community is not as large as Python's. Torch is mostly maintained by Facebook and Twitter.

There are also higher-level frameworks that run on top of these:

Lasagne is a higher level framework built on top of Theano. It provides simple functions to create large networks with few lines of code.
Keras is a higher level framework that works on top of either Theano or TensorFlow.
If you need more guidance on which framework is right for you, see Lecture 12 of Stanford's CS231n. ★★

Convolutional neural networks

Convolutional networks ("CNNs" or "ConvNets") are a special kind of neural network that uses several clever tricks to learn faster and better. ConvNets essentially revolutionized computer vision and are heavily used in speech recognition and text classification as well.

9. Computer vision (up to 9.9) ★★
6 (second half). Intro to ConvNets ★★
7. Convolutional neural networks ★★
8. Localization and detection ★★
9. Visualization, Deep dream, Neural style, Adversarial examples ★★
13. Image segmentation (up to 38:00) includes upconvolutions ★★
6. Deep learning ★
9. Convolutional networks ★★★
Image Kernels explained visually - shows how convolutional filters (also known as image kernels) transform the image ★
ConvNetJS MNIST demo - live visualization of a convolutional network right in the browser ★
Conv Nets: A Modular Perspective ★★
Understanding Convolutions ★★★
Understanding Convolutional neural networks for NLP ★★
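
The core operation of a convolutional layer can be sketched in a few lines of NumPy. The example below slides a single filter over a single-channel image with no padding and stride 1; like most deep learning libraries it computes cross-correlation (the kernel is not flipped). The function name `conv2d_valid` and the toy image are hypothetical, and real layers add multiple channels, biases and much faster implementations.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image and sum the elementwise products."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output value depends only on a small local patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical 5x5 image: dark left half, bright right half.
image = np.zeros((5, 5))
image[:, 2:] = 1.0

# A simple filter that responds to left-to-right brightness increases.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

response = conv2d_valid(image, kernel)
print(response)
# [[3. 3. 0.]
#  [3. 3. 0.]
#  [3. 3. 0.]]
```

The filter responds strongly where its window straddles the vertical edge and gives zero over the uniform bright region. The same nine weights are reused at every position, which is the weight sharing that makes these networks efficient.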

Convolutional networks are implemented in every major framework. It is usually easier to understand the code that is written using higher level libraries.

Theano: Convolutional Neural Networks (LeNet) ★★
Using Lasagne for training Deep Neural Networks ★
Detecting diabetic retinopathy in eye images - a blog post by one of the top performers in the Diabetic retinopathy detection contest on Kaggle. Includes a good example of data augmentation. ★★
Face recognition for right whales using deep learning - the authors used different ConvNets for localization and classification. Code and models are available. ★★
Tensorflow: Convolutional neural networks for image classification on CIFAR-10 dataset ★★
Implementing a CNN for text classification in Tensorflow ★★
DeepDream implementation in TensorFlow ★★★
92.45% on CIFAR-10 in Torch - implements famous VGGNet network with batch normalization layers in Torch ★
Training and investigating Residual Nets - Residual networks perform very well on image classification tasks. Two researchers from Facebook and Cornell Tech implemented these networks in Torch ★★★
ConvNets in practice - lots of practical tips on using convolutional networks including data augmentation, transfer learning, fast implementations of convolution operation ★★

Recurrent neural networks

Recurrent networks ("RNNs") are designed to work with sequences. Usually they are used for sentence classification (e.g. sentiment analysis) and speech recognition, but also for text generation and even image generation.

The Unreasonable Effectiveness of Recurrent Neural Networks - describes how RNNs can generate text, math papers and C++ code ★
Hugo Larochelle's course doesn't cover recurrent neural networks (although it covers many topics that RNNs are used for). We suggest watching Recurrent Neural Nets and LSTMs by Nando de Freitas to fill the gap ★★
10. Recurrent Neural Networks, Image Captioning, LSTM ★★
13. Soft attention (starting at 38:00) ★★
Michael Nielsen's book stops at convolutional networks. In the Other approaches to deep neural nets section there is just a brief review of simple recurrent networks and LSTMs. ★
10. Sequence Modeling: Recurrent and Recursive Nets ★★★
Recurrent neural networks from Stanford's CS224d (2016) by Richard Socher ★★
Understanding LSTM Networks ★★
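
To make "working with sequences" concrete, here is a minimal sketch of the forward pass of a plain (vanilla) recurrent network in NumPy. It only shows how the hidden state is carried from one time step to the next; training, output layers and LSTM gating are left out. All names, sizes and the random weights are hypothetical.

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h, h0):
    """Run a vanilla RNN over a sequence and return every hidden state."""
    h = h0
    states = []
    for x_t in inputs:
        # The new state mixes the current input with the previous state.
        h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
seq_len, n_in, n_hidden = 4, 3, 5
inputs = rng.normal(size=(seq_len, n_in))           # one sequence of 4 steps
W_xh = rng.normal(scale=0.1, size=(n_in, n_hidden))
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
b_h = np.zeros(n_hidden)

states = rnn_forward(inputs, W_xh, W_hh, b_h, np.zeros(n_hidden))
print(states.shape)  # (4, 5)
```

The same weight matrices are applied at every time step, so the network can process sequences of any length with a fixed number of parameters.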

Recurrent neural networks are also implemented in every modern framework.

Theano: Recurrent Neural Networks with Word Embeddings ★★★
Theano: LSTM Networks for Sentiment Analysis ★★★
Implementing a RNN with Python, Numpy and Theano ★★
Lasagne implementation of Karpathy's char-rnn ★
Combining CNN and RNN for spoken language identification in Lasagne ★
Automatic transliteration with LSTM using Lasagne ★
Tensorflow: Recurrent Neural Networks for language modeling ★★
Recurrent Neural Networks in Tensorflow ★★
Understanding and Implementing Deepmind's DRAW Model ★★★
LSTM implementation explained ★★
Torch implementation of Karpathy's char-rnn ★★★

Autoencoders

Autoencoders are neural networks designed for unsupervised learning, i.e. when the data is not labeled. They can be used for dimension reduction, pretraining of other neural networks, for data generation etc. Here we also include resources about an interesting hybrid of autoencoders and graphical models called variational autoencoders, although their mathematical basis is not introduced until the next section.

6. Autoencoder ★★
7.6. Deep autoencoder ★★
14. Videos and unsupervised learning (from 32:29) - this video also touches on the exciting topic of generative adversarial networks. ★★
14. Autoencoders ★★★
ConvNetJS Denoising Autoencoder demo ★
Karol Gregor on Variational Autoencoders and Image Generation ★★

Most autoencoders are pretty easy to implement. We suggest you try to implement one before looking at complete examples.

Theano: Denoising autoencoders ★★
Diving Into TensorFlow With Stacked Autoencoders ★★
Variational Autoencoder in TensorFlow ★★
Training Autoencoders on ImageNet Using Torch 7 ★★
Building autoencoders in Keras ★
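
As a starting point for your own implementation, a small autoencoder can be sketched in NumPy as below. It assumes a single tanh hidden layer, a linear decoder, squared reconstruction error and full-batch gradient descent; the function name, hyperparameters and synthetic data (5-dimensional points that actually lie on a 2-dimensional plane) are hypothetical choices made for illustration.

```python
import numpy as np

def train_autoencoder(X, n_hidden, lr=0.1, epochs=500, seed=0):
    """Train a one-hidden-layer autoencoder to reconstruct its own input."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, size=(d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, size=(n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)        # encoder: compress to n_hidden values
        X_hat = H @ W2 + b2             # decoder: reconstruct the input
        err = X_hat - X
        losses.append(np.mean(np.sum(err ** 2, axis=1)))
        # Backpropagate the mean squared reconstruction error.
        dX_hat = 2.0 * err / n
        dW2 = H.T @ dX_hat
        db2 = dX_hat.sum(axis=0)
        dZ = (dX_hat @ W2.T) * (1.0 - H ** 2)   # derivative of tanh
        dW1 = X.T @ dZ
        db1 = dZ.sum(axis=0)
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return (W1, b1, W2, b2), losses

# Hypothetical data: 2 latent factors mixed linearly into 5 observed features.
rng = np.random.default_rng(1)
Z = rng.normal(0.0, 0.5, size=(200, 2))
A = 0.5 * np.array([[1.0, 0.0, 1.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0, 1.0, -1.0]])
X = Z @ A

params, losses = train_autoencoder(X, n_hidden=2)
print(f"reconstruction error {losses[0]:.4f} -> {losses[-1]:.4f}")
```

No labels are used anywhere: the input itself is the training target, and the narrow hidden layer forces the network to find a compact representation.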

Probabilistic graphical models

Probabilistic graphical models (“PGMs”) form a separate subfield at the intersection of statistics and machine learning. There are many books and courses on PGMs in general. Here we present how these models are applied in the context of deep learning. Hugo Larochelle's course describes a few famous models, while the book Deep Learning devotes four chapters (16-19) to the theory and describes more than a dozen models in the last chapter. These topics require a lot of mathematics.

3. Conditional Random Fields ★★★
4. Training CRFs ★★★
5. Restricted Boltzmann machine ★★★
7.7-7.9. Deep Belief Networks ★★★
9.10. Convolutional RBM ★★★
13. Linear Factor Models - first steps towards probabilistic models ★★★
16. Structured Probabilistic Models for Deep Learning ★★★
17. Monte Carlo Methods ★★★
18. Confronting the Partition Function ★★★
19. Approximate Inference ★★★
20. Deep Generative Models - includes Boltzmann machines (RBM, DBN, ...), variational autoencoders, generative adversarial networks, autoregressive models etc. ★★★
Generative models - a blog post on variational autoencoders, generative adversarial networks and their improvements by OpenAI. ★★★
The Neural Network Zoo attempts to organize lots of architectures using a single scheme. ★★

Higher level frameworks (Lasagne, Keras) do not implement graphical models, but there is a lot of code for Theano, TensorFlow and Torch.

Restricted Boltzmann Machines in Theano ★★★
Deep Belief Networks in Theano ★★★
Generating Large Images from Latent Vectors - uses a combination of variational autoencoders and generative adversarial networks. ★★★
Image Completion with Deep Learning in TensorFlow - another application of generative adversarial networks. ★★★
Generating Faces with Torch - Torch implementation of Generative Adversarial Networks ★★

The state of the art

Deep learning is a very active area of scientific research. To follow the state of the art one has to read new papers and follow important conferences. Usually every new idea is announced in a preprint paper on arxiv.org. Then some of them are submitted to conferences and are peer reviewed. The best of them are presented at conferences and published in journals. If the authors do not release code for their models, many people attempt to implement them and put them on GitHub. It takes a year or two before high quality blog posts, tutorials and videos appear on the web that properly explain the ideas and implementations.

Deep learning papers reading roadmap contains a long list of important papers.
Arxiv Sanity Preserver is a nice UI for browsing papers from arXiv.
Videolectures.net contains lots of videos on advanced topics.
/r/MachineLearning is a very active subreddit. All major new papers are discussed there.

We are going to keep this guide up to date.

If you find broken links or any other problems, please report an issue on GitHub.
Last updated on December 26, 2016

Original website: http://yerevann.com/a-guide-to-deep-learning/
