1.Why do we first resize to a large size on the CPU, and then to a smaller size on the GPU?
First, when training a model we want all the images in a mini-batch to have the same size, so they can be collated into a tensor and passed to the GPU, and we want to minimize the number of distinct augmentation computations we perform. Second, the common augmentation transforms can introduce spurious empty zones and degrade the data when applied to small images; resizing to a large size on the CPU first leaves a margin, so the augmentations and the final resize to a smaller size can be composed and run once on the GPU with minimal loss of quality (fastai calls this presizing; see the sketch below).
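A minimal sketch of presizing, following the book's pet-breeds example (`path` is assumed to point at the Oxford-IIIT Pet dataset as in the chapter):

```python
from fastai.vision.all import *

# Presizing: resize to a large size per item on the CPU (item_tfms), then
# augment and resize down to the final size in one step on the GPU (batch_tfms).
pets = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(seed=42),
    get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),
    item_tfms=Resize(460),
    batch_tfms=aug_transforms(size=224, min_scale=0.75))
dls = pets.dataloaders(path/"images")
```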
2.If you are not familiar with regular expressions, find a regular expression tutorial, and some problem sets, and complete them. Have a look on the book’s website for suggestions.
Regular expressions (a practice exercise: work through a tutorial and some problem sets; the sketch below shows the kind of pattern used in this chapter).
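A minimal example with Python's built-in `re` module, using a hypothetical filename in the style of the Oxford-IIIT Pet dataset:

```python
import re

# Extract the breed label from a filename like 'great_pyrenees_173.jpg':
# (.+) captures everything before the trailing underscore, digits, and extension.
fname = 'great_pyrenees_173.jpg'
print(re.findall(r'(.+)_\d+.jpg$', fname))  # ['great_pyrenees']
```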
3.What are the two ways in which data is most commonly provided, for most deep learning datasets?
Data is usually provided in one of these two ways:
Individual files representing items of data, such as text documents or images, possibly organized into folders or with filenames representing information about those items
A table of data, such as in CSV format, where each row is an item which may include filenames providing a connection between the data in the table and data in other formats, such as text documents and images
4.Look up the documentation for L and try using a few of the new methods that it adds.
The L class (from fastcore) does indeed add a lot of functionality on top of Python's list; see the sketch below.
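A small sketch of a few L conveniences (as described in the fastcore docs):

```python
from fastcore.foundation import L

t = L(range(12))
print(t)                       # (#12) [0,1,2,...] -- display is truncated for long lists
print(t[2, 5, 7])              # index with several indices at once
print(t.map(lambda x: x * 2))  # map returns another L
```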
5.Look up the documentation for the Python pathlib module and try using a few methods of the Path class.
https://docs.python.org/zh-cn/3/library/pathlib.html
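A few Path methods worth trying (the directory and file names here are hypothetical):

```python
from pathlib import Path

p = Path('data/images')
print(p.exists(), p.is_dir())  # whether the path exists / is a directory
print((p/'cat.jpg').suffix)    # '.jpg' -- the / operator joins path segments
print(list(p.glob('*.jpg')))   # all .jpg files directly inside p
```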
6.Give two examples of ways that image transformations can degrade the quality of the data.
(1) Rotation leaves empty zones at the corners, which contain no real data.
(2) Resizing requires interpolation, which blurs the image and loses detail.
7.What method does fastai provide to view the data in a DataLoaders?
DataLoaders.show_batch
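For example, assuming a DataLoaders named `dls` (such as the one built in the presizing sketch above):

```python
# Show a few items from one batch, after all transforms have been applied.
dls.show_batch(max_n=4, nrows=1)
```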
8.What method does fastai provide to help you debug a DataBlock?
DataBlock.summary
If you made a mistake while building your DataBlock, it is very likely you won’t see it before this step. To debug this, we encourage you to use the summary method. It will attempt to create a batch from the source you give it, with a lot of details. Also, if it fails, you will see exactly at which point the error happens, and the library will try to give you some help. For instance, one common mistake is to forget to use a Resize transform, so you end up with pictures of different sizes and are not able to batch them. Here is what the summary would look like in that case (note that the exact text may have changed since the time of writing, but it will give you an idea):
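Following the book's example, summary is called on the DataBlock with the data source (here a pets DataBlock with the Resize transform deliberately missing, so the call fails with a detailed trace):

```python
pets1 = DataBlock(blocks=(ImageBlock, CategoryBlock),
                  get_items=get_image_files,
                  splitter=RandomSplitter(seed=42),
                  get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'))
pets1.summary(path/"images")  # fails: items have different sizes and can't be batched
```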
9.Should you hold off on training a model until you have thoroughly cleaned your data?
No. It is usually better to train a model early: the resulting model gives you a baseline, and its outputs (for example via plot_top_losses and ImageClassifierCleaner) actually help you find and fix problems in the data.
10.What are the two pieces that are combined into cross-entropy loss in PyTorch?
A combination of a softmax function and negative log likelihood loss (PyTorch implements it as log_softmax followed by nll_loss; see the sketch below).
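A quick check of that equivalence with F.cross_entropy:

```python
import torch
import torch.nn.functional as F

acts = torch.randn(4, 3)            # activations: batch of 4 items, 3 classes
targs = torch.tensor([0, 2, 1, 2])  # integer class targets

# cross_entropy == log_softmax followed by nll_loss
loss_a = F.cross_entropy(acts, targs)
loss_b = F.nll_loss(F.log_softmax(acts, dim=1), targs)
print(torch.isclose(loss_a, loss_b))  # tensor(True)
```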
11.What are the two properties of activations that softmax ensures? Why is this important?
In our classification model, we use the softmax activation function in the final layer to ensure that the activations are all between 0 and 1, and that they sum to 1. This is important because it lets us interpret the outputs as a probability distribution over the classes, and softmax tends to push the largest activation toward 1, which suits picking a single predicted class.
12.When might you want your activations to not have these two properties?
In multi-label classification, where an item can belong to several categories at once (or to none), so the per-class activations should not be forced to sum to 1; a per-class sigmoid is used instead.
13.Calculate the exp and softmax columns of <> yourself (i.e., in a spreadsheet, with a calculator, or in a notebook).
A simple hand calculation; a sketch is below.
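A sketch using the three output activations from the book's example table (values assumed from the chapter):

```python
import torch

acts = torch.tensor([0.02, -2.49, 1.25])  # the 'output' column
exp = acts.exp()                          # the 'exp' column: ~[1.02, 0.08, 3.49]
softmax = exp / exp.sum()                 # the 'softmax' column: ~[0.22, 0.02, 0.76]
print(exp, softmax, softmax.sum())        # the softmax column sums to 1
```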
14.Why can’t we use torch.where to create a loss function for datasets where our label can have more than two categories?
torch.where(condition, a, b) can only select between two alternatives, so it only works when the label has exactly two categories. With more than two categories we instead index into the (log) softmax activations with the target labels, which is what negative log likelihood does.
15.What is the value of log(-2)? Why?
It is undefined (PyTorch returns nan). The logarithm is only defined for positive numbers: e^x is positive for every x, so no exponent can yield −2.
16.What are two good rules of thumb for picking a learning rate from the learning rate finder?
(1) Minimum/10: one order of magnitude less than the point where the minimum loss was achieved.
(2) The steepest point: the last point where the loss was still clearly and steeply decreasing.
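A sketch of reading both suggestions from the learning rate finder, assuming the fastai version used in the book (where lr_find returns these two values directly) and an existing `dls`:

```python
learn = cnn_learner(dls, resnet34, metrics=error_rate)
lr_min, lr_steep = learn.lr_find()  # minimum/10 and steepest-point suggestions
print(f"minimum/10: {lr_min:.2e}, steepest point: {lr_steep:.2e}")
```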
17.What two steps does the fine_tune method do?
When we call the fine_tune method fastai does two things:
Trains the randomly added layers for one epoch, with all other layers frozen
Unfreezes all of the layers, and trains them all for the number of epochs requested
18.In Jupyter Notebook, how do you get the source code for a method or function?
Append ?? to the name and run the cell, e.g. learn.fine_tune?? shows the full source code (a single ?, e.g. learn.fine_tune?, shows just the signature and docstring).
19.What are discriminative learning rates?
Using different learning rates for different layers of the model: a lower learning rate for the early layers (which hold general pretrained features that need little change) and a higher learning rate for the later layers (especially the randomly initialized head, which needs to change the most).
20.How is a Python slice object interpreted when passed as a learning rate to fastai?
The first value passed will be the learning rate in the earliest layer of the neural network, and the second value will be the learning rate in the final layer. The layers in between will have learning rates that are multiplicatively equidistant throughout that range.
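Following the book, a slice is passed as lr_max during fine-tuning (`learn` is assumed to be an existing Learner):

```python
# Early layers train at 1e-6, the final layer at 1e-4, and the layers in
# between get multiplicatively equidistant rates across that range.
learn.fit_one_cycle(12, lr_max=slice(1e-6, 1e-4))
```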
21.Why is early stopping a poor choice when using 1cycle training?
Before the days of 1cycle training it was very common to save the model at the end of each epoch, and then select whichever model had the best accuracy out of all of the models saved in each epoch. This is known as early stopping. However, this is very unlikely to give you the best answer, because those epochs in the middle occur before the learning rate has had a chance to reach the small values, where it can really find the best result. Therefore, if you find that you have overfit, what you should actually do is retrain your model from scratch, and this time select a total number of epochs based on where your previous best results were found.
22.What is the difference between resnet50 and resnet101?
resnet101 is a larger version of the same architecture: 101 layers instead of 50. It has more parameters and typically achieves higher accuracy, at the cost of more memory and longer training time.
23.What does to_fp16 do?
One technique that can speed things up a lot is mixed-precision training. This refers to using less-precise numbers (half-precision floating point, also called fp16) where possible during training.
That is, it tells fastai to train with lower-precision (fp16) numbers where possible, which speeds up training and reduces GPU memory use.
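As in the book, mixed precision is enabled by appending .to_fp16() to the Learner (an existing `dls` is assumed):

```python
from fastai.vision.all import *

# Train resnet50 in mixed precision: fp16 where possible during training.
learn = cnn_learner(dls, resnet50, metrics=error_rate).to_fp16()
learn.fine_tune(6, freeze_epochs=3)
```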