来源:fastai
浏览 507
扫码
分享
2021-03-31 08:23:57
Questionnaire
- What is the equation for a step of SGD, in math or code (as you prefer)?
- What do we pass to
cnn_learner
to use a non-default optimizer? - What are optimizer callbacks?
- What does
zero_grad
do in an optimizer? - What does
step
do in an optimizer? How is it implemented in the general optimizer? - Rewrite
sgd_cb
to use the +=
operator, instead of add_
. - What is “momentum”? Write out the equation.
- What’s a physical analogy for momentum? How does it apply in our model training settings?
- What does a bigger value for momentum do to the gradients?
- What are the default values of momentum for 1cycle training?
- What is RMSProp? Write out the equation.
- What do the squared values of the gradients indicate?
- How does Adam differ from momentum and RMSProp?
- Write out the equation for Adam.
- Calculate the values of
unbias_avg
and w.avg
for a few batches of dummy values. - What’s the impact of having a high
eps
in Adam? - Read through the optimizer notebook in fastai’s repo, and execute it.
- In what situations do dynamic learning rate methods like Adam change the behavior of weight decay?
- What are the four steps of a training loop?
- Why is using callbacks better than writing a new training loop for each tweak you want to add?
- What aspects of the design of fastai’s callback system make it as flexible as copying and pasting bits of code?
- How can you get the list of events available to you when writing a callback?
- Write the
ModelResetter
callback (without peeking). - How can you access the necessary attributes of the training loop inside a callback? When can you use or not use the shortcuts that go with them?
- How can a callback influence the control flow of the training loop?
- Write the
TerminateOnNaN
callback (without peeking, if possible). - How do you make sure your callback runs after or before another callback?
Further Research
- Look up the “Rectified Adam” paper, implement it using the general optimizer framework, and try it out. Search for other recent optimizers that work well in practice, and pick one to implement.
- Look at the mixed-precision callback with the documentation. Try to understand what each event and line of code does.
- Implement your own version of the learning rate finder from scratch. Compare it with fastai’s version.
- Look at the source code of the callbacks that ship with fastai. See if you can find one that’s similar to what you’re looking to do, to get some inspiration.