Questionnaire
- What is a “hook” in PyTorch?
- Which layer does CAM use the outputs of?
- Why does CAM require a hook?
- Look at the source code of the `ActivationStats` class and see how it uses hooks.
- Write a hook that stores the activations of a given layer in a model (without peeking, if possible).
- Why do we call `eval` before getting the activations? Why do we use `no_grad`?
- Use `torch.einsum` to compute the “dog” or “cat” score of each of the locations in the last activation of the body of the model.
- How do you check which order the categories are in (i.e., the correspondence of index->category)?
- Why are we using `decode` when displaying the input image?
- What is a “context manager”? What special methods need to be defined to create one?
- Why can’t we use plain CAM for the inner layers of a network?
- Why do we need to register a hook on the backward pass in order to do Grad-CAM?
- Why can’t we call `output.backward()` when `output` is a rank-2 tensor of output activations per image per class?
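Several of the questions above touch on forward hooks, context managers, and `torch.einsum`. As a starting point for checking your own answers, here is a minimal sketch that combines the three. The class name `StoreActivations` and the toy `body`/`head` model are hypothetical stand-ins, not the chapter's own code or model; the only PyTorch calls assumed are `register_forward_hook`, the `remove` method of the handle it returns, and `torch.einsum`.

```python
import torch
from torch import nn

class StoreActivations:
    """Hypothetical helper: records a layer's output on each forward pass."""
    def __init__(self, layer):
        self.stored = None
        # register_forward_hook returns a handle; handle.remove() detaches the hook
        self.handle = layer.register_forward_hook(self._hook)

    def _hook(self, module, inputs, output):
        # Detach so the stored copy does not keep the autograd graph alive
        self.stored = output.detach().clone()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always remove the hook, even if the forward pass raised an error
        self.handle.remove()

# Toy model (an assumption for illustration): conv "body", pooled linear "head"
torch.manual_seed(0)
body = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU())
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2))
model = nn.Sequential(body, head).eval()

x = torch.randn(1, 3, 16, 16)
with StoreActivations(body) as hook, torch.no_grad():
    logits = model(x)

act = hook.stored[0]         # shape (8, 16, 16): channels x rows x cols
weights = head[-1].weight    # shape (2, 8): classes x channels
# Per-class score at every spatial location of the body's output
cam_map = torch.einsum('ck,kij->cij', weights, act)   # shape (2, 16, 16)
```

Because this toy head is just average pooling followed by a linear layer, averaging `cam_map` over its spatial dimensions and adding the bias reproduces `logits`, which is a handy sanity check when you write your own version.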
Further Research
- Try removing `keepdim` and see what happens. Look up this parameter in the PyTorch docs. Why do we need it in this notebook?
- Create a notebook like this one, but for NLP, and use it to find which words in a movie review are most significant in assessing the sentiment of a particular movie review.
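If you want a quick way to begin the `keepdim` experiment outside the notebook, the sketch below compares tensor shapes with and without it. It assumes the weights are a per-channel mean of the gradients multiplied against the activations; the random `act` and `grad` tensors are stand-ins, not real model outputs.

```python
import torch

torch.manual_seed(0)
act = torch.randn(8, 4, 5)    # stand-in activations: channels x rows x cols
grad = torch.randn(8, 4, 5)   # stand-in gradients with the same shape

w_keep = grad.mean(dim=[1, 2], keepdim=True)   # shape (8, 1, 1)
w_flat = grad.mean(dim=[1, 2])                 # shape (8,)

# With keepdim, each channel's weight broadcasts over that channel's 4x5 grid
cam_map = (w_keep * act).sum(0)                # shape (4, 5)

# Without keepdim, broadcasting lines the 8 weights up against the last
# axis (size 5), so the multiplication fails for these shapes
try:
    (w_flat * act).sum(0)
    broadcast_ok = True
except RuntimeError:
    broadcast_ok = False
```

Note that if the number of columns happened to equal the number of channels, the second multiplication would not raise an error; it would silently weight the wrong axis instead, which is harder to spot.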