Deep Learning for Collaborative Filtering
To turn our architecture into a deep learning model, the first step is to take the results of the embedding lookup and concatenate those activations together. This gives us a matrix which we can then pass through linear layers and nonlinearities in the usual way.
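To see what this concatenation looks like in isolation, here’s a tiny illustrative shape check (the sizes are made up, and we assume the usual fastai/PyTorch imports):

In [ ]:
u_act = torch.randn(4, 3)   # activations for 4 users, 3 latent factors each
m_act = torch.randn(4, 5)   # activations for 4 movies, 5 latent factors each
torch.cat([u_act, m_act], dim=1).shape   # torch.Size([4, 8]), ready for nn.Linear(8, ...)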
Since we’ll be concatenating the embeddings, rather than taking their dot product, the two embedding matrices can have different sizes (i.e., different numbers of latent factors). fastai has a function `get_emb_sz` that returns recommended sizes for embedding matrices for your data, based on a heuristic that fast.ai has found tends to work well in practice:
In [ ]:
embs = get_emb_sz(dls)
embs
Out[ ]:
[(944, 74), (1635, 101)]
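Each tuple is (number of categories, recommended embedding size). Under the hood, fastai applies a simple rule of thumb to the cardinality of each categorical variable. Here is a sketch of that rule, based on fastai’s `emb_sz_rule` (the exact constants may change between library versions):

In [ ]:
def emb_sz_rule(n_cat):
    # fastai's heuristic: grow with cardinality, but cap the size at 600
    return min(600, round(1.6 * n_cat**0.56))

emb_sz_rule(944), emb_sz_rule(1635)   # (74, 101), matching the output above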
Let’s implement this architecture as a class:
In [ ]:
class CollabNN(Module):
    def __init__(self, user_sz, item_sz, y_range=(0,5.5), n_act=100):
        self.user_factors = Embedding(*user_sz)
        self.item_factors = Embedding(*item_sz)
        self.layers = nn.Sequential(
            nn.Linear(user_sz[1]+item_sz[1], n_act),
            nn.ReLU(),
            nn.Linear(n_act, 1))
        self.y_range = y_range

    def forward(self, x):
        # Look up the embeddings for the user and item IDs in each row of the batch
        embs = self.user_factors(x[:,0]), self.item_factors(x[:,1])
        # Concatenate the two sets of activations and pass them through the layers
        x = self.layers(torch.cat(embs, dim=1))
        # Scale the output to the target rating range
        return sigmoid_range(x, *self.y_range)
And use it to create a model:
In [ ]:
model = CollabNN(*embs)
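As a quick sanity check (a sketch, not part of the original notebook), we can run one batch through the untrained model and confirm that we get one prediction per row, squashed into `y_range`:

In [ ]:
x, y = dls.one_batch()
preds = model.to(x.device)(x)   # move the model to the batch's device first
preds.shape, preds.min().item(), preds.max().item()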
`CollabNN` creates our `Embedding` layers in the same way as previous classes in this chapter, except that we now use the `embs` sizes. `self.layers` is identical to the mini-neural net we created in <> for MNIST. Then, in `forward`, we apply the embeddings, concatenate the results, and pass this through the mini-neural net. Finally, we apply `sigmoid_range` as we have in previous models.
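For reference, `sigmoid_range` just rescales a sigmoid to the given interval; it is equivalent to this one-liner (matching fastai’s definition at the time of writing):

In [ ]:
def sigmoid_range(x, low, high):
    # sigmoid squashes x into (0, 1); rescale and shift that into (low, high)
    return torch.sigmoid(x) * (high - low) + low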
Let’s see if it trains:
In [ ]:
learn = Learner(dls, model, loss_func=MSELossFlat())
learn.fit_one_cycle(5, 5e-3, wd=0.01)
| epoch | train_loss | valid_loss | time |
|---|---|---|---|
| 0 | 0.940104 | 0.959786 | 00:15 |
| 1 | 0.893943 | 0.905222 | 00:14 |
| 2 | 0.865591 | 0.875238 | 00:14 |
| 3 | 0.800177 | 0.867468 | 00:14 |
| 4 | 0.760255 | 0.867455 | 00:14 |
fastai provides this model in `fastai.collab` if you pass `use_nn=True` in your call to `collab_learner` (including calling `get_emb_sz` for you), and it lets you easily create more layers. For instance, here we’re creating two hidden layers, of size 100 and 50, respectively:
In [ ]:
learn = collab_learner(dls, use_nn=True, y_range=(0, 5.5), layers=[100,50])
learn.fit_one_cycle(5, 5e-3, wd=0.1)
| epoch | train_loss | valid_loss | time |
|---|---|---|---|
| 0 | 1.002747 | 0.972392 | 00:16 |
| 1 | 0.926903 | 0.922348 | 00:16 |
| 2 | 0.877160 | 0.893401 | 00:16 |
| 3 | 0.838334 | 0.865040 | 00:16 |
| 4 | 0.781666 | 0.864936 | 00:16 |
`learn.model` is an object of type `EmbeddingNN`. Let’s take a look at fastai’s code for this class:
In [ ]:
@delegates(TabularModel)
class EmbeddingNN(TabularModel):
    def __init__(self, emb_szs, layers, **kwargs):
        super().__init__(emb_szs, layers=layers, n_cont=0, out_sz=1, **kwargs)
Wow, that’s not a lot of code! This class inherits from `TabularModel`, which is where it gets all its functionality from. In `__init__` it calls the same method in `TabularModel`, passing `n_cont=0` and `out_sz=1`; other than that, it only passes along whatever arguments it received.
Sidebar: kwargs and Delegates
`EmbeddingNN` includes `**kwargs` as a parameter to `__init__`. In Python, `**kwargs` in a parameter list means “put any additional keyword arguments into a dict called `kwargs`,” and `**kwargs` in an argument list means “insert all key/value pairs in the `kwargs` dict as named arguments here.” This approach is used in many popular libraries, such as `matplotlib`, in which the main `plot` function simply has the signature `plot(*args, **kwargs)`. The `plot` documentation says “The `kwargs` are `Line2D` properties” and then lists those properties.
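A tiny standalone example (not from fastai) shows both uses:

In [ ]:
def inner(color='red', width=1):
    return color, width

def outer(**kwargs):         # collect extra keyword arguments into a dict
    return inner(**kwargs)   # expand that dict back into keyword arguments

outer(color='blue', width=3)   # ('blue', 3)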
We’re using `**kwargs` in `EmbeddingNN` to avoid having to write all the arguments to `TabularModel` a second time, and to keep them in sync. However, this makes our API quite difficult to work with, because now Jupyter Notebook doesn’t know what parameters are available. Consequently, things like tab completion of parameter names and pop-up lists of signatures won’t work.

fastai resolves this by providing a special `@delegates` decorator, which automatically changes the signature of the class or function (`EmbeddingNN` in this case) to insert all of its keyword arguments into the signature.
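Here’s a minimal sketch of `@delegates` in action, using `delegates` from `fastcore.meta` (the names `base` and `wrapper` are just for illustration):

In [ ]:
from fastcore.meta import delegates
import inspect

def base(a, b=2, c=3):
    return a + b + c

@delegates(base)   # replace **kwargs in wrapper's signature with base's keyword arguments
def wrapper(a, **kwargs):
    return base(a, **kwargs)

inspect.signature(wrapper)   # <Signature (a, b=2, c=3)>, so tab completion works again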
End sidebar
Although the results of `EmbeddingNN` are a bit worse than the dot product approach (which shows the power of carefully constructing an architecture for a domain), it does allow us to do something very important: we can now directly incorporate other user and movie information, date and time information, or any other information that may be relevant to the recommendation. That’s exactly what `TabularModel` does. In fact, we’ve now seen that `EmbeddingNN` is just a `TabularModel` with `n_cont=0` and `out_sz=1`. So, we’d better spend some time learning about `TabularModel`, and how to use it to get great results! We’ll do that in the next chapter.