vision.models

Overview of the models used for CV in fastai

Computer vision model zoo

The fastai library includes several pretrained models from torchvision, namely:

resnet18, resnet34, resnet50, resnet101, resnet152
squeezenet1_0, squeezenet1_1
densenet121, densenet169, densenet201, densenet161
vgg16_bn, vgg19_bn
alexnet

On top of the models offered by torchvision, fastai has implementations for the following models:

Darknet architecture, which is the base of YOLOv3
U-Net architecture based on a pretrained model. The original U-Net is described here; the model implementation is detailed in models.unet
Wide ResNet architectures, as introduced in this article

`class` `Darknet`

Darknet(num_blocks:Collection[int], num_classes:int, nf=32) :: PrePostInitMeta :: Module

https://github.com/pjreddie/darknet

Create a Darknet with blocks of sizes given in num_blocks, ending with num_classes outputs and using nf initial features. Darknet53 uses num_blocks = [1,2,8,8,4].
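
To make the role of num_blocks concrete, the sketch below counts the weighted layers implied by a given configuration. It assumes the layout of the original Darknet-53 design (one stem convolution, one transition convolution at the start of each group, two convolutions per residual block, and one final linear layer). That layout is an assumption, not something read from the fastai source, and the helper name count_darknet_layers is hypothetical.

```python
from typing import Sequence


def count_darknet_layers(num_blocks: Sequence[int]) -> int:
    """Count weighted layers for a Darknet-style network.

    Assumed layout (Darknet-53 style, not taken from the fastai source):
      1 stem convolution
      + 1 transition convolution per group
      + 2 convolutions per residual block
      + 1 final linear layer
    """
    stem = 1
    transitions = len(num_blocks)
    residual_convs = 2 * sum(num_blocks)
    head = 1
    return stem + transitions + residual_convs + head


print(count_darknet_layers([1, 2, 8, 8, 4]))  # 53
```

Under these assumptions the Darknet53 configuration accounts for 53 weighted layers, which matches the number in its name.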

`class` `WideResNet`

WideResNet(num_groups:int, N:int, num_classes:int, k:int=1, drop_p:float=0.0, start_nf:int=16, n_in_channels:int=3) :: PrePostInitMeta :: Module

Wide ResNet with num_groups groups and a width factor of k.

Each group contains N blocks. start_nf is the initial number of features. Dropout of drop_p is applied between the two convolutions in each block. The number of input channels is set by n_in_channels, which defaults to 3.

Structure: initial convolution -> num_groups x N blocks -> final layers of regularization and pooling

The first block of each group joins a path containing 2 convolutions with filter size 3x3 (and various regularizations) with another path containing a single convolution with a filter size of 1x1. All other blocks in each group follow the more traditional res_block style, i.e., the input of the path with two convs is added to the output of that path.
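
The difference between the two block types can be sketched with plain Python callables standing in for the convolution paths. This is a toy illustration, not the fastai implementation: first_block, later_block, and the lambda stand-ins are hypothetical names, and real blocks operate on tensors rather than floats. A likely reason for the 1x1 path is that the first block of a group can change the number of channels or the spatial size, so its input cannot be added back unchanged.

```python
from typing import Callable

Path = Callable[[float], float]


def first_block(x: float, two_conv_path: Path, one_by_one_conv: Path) -> float:
    """First block of a group: two-conv path joined with a 1x1-conv path."""
    return two_conv_path(x) + one_by_one_conv(x)


def later_block(x: float, two_conv_path: Path) -> float:
    """Any other block: the input itself is added to the two-conv path output."""
    return two_conv_path(x) + x


# Toy stand-ins for the learned paths (hypothetical, for illustration only).
two_convs = lambda x: 2.0 * x
shortcut = lambda x: 3.0 * x

print(first_block(1.0, two_convs, shortcut))  # 5.0
print(later_block(1.0, two_convs))            # 3.0
```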

In the first group the stride is 1 for all convolutions. In all subsequent groups the stride in the first convolution of the first block is 2 and then all following convolutions have a stride of 1. Padding is always 1.
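
The stride rule above can be written out as a small schedule, which also shows how the spatial size shrinks from group to group. The sketch below is an illustration in plain Python under two assumptions: both convolutions in a block use a 3x3 kernel with padding 1, and the initial convolution leaves the spatial size unchanged. The helper names are hypothetical and not part of fastai.

```python
from typing import List


def first_conv_strides(num_groups: int, N: int) -> List[List[int]]:
    """Stride of the first convolution of every block, listed per group."""
    schedule = []
    for group in range(num_groups):
        # Only the first block of groups after the first one uses stride 2.
        strides = [2 if group > 0 and block == 0 else 1 for block in range(N)]
        schedule.append(strides)
    return schedule


def conv_out_size(size: int, kernel: int = 3, stride: int = 1, padding: int = 1) -> int:
    """Standard output-size formula for a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def group_output_sizes(input_size: int, num_groups: int, N: int) -> List[int]:
    """Spatial size after each group, assuming 3x3 convolutions with padding 1."""
    sizes = []
    size = input_size
    for strides in first_conv_strides(num_groups, N):
        for stride in strides:
            size = conv_out_size(size, stride=stride)  # first conv of the block
            size = conv_out_size(size, stride=1)       # second conv of the block
        sizes.append(size)
    return sizes


print(first_conv_strides(3, 2))      # [[1, 1], [2, 1], [2, 1]]
print(group_output_sizes(32, 3, 2))  # [32, 16, 8]
```

With num_groups=3 and a 32x32 input, the feature map is halved at the start of the second and third groups and stays constant everywhere else.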