Applications

Keras Applications are deep learning models that are made available alongside pre-trained weights. These models can be used for prediction, feature extraction, and fine-tuning.

Weights are downloaded automatically when instantiating a model. They are stored at ~/.keras/models/.

Available models

Models for image classification with weights trained on ImageNet:

  • Xception
  • VGG16
  • VGG19
  • ResNet, ResNetV2
  • InceptionV3
  • InceptionResNetV2
  • MobileNet
  • MobileNetV2
  • DenseNet
  • NASNet

All of these architectures are compatible with all the backends (TensorFlow, Theano, and CNTK), and upon instantiation the models will be built according to the image data format set in your Keras configuration file at ~/.keras/keras.json. For instance, if you have set image_data_format=channels_last, then any model loaded from this repository will get built according to the TensorFlow data format convention, "Height-Width-Depth".
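For example, you can check which data format your installation will use directly from the backend (a minimal sketch; the value printed depends on your ~/.keras/keras.json):

    from keras import backend as K

    # prints 'channels_last' or 'channels_first', as configured in ~/.keras/keras.json
    print(K.image_data_format())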

Note that:

  • For Keras < 2.2.0, the Xception model is only available for TensorFlow, due to its reliance on SeparableConvolution layers.
  • For Keras < 2.1.5, the MobileNet model is only available for TensorFlow, due to its reliance on DepthwiseConvolution layers.

Usage examples for image classification models

Classify ImageNet classes with ResNet50

    from keras.applications.resnet50 import ResNet50
    from keras.preprocessing import image
    from keras.applications.resnet50 import preprocess_input, decode_predictions
    import numpy as np

    model = ResNet50(weights='imagenet')

    img_path = 'elephant.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)

    preds = model.predict(x)
    # decode the results into a list of tuples (class, description, probability)
    # (one such list for each sample in the batch)
    print('Predicted:', decode_predictions(preds, top=3)[0])
    # Predicted: [(u'n02504013', u'Indian_elephant', 0.82658225), (u'n01871265', u'tusker', 0.1122357), (u'n02504458', u'African_elephant', 0.061040461)]
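Since decode_predictions returns one list of tuples per sample, several images can be classified in a single batch. A minimal sketch continuing the example above (dog.jpg and cat.jpg are placeholder file names):

    # stack several preprocessed images into a single batch
    paths = ['elephant.jpg', 'dog.jpg', 'cat.jpg']  # placeholder file names
    imgs = [image.img_to_array(image.load_img(p, target_size=(224, 224))) for p in paths]
    x = preprocess_input(np.stack(imgs, axis=0))  # shape: (3, 224, 224, 3)

    preds = model.predict(x)
    for path, decoded in zip(paths, decode_predictions(preds, top=3)):
        print(path, decoded)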

Extract features with VGG16

    from keras.applications.vgg16 import VGG16
    from keras.preprocessing import image
    from keras.applications.vgg16 import preprocess_input
    import numpy as np

    model = VGG16(weights='imagenet', include_top=False)

    img_path = 'elephant.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)

    features = model.predict(x)
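With include_top=False the extracted features form a 4D feature map. If a flat vector per image is more convenient, you can let the model apply global pooling via the pooling argument. A minimal sketch continuing the example above (shapes assume 'channels_last'):

    print(features.shape)  # (1, 7, 7, 512) for a 224x224 input

    # alternatively, have the model apply global average pooling itself
    pooled_model = VGG16(weights='imagenet', include_top=False, pooling='avg')
    pooled_features = pooled_model.predict(x)  # shape: (1, 512)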

Extract features from an arbitrary intermediate layer with VGG19

    from keras.applications.vgg19 import VGG19
    from keras.preprocessing import image
    from keras.applications.vgg19 import preprocess_input
    from keras.models import Model
    import numpy as np

    base_model = VGG19(weights='imagenet')
    model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_pool').output)

    img_path = 'elephant.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)

    block4_pool_features = model.predict(x)
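To target a different intermediate layer, print the layer names of the base model and pick one ('block4_pool' above is one such name). A minimal sketch:

    for layer in base_model.layers:
        print(layer.name, layer.output_shape)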

Fine-tune InceptionV3 on a new set of classes

    from keras.applications.inception_v3 import InceptionV3
    from keras.preprocessing import image
    from keras.models import Model
    from keras.layers import Dense, GlobalAveragePooling2D
    from keras import backend as K

    # create the base pre-trained model
    base_model = InceptionV3(weights='imagenet', include_top=False)

    # add a global spatial average pooling layer
    x = base_model.output
    x = GlobalAveragePooling2D()(x)
    # let's add a fully-connected layer
    x = Dense(1024, activation='relu')(x)
    # and a logistic layer -- let's say we have 200 classes
    predictions = Dense(200, activation='softmax')(x)

    # this is the model we will train
    model = Model(inputs=base_model.input, outputs=predictions)

    # first: train only the top layers (which were randomly initialized)
    # i.e. freeze all convolutional InceptionV3 layers
    for layer in base_model.layers:
        layer.trainable = False

    # compile the model (should be done *after* setting layers to non-trainable)
    model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

    # train the model on the new data for a few epochs
    model.fit_generator(...)

    # at this point, the top layers are well trained and we can start fine-tuning
    # convolutional layers from inception V3. We will freeze the bottom N layers
    # and train the remaining top layers.

    # let's visualize layer names and layer indices to see how many layers
    # we should freeze:
    for i, layer in enumerate(base_model.layers):
        print(i, layer.name)

    # we chose to train the top 2 inception blocks, i.e. we will freeze
    # the first 249 layers and unfreeze the rest:
    for layer in model.layers[:249]:
        layer.trainable = False
    for layer in model.layers[249:]:
        layer.trainable = True

    # we need to recompile the model for these modifications to take effect
    # we use SGD with a low learning rate
    from keras.optimizers import SGD
    model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')

    # we train our model again (this time fine-tuning the top 2 inception blocks
    # alongside the top Dense layers)
    model.fit_generator(...)
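The model.fit_generator(...) calls above are deliberately left open. One common way to feed them is ImageDataGenerator.flow_from_directory; the sketch below assumes a hypothetical train/ directory containing one sub-folder per class (directory name, batch size, and epoch count are placeholders):

    from keras.preprocessing.image import ImageDataGenerator
    from keras.applications.inception_v3 import preprocess_input

    # hypothetical directory layout: train/<class_name>/<image files>
    train_gen = ImageDataGenerator(preprocessing_function=preprocess_input).flow_from_directory(
        'train/', target_size=(299, 299), batch_size=32, class_mode='categorical')

    model.fit_generator(train_gen, steps_per_epoch=len(train_gen), epochs=5)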

Build InceptionV3 over a custom input tensor

    from keras.applications.inception_v3 import InceptionV3
    from keras.layers import Input

    # this could also be the output of a different Keras model or layer
    input_tensor = Input(shape=(224, 224, 3))  # this assumes K.image_data_format() == 'channels_last'

    model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=True)
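As the comment notes, the input tensor does not have to come straight from Input(); it can be the output of other layers. A minimal sketch that rescales raw pixel values inside the graph before they reach the network (the Lambda step reproduces the scaling that inception_v3.preprocess_input applies):

    from keras.applications.inception_v3 import InceptionV3
    from keras.layers import Input, Lambda

    raw_input = Input(shape=(299, 299, 3))
    # scale pixels from [0, 255] to [-1, 1]
    scaled = Lambda(lambda t: t / 127.5 - 1.0)(raw_input)

    model = InceptionV3(input_tensor=scaled, weights='imagenet', include_top=True)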

Documentation for individual models

Model | Size | Top-1 Accuracy | Top-5 Accuracy | Parameters | Depth
--- | --- | --- | --- | --- | ---
Xception | 88 MB | 0.790 | 0.945 | 22,910,480 | 126
VGG16 | 528 MB | 0.713 | 0.901 | 138,357,544 | 23
VGG19 | 549 MB | 0.713 | 0.900 | 143,667,240 | 26
ResNet50 | 98 MB | 0.749 | 0.921 | 25,636,712 | -
ResNet101 | 171 MB | 0.764 | 0.928 | 44,707,176 | -
ResNet152 | 232 MB | 0.766 | 0.931 | 60,419,944 | -
ResNet50V2 | 98 MB | 0.760 | 0.930 | 25,613,800 | -
ResNet101V2 | 171 MB | 0.772 | 0.938 | 44,675,560 | -
ResNet152V2 | 232 MB | 0.780 | 0.942 | 60,380,648 | -
InceptionV3 | 92 MB | 0.779 | 0.937 | 23,851,784 | 159
InceptionResNetV2 | 215 MB | 0.803 | 0.953 | 55,873,736 | 572
MobileNet | 16 MB | 0.704 | 0.895 | 4,253,864 | 88
MobileNetV2 | 14 MB | 0.713 | 0.901 | 3,538,984 | 88
DenseNet121 | 33 MB | 0.750 | 0.923 | 8,062,504 | 121
DenseNet169 | 57 MB | 0.762 | 0.932 | 14,307,880 | 169
DenseNet201 | 80 MB | 0.773 | 0.936 | 20,242,984 | 201
NASNetMobile | 23 MB | 0.744 | 0.919 | 5,326,716 | -
NASNetLarge | 343 MB | 0.825 | 0.960 | 88,949,818 | -

The top-1 and top-5 accuracy refers to the model's performance on the ImageNet validation dataset.

Depth refers to the topological depth of the network. This includes activation layers, batch normalization layers etc.

Xception

    keras.applications.xception.Xception(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

Xception V1 model, with weights pre-trained on ImageNet.

On ImageNet, this model gets to a top-1 validation accuracy of 0.790 and a top-5 validation accuracy of 0.945.

This model can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels).

The default input size for this model is 299x299.

Arguments

  • include_top: whether to include the fully-connected layer at the top of the network.
  • weights: one of None (random initialization) or 'imagenet' (pre-training on ImageNet).
  • input_tensor: optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model.
  • input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (299, 299, 3)). It should have exactly 3 input channels, and width and height should be no smaller than 71. E.g. (150, 150, 3) would be one valid value.
  • pooling: Optional pooling mode for feature extraction when include_top is False; its effect on the output shape is shown in the sketch after this list.
    • None means that the output of the model will be the 4D tensor output of the last convolutional block.
    • 'avg' means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • 'max' means that global max pooling will be applied.
  • classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.
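The pooling modes differ only in the shape of the model's output, which is easy to inspect. A minimal sketch (weights=None avoids downloading weights; the shapes assume 'channels_last'):

    from keras.applications.xception import Xception

    # 4D feature map output
    print(Xception(weights=None, include_top=False, input_shape=(299, 299, 3), pooling=None).output_shape)   # (None, 10, 10, 2048)
    # 2D output after global average pooling
    print(Xception(weights=None, include_top=False, input_shape=(299, 299, 3), pooling='avg').output_shape)  # (None, 2048)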

Returns

A Keras Model instance.

References

License

These weights are trained by ourselves and are released under the MIT license.

VGG16

    keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

VGG16 model, with weights pre-trained on ImageNet.

This model can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels).

The default input size for this model is 224x224.

Arguments

  • include_top: whether to include the 3 fully-connected layers at the top of the network.
  • weights: one of None (random initialization) or 'imagenet' (pre-training on ImageNet).
  • input_tensor: optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model.
  • input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) with 'channels_last' data format or (3, 224, 224) with 'channels_first' data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.
  • pooling: Optional pooling mode for feature extraction when include_top is False.
    • None means that the output of the model will be the 4D tensor output of the last convolutional block.
    • 'avg' means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • 'max' means that global max pooling will be applied.
  • classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.

Returns

A Keras Model instance.

References

License

These weights are ported from the ones released by VGG at Oxford under the Creative Commons Attribution License.

VGG19

    keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

VGG19 model, with weights pre-trained on ImageNet.

This model can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels).

The default input size for this model is 224x224.

Arguments

  • include_top: whether to include the 3 fully-connected layers at the top of the network.
  • weights: one of None (random initialization) or 'imagenet' (pre-training on ImageNet).
  • input_tensor: optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model.
  • input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) with 'channels_last' data format or (3, 224, 224) with 'channels_first' data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.
  • pooling: Optional pooling mode for feature extraction when include_top is False.
    • None means that the output of the model will be the 4D tensor output of the last convolutional block.
    • 'avg' means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • 'max' means that global max pooling will be applied.
  • classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.

Returns

A Keras Model instance.

References

License

These weights are ported from the ones released by VGG at Oxford under the Creative Commons Attribution License.

ResNet

    keras.applications.resnet.ResNet50(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
    keras.applications.resnet.ResNet101(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
    keras.applications.resnet.ResNet152(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
    keras.applications.resnet_v2.ResNet50V2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
    keras.applications.resnet_v2.ResNet101V2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
    keras.applications.resnet_v2.ResNet152V2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

ResNet and ResNetV2 models, with weights pre-trained on ImageNet.

These models can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels).

The default input size for these models is 224x224.

Arguments

  • include_top: whether to include the fully-connected layer at the top of the network.
  • weights: one of None (random initialization) or 'imagenet' (pre-training on ImageNet).
  • input_tensor: optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model.
  • input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) with 'channels_last' data format or (3, 224, 224) with 'channels_first' data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.
  • pooling: Optional pooling mode for feature extraction when include_top is False.
    • None means that the output of the model will be the 4D tensor output of the last convolutional block.
    • 'avg' means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • 'max' means that global max pooling will be applied.
  • classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.

Returns

A Keras Model instance.

References

License

These weights are ported from the following:

InceptionV3

    keras.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

Inception V3 model, with weights pre-trained on ImageNet.

This model can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels).

The default input size for this model is 299x299.

Arguments

  • include_top: whether to include the fully-connected layer at the top of the network.
  • weights: one of None (random initialization) or 'imagenet' (pre-training on ImageNet).
  • input_tensor: optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model.
  • input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (299, 299, 3) with 'channels_last' data format or (3, 299, 299) with 'channels_first' data format). It should have exactly 3 input channels, and width and height should be no smaller than 75. E.g. (150, 150, 3) would be one valid value.
  • pooling: Optional pooling mode for feature extraction when include_top is False.
    • None means that the output of the model will be the 4D tensor output of the last convolutional block.
    • 'avg' means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • 'max' means that global max pooling will be applied.
  • classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.

Returns

A Keras Model instance.

References

License

These weights are released under the Apache License.

InceptionResNetV2

    keras.applications.inception_resnet_v2.InceptionResNetV2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

Inception-ResNet V2 model, with weights pre-trained on ImageNet.

This model can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels).

The default input size for this model is 299x299.

Arguments

  • include_top: whether to include the fully-connected layer at the top of the network.
  • weights: one of None (random initialization) or 'imagenet' (pre-training on ImageNet).
  • input_tensor: optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model.
  • input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (299, 299, 3) with 'channels_last' data format or (3, 299, 299) with 'channels_first' data format). It should have exactly 3 input channels, and width and height should be no smaller than 75. E.g. (150, 150, 3) would be one valid value.
  • pooling: Optional pooling mode for feature extraction when include_top is False.
    • None means that the output of the model will be the 4D tensor output of the last convolutional block.
    • 'avg' means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • 'max' means that global max pooling will be applied.
  • classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.

Returns

A Keras Model instance.

References

License

These weights are released under the Apache License.

MobileNet

    keras.applications.mobilenet.MobileNet(input_shape=None, alpha=1.0, depth_multiplier=1, dropout=1e-3, include_top=True, weights='imagenet', input_tensor=None, pooling=None, classes=1000)

MobileNet model, with weights pre-trained on ImageNet.

This model can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels).

The default input size for this model is 224x224.

Arguments

  • input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) with 'channels_last' data format or (3, 224, 224) with 'channels_first' data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.
  • alpha: controls the width of the network (its effect on model size is sketched after this list).
    • If alpha < 1.0, proportionally decreases the number of filters in each layer.
    • If alpha > 1.0, proportionally increases the number of filters in each layer.
    • If alpha = 1, the default number of filters from the paper is used at each layer.
  • depth_multiplier: depth multiplier for depthwise convolution (also called the resolution multiplier)
  • dropout: dropout rate
  • include_top: whether to include the fully-connected layer at the top of the network.
  • weights: None (random initialization) or 'imagenet' (ImageNet weights)
  • input_tensor: optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model.
  • pooling: Optional pooling mode for feature extraction when include_top is False.
    • None means that the output of the model will be the 4D tensor output of the last convolutional block.
    • 'avg' means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • 'max' means that global max pooling will be applied.
  • classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.
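The effect of alpha on model size can be checked by comparing parameter counts. A minimal sketch (weights=None avoids downloading anything):

    from keras.applications.mobilenet import MobileNet

    for a in [0.25, 0.5, 1.0]:
        m = MobileNet(alpha=a, weights=None, include_top=True, input_shape=(224, 224, 3))
        print(a, m.count_params())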

Returns

A Keras Model instance.

References

License

These weights are released under the Apache License.

DenseNet

    keras.applications.densenet.DenseNet121(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
    keras.applications.densenet.DenseNet169(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
    keras.applications.densenet.DenseNet201(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

DenseNet models, with weights pre-trained on ImageNet.

These models can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels).

The default input size for these models is 224x224.

Arguments

  • blocks: numbers of building blocks for the four dense layers.
  • include_top: whether to include the fully-connected layer at the top of the network.
  • weights: one of None (random initialization), 'imagenet' (pre-training on ImageNet), or the path to the weights file to be loaded (a path example is sketched after this list).
  • input_tensor: optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model.
  • input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) with 'channels_last' data format or (3, 224, 224) with 'channels_first' data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.
  • pooling: optional pooling mode for feature extraction when include_top is False.
    • None means that the output of the model will be the 4D tensor output of the last convolutional block.
    • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • max means that global max pooling will be applied.
  • classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.
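Passing a file path as weights loads weights from disk instead of downloading the ImageNet ones; the file must contain weights saved for this exact architecture. A minimal sketch (the file name is a placeholder):

    from keras.applications.densenet import DenseNet121

    # hypothetical local file, e.g. previously produced with model.save_weights(...)
    model = DenseNet121(weights='my_densenet121_weights.h5', include_top=True)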

Returns

A Keras model instance.

References

License

These weights are released under the BSD 3-clause License.

NASNet

    keras.applications.nasnet.NASNetLarge(input_shape=None, include_top=True, weights='imagenet', input_tensor=None, pooling=None, classes=1000)
    keras.applications.nasnet.NASNetMobile(input_shape=None, include_top=True, weights='imagenet', input_tensor=None, pooling=None, classes=1000)

Neural Architecture Search Network (NASNet) models, with weights pre-trained on ImageNet.

The default input size for the NASNetLarge model is 331x331 and for the NASNetMobile model is 224x224.

Arguments

  • input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) with 'channels_last' data format or (3, 224, 224) with 'channels_first' data format for NASNetMobile, and (331, 331, 3) with 'channels_last' data format or (3, 331, 331) with 'channels_first' data format for NASNetLarge). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.
  • include_top: whether to include the fully-connected layer at the top of the network.
  • weights: None (random initialization) or 'imagenet' (ImageNet weights)
  • input_tensor: optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model.
  • pooling: Optional pooling mode for feature extraction when include_top is False.
    • None means that the output of the model will be the 4D tensor output of the last convolutional block.
    • 'avg' means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • 'max' means that global max pooling will be applied.
  • classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.

Returns

A Keras Model instance.

References

License

These weights are released under the Apache License.

MobileNetV2

    keras.applications.mobilenet_v2.MobileNetV2(input_shape=None, alpha=1.0, include_top=True, weights='imagenet', input_tensor=None, pooling=None, classes=1000)

MobileNetV2 model, with weights pre-trained on ImageNet.

This model can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels).

The default input size for this model is 224x224.

Arguments

  • input_shape: optional shape tuple, to be specified if you would like to use a model with an input image resolution that is not (224, 224, 3). It should have exactly 3 input channels. You can also omit this option if you would like to infer input_shape from an input_tensor. If you choose to include both input_tensor and input_shape, then input_shape will be used if they match; if the shapes do not match, an error will be thrown. E.g. (160, 160, 3) would be one valid value (see the sketch after this list).
  • alpha: controls the width of the network. This is known as the width multiplier in the MobileNetV2 paper.
    • If alpha < 1.0, proportionally decreases the number of filters in each layer.
    • If alpha > 1.0, proportionally increases the number of filters in each layer.
    • If alpha = 1, the default number of filters from the paper is used at each layer.
  • include_top: whether to include the fully-connected layer at the top of the network.
  • weights: one of None (random initialization), 'imagenet' (pre-training on ImageNet), or the path to the weights file to be loaded.
  • input_tensor: optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model.
  • pooling: Optional pooling mode for feature extraction when include_top is False.
    • None means that the output of the model will be the 4D tensor output of the last convolutional block.
    • 'avg' means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
    • 'max' means that global max pooling will be applied.
  • classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.
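A minimal sketch of building MobileNetV2 at a non-default resolution (160x160 is one of the input resolutions for which pre-trained ImageNet weights are provided):

    from keras.applications.mobilenet_v2 import MobileNetV2

    model = MobileNetV2(input_shape=(160, 160, 3), alpha=1.0, weights='imagenet', include_top=True)
    print(model.input_shape)  # (None, 160, 160, 3)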

Returns

A Keras model instance.

Raises

ValueError: in case of an invalid argument for weights, or an invalid input shape, alpha, or rows (input resolution) when weights='imagenet'.

References

License

These weights are released under the Apache License.