RNN
keras.layers.RNN(cell, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False)
Base class for recurrent layers.
Arguments
- cell: A RNN cell instance. A RNN cell is a class that has:
    - a call(input_at_t, states_at_t) method, returning (output_at_t, states_at_t_plus_1). The call method of the cell can also take the optional argument constants, see section "Note on passing external constants to RNNs" below.
    - a state_size attribute. This can be a single integer (single state), in which case it is the size of the recurrent state (which should be the same as the size of the cell output). This can also be a list/tuple of integers (one size per state).
    - an output_size attribute. This can be a single integer or a TensorShape, which represents the shape of the output. For backward compatibility, if this attribute is not available for the cell, the value will be inferred from the first element of the state_size.
  It is also possible for cell to be a list of RNN cell instances, in which case the cells get stacked one after the other in the RNN, implementing an efficient stacked RNN.
- return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
- return_state: Boolean. Whether to return the last state in addition to the output.
- go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence.
- stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
- unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed-up a RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.
- input_dim: dimensionality of the input (integer). This argument (or alternatively, the keyword argument input_shape) is required when using this layer as the first layer in a model.
- input_length: Length of input sequences, to be specified when it is constant. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed). Note that if the recurrent layer is not the first layer in your model, you would need to specify the input length at the level of the first layer (e.g. via the input_shape argument).
Input shape
3D tensor with shape (batch_size, timesteps, input_dim).
Output shape
- if return_state: a list of tensors. The first tensor is the output. The remaining tensors are the last states, each with shape (batch_size, units). For example, the number of state tensors is 1 (for RNN and GRU) or 2 (for LSTM).
- if return_sequences: 3D tensor with shape (batch_size, timesteps, units).
- else, 2D tensor with shape (batch_size, units).
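The shape rules above can be summarized in a small pure-Python sketch. The helper name rnn_output_shapes and its num_states parameter are hypothetical and not part of Keras; the function only mirrors the rules as stated in this section.

```python
def rnn_output_shapes(batch_size, timesteps, units, num_states=1,
                      return_sequences=False, return_state=False):
    """Hypothetical helper: shapes an RNN layer returns, per the rules above."""
    if return_sequences:
        output = (batch_size, timesteps, units)   # full sequence
    else:
        output = (batch_size, units)              # last output only
    if return_state:
        # The output comes first, followed by one tensor per state.
        return [output] + [(batch_size, units)] * num_states
    return output


print(rnn_output_shapes(32, 10, 64))                         # (32, 64)
print(rnn_output_shapes(32, 10, 64, return_sequences=True))  # (32, 10, 64)
# An LSTM-like layer has two states (num_states=2).
print(rnn_output_shapes(32, 10, 64, num_states=2, return_state=True))
```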
Masking
This layer supports masking for input data with a variable number of timesteps. To introduce masks to your data, use an Embedding layer with the mask_zero parameter set to True.
Note on using statefulness in RNNs
You can set RNN layers to be 'stateful', which means that the states computed for the samples in one batch will be reused as initial states for the samples in the next batch. This assumes a one-to-one mapping between samples in different successive batches.
To enable statefulness:
- specify stateful=True in the layer constructor.
- specify a fixed batch size for your model, by passing:
    - if sequential model: batch_input_shape=(…) to the first layer in your model.
    - else, for a functional model with 1 or more Input layers: batch_shape=(…) to all the first layers in your model.
  This is the expected shape of your inputs, including the batch size. It should be a tuple of integers, e.g. (32, 10, 100).
- specify shuffle=False when calling fit().
To reset the states of your model, call .reset_states() on either a specific layer, or on your entire model.
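To make the one-to-one mapping between samples concrete, here is a toy pure-Python sketch. The class ToyStatefulRNN is hypothetical and is not how Keras implements statefulness; its "cell" is just a running sum with one scalar state per sample index.

```python
class ToyStatefulRNN:
    """Hypothetical toy layer: a running sum with one state per sample index."""

    def __init__(self, batch_size, stateful=False):
        self.batch_size = batch_size
        self.stateful = stateful
        self.states = [0.0] * batch_size

    def reset_states(self):
        self.states = [0.0] * self.batch_size

    def __call__(self, batch):
        # batch: a list of `batch_size` sequences (lists of floats).
        if len(batch) != self.batch_size:
            raise ValueError("expected a batch of size %d" % self.batch_size)
        if not self.stateful:
            self.reset_states()  # non-stateful: every batch starts from zeros
        outputs = []
        for i, sequence in enumerate(batch):
            state = self.states[i]  # sample i continues from sample i's state
            for x in sequence:
                state = state + x
            self.states[i] = state
            outputs.append(state)
        return outputs


layer = ToyStatefulRNN(batch_size=2, stateful=True)
print(layer([[1.0, 2.0], [10.0]]))   # [3.0, 10.0]
print(layer([[3.0], [5.0, 5.0]]))    # [6.0, 20.0], continuing from the first batch
layer.reset_states()
print(layer([[3.0], [5.0, 5.0]]))    # [3.0, 10.0], starting from zeros again
```

The fixed batch size matters because the state list has exactly one slot per sample index, which is also why shuffling between batches would break the mapping.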
Note on specifying the initial state of RNNs
You can specify the initial state of RNN layers symbolically by calling them with the keyword argument initial_state. The value of initial_state should be a tensor or list of tensors representing the initial state of the RNN layer.
You can specify the initial state of RNN layers numerically by calling reset_states with the keyword argument states. The value of states should be a numpy array or list of numpy arrays representing the initial state of the RNN layer.
Note on passing external constants to RNNs
You can pass "external" constants to the cell using the constants keyword argument of the RNN.__call__ (as well as RNN.call) method. This requires that the cell.call method accepts the same keyword argument constants. Such constants can be used to condition the cell transformation on additional static inputs (not changing over time), a.k.a. an attention mechanism.
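The following pure-Python sketch illustrates the idea of threading an initial state and static constants through a cell. Both run_rnn and biased_sum_cell are hypothetical names used only for illustration; they mimic the call(input_at_t, states_at_t) contract described earlier and are not Keras code.

```python
def biased_sum_cell(x, states, constants=None):
    """Hypothetical cell: adds the input (and a static constant) to its state."""
    bias = constants[0] if constants else 0.0
    output = states[0] + x + bias
    return output, [output]


def run_rnn(cell_call, inputs, initial_state, constants=None):
    """Hypothetical loop: the same constants are passed at every timestep."""
    states = initial_state
    output = None
    for x in inputs:
        if constants is None:
            output, states = cell_call(x, states)
        else:
            output, states = cell_call(x, states, constants=constants)
    return output, states


print(run_rnn(biased_sum_cell, [1.0, 2.0], [0.0]))                   # (3.0, [3.0])
print(run_rnn(biased_sum_cell, [1.0, 2.0], [0.0], constants=[0.5]))  # (4.0, [4.0])
print(run_rnn(biased_sum_cell, [1.0, 2.0], [10.0]))                  # (13.0, [13.0])
```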
Examples
import keras
from keras import backend as K
from keras.layers import RNN

# First, let's define a RNN Cell, as a layer subclass.
class MinimalRNNCell(keras.layers.Layer):

    def __init__(self, units, **kwargs):
        self.units = units
        self.state_size = units
        super(MinimalRNNCell, self).__init__(**kwargs)

    def build(self, input_shape):
        # Weights applied to the input at each timestep.
        self.kernel = self.add_weight(shape=(input_shape[-1], self.units),
                                      initializer='uniform',
                                      name='kernel')
        # Weights applied to the previous output (the recurrent state).
        self.recurrent_kernel = self.add_weight(
            shape=(self.units, self.units),
            initializer='uniform',
            name='recurrent_kernel')
        self.built = True

    def call(self, inputs, states):
        prev_output = states[0]
        h = K.dot(inputs, self.kernel)
        output = h + K.dot(prev_output, self.recurrent_kernel)
        return output, [output]

# Let's use this cell in a RNN layer:
cell = MinimalRNNCell(32)
x = keras.Input((None, 5))
layer = RNN(cell)
y = layer(x)

# Here's how to use the cell to build a stacked RNN:
cells = [MinimalRNNCell(32), MinimalRNNCell(64)]
x = keras.Input((None, 5))
layer = RNN(cells)
y = layer(x)
SimpleRNN
keras.layers.SimpleRNN(units, activation='tanh', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False)
Fully-connected RNN where the output is to be fed back to input.
Arguments
- units: Positive integer, dimensionality of the output space.
- activation: Activation function to use (see activations). Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
- use_bias: Boolean, whether the layer uses a bias vector.
- kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers).
- recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers).
- bias_initializer: Initializer for the bias vector (see initializers).
- kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer).
- recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer).
- bias_regularizer: Regularizer function applied to the bias vector (see regularizer).
- activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see regularizer).
- kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints).
- recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints).
- bias_constraint: Constraint function applied to the bias vector (see constraints).
- dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.
- recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.
- return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
- return_state: Boolean. Whether to return the last state in addition to the output.
- go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence.
- stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
- unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed-up a RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.
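As a rough illustration of what "the output is fed back to the input" means, here is a pure-Python sketch of the recurrence with the default tanh activation. The function simple_rnn is hypothetical and handles a single sample with plain lists; it ignores dropout, masking and batching, so it is not the Keras implementation.

```python
import math


def simple_rnn(inputs, kernel, recurrent_kernel, bias,
               return_sequences=False, go_backwards=False):
    """Hypothetical single-sample recurrence: h_t = tanh(x_t.W + h_{t-1}.U + b).

    inputs: list of timesteps, each a list of input_dim floats.
    kernel: input_dim x units, recurrent_kernel: units x units, bias: units.
    """
    units = len(bias)
    state = [0.0] * units  # initial state is all zeros
    steps = list(reversed(inputs)) if go_backwards else inputs
    outputs = []
    for x in steps:
        state = [
            math.tanh(
                sum(x[i] * kernel[i][j] for i in range(len(x)))
                + sum(state[k] * recurrent_kernel[k][j] for k in range(units))
                + bias[j]
            )
            for j in range(units)
        ]
        outputs.append(state)
    return outputs if return_sequences else outputs[-1]


sequence = [[0.0], [1.0]]  # 2 timesteps, input_dim = 1
last = simple_rnn(sequence, kernel=[[1.0]], recurrent_kernel=[[0.5]], bias=[0.0])
full = simple_rnn(sequence, kernel=[[1.0]], recurrent_kernel=[[0.5]], bias=[0.0],
                  return_sequences=True)
print(last)       # output at the final timestep only
print(len(full))  # 2, one output per timestep
```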
GRU
keras.layers.GRU(units, activation='tanh', recurrent_activation='sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, implementation=2, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False, reset_after=False)
Gated Recurrent Unit - Cho et al. 2014.
There are two variants. The default one is based on 1406.1078v3 and has the reset gate applied to the hidden state before matrix multiplication. The other one is based on the original 1406.1078v1 and has the order reversed.
The second variant is compatible with CuDNNGRU (GPU-only) and allows inference on CPU. Thus it has separate biases for kernel and recurrent_kernel. Use reset_after=True and recurrent_activation='sigmoid'.
Arguments
- units: Positive integer, dimensionality of the output space.
- activation: Activation function to use (see activations). Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
- recurrent_activation: Activation function to use for the recurrent step (see activations). Default: sigmoid (sigmoid). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
- use_bias: Boolean, whether the layer uses a bias vector.
- kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers).
- recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers).
- bias_initializer: Initializer for the bias vector (see initializers).
- kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer).
- recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer).
- bias_regularizer: Regularizer function applied to the bias vector (see regularizer).
- activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see regularizer).
- kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints).
- recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints).
- bias_constraint: Constraint function applied to the bias vector (see constraints).
- dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.
- recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.
- implementation: Implementation mode, either 1 or 2. Mode 1 will structure its operations as a larger number of smaller dot products and additions, whereas mode 2 will batch them into fewer, larger operations. These modes will have different performance profiles on different hardware and for different applications.
- return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
- return_state: Boolean. Whether to return the last state in addition to the output.
- go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence.
- stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
- unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed-up a RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.
- reset_after: GRU convention (whether to apply reset gate after or before matrix multiplication). False = "before" (default), True = "after" (CuDNN compatible).
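The difference between the two reset_after conventions shows up in the candidate hidden state. The sketch below is a pure-Python illustration for a single sample; gru_candidate and its argument names (x_proj for the already-projected input, r for the reset gate values) are hypothetical, and gate computation, dropout and the final state update are left out.

```python
import math


def gru_candidate(x_proj, h, r, recurrent_kernel, recurrent_bias=None,
                  reset_after=False):
    """Hypothetical sketch of the GRU candidate state under both conventions.

    x_proj: input already multiplied by the kernel (plus input bias), per unit.
    h: previous hidden state, r: reset gate values, both of length `units`.
    """
    units = len(h)
    if recurrent_bias is None:
        recurrent_bias = [0.0] * units
    candidate = []
    for j in range(units):
        if reset_after:
            # "after": matrix product first, then the reset gate; this variant
            # has a separate bias on the recurrent side.
            rec = sum(h[k] * recurrent_kernel[k][j] for k in range(units))
            pre = x_proj[j] + r[j] * (rec + recurrent_bias[j])
        else:
            # "before" (default): reset gate applied to h, then matrix product.
            pre = x_proj[j] + sum(r[k] * h[k] * recurrent_kernel[k][j]
                                  for k in range(units))
        candidate.append(math.tanh(pre))
    return candidate


h = [1.0, 1.0]
r = [1.0, 0.0]               # second unit is fully reset
U = [[0.0, 1.0], [1.0, 0.0]]
print(gru_candidate([0.0, 0.0], h, r, U))                    # reset before
print(gru_candidate([0.0, 0.0], h, r, U, reset_after=True))  # reset after
```

With this swap-like recurrent kernel the two conventions give different candidates, which is why weights trained under one convention cannot simply be reused under the other.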
References
- Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
- On the Properties of Neural Machine Translation: Encoder-Decoder Approaches
- Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
- A Theoretically Grounded Application of Dropout in Recurrent Neural Networks
LSTM
keras.layers.LSTM(units, activation='tanh', recurrent_activation='sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, implementation=2, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False)
Long Short-Term Memory layer - Hochreiter 1997.
Arguments
- units: Positive integer, dimensionality of the output space.
- activation: Activation function to use (see activations). Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
- recurrent_activation: Activation function to use for the recurrent step (see activations). Default: sigmoid (sigmoid). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
- use_bias: Boolean, whether the layer uses a bias vector.
- kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers).
- recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers).
- bias_initializer: Initializer for the bias vector (see initializers).
- unit_forget_bias: Boolean. If True, add 1 to the bias of the forget gate at initialization. Setting it to true will also force bias_initializer="zeros". This is recommended in Jozefowicz et al. (2015).
- kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer).
- recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer).
- bias_regularizer: Regularizer function applied to the bias vector (see regularizer).
- activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see regularizer).
- kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints).
- recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints).
- bias_constraint: Constraint function applied to the bias vector (see constraints).
- dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.
- recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.
- implementation: Implementation mode, either 1 or 2. Mode 1 will structure its operations as a larger number of smaller dot products and additions, whereas mode 2 will batch them into fewer, larger operations. These modes will have different performance profiles on different hardware and for different applications.
- return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
- return_state: Boolean. Whether to return the last state in addition to the output. The returned elements of the states list are the hidden state and the cell state, respectively.
- go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence.
- stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
- unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed-up a RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.
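To show how the hidden state, the cell state and unit_forget_bias relate, here is a scalar pure-Python sketch of one LSTM step using tanh as activation and sigmoid as recurrent_activation. The function lstm_step and the dict-based weight layout are hypothetical simplifications, not the Keras implementation.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x, h_prev, c_prev, kernel, recurrent_kernel, bias):
    """Hypothetical scalar LSTM step.

    Each weight argument is a dict keyed by gate name:
    'i' (input), 'f' (forget), 'c' (candidate), 'o' (output).
    Returns (output, [hidden_state, cell_state]).
    """
    def pre(gate):
        return x * kernel[gate] + h_prev * recurrent_kernel[gate] + bias[gate]

    i = sigmoid(pre("i"))
    f = sigmoid(pre("f"))
    c_bar = math.tanh(pre("c"))
    o = sigmoid(pre("o"))
    c = f * c_prev + i * c_bar  # new cell state
    h = o * math.tanh(c)        # new hidden state, also the output
    return h, [h, c]


zeros = {"i": 0.0, "f": 0.0, "c": 0.0, "o": 0.0}
# unit_forget_bias=True: all biases start at zero except the forget gate at 1,
# so the cell state is mostly kept at the start of training.
bias = dict(zeros, f=1.0)
output, (h, c) = lstm_step(0.5, 0.0, 1.0, zeros, zeros, bias)
print(output, h, c)
```

The returned state list is ordered as hidden state then cell state, matching the order described for return_state above.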
References
- Long short-term memory
- Learning to forget: Continual prediction with LSTM
- Supervised sequence labeling with recurrent neural networks
- A Theoretically Grounded Application of Dropout in Recurrent Neural Networks
ConvLSTM2D
keras.layers.ConvLSTM2D(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1), activation='tanh', recurrent_activation='hard_sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, return_sequences=False, go_backwards=False, stateful=False, dropout=0.0, recurrent_dropout=0.0)
Convolutional LSTM.
It is similar to an LSTM layer, but the input transformationsand recurrent transformations are both convolutional.
Arguments
- filters: Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).
- kernel_size: An integer or tuple/list of n integers, specifying the dimensions of the convolution window.
- strides: An integer or tuple/list of n integers, specifying the strides of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
- padding: One of "valid" or "same" (case-insensitive).
- data_format: A string, one of "channels_last" (default) or "channels_first". The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, time, …, channels) while "channels_first" corresponds to inputs with shape (batch, time, channels, …). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".
- dilation_rate: An integer or tuple/list of n integers, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1.
- activation: Activation function to use (see activations).
- recurrent_activation: Activation function to use for the recurrent step (see activations).
- use_bias: Boolean, whether the layer uses a bias vector.
- kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers).
- recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers).
- bias_initializer: Initializer for the bias vector (see initializers).
- unit_forget_bias: Boolean. If True, add 1 to the bias of the forget gate at initialization. Use in combination with bias_initializer="zeros". This is recommended in Jozefowicz et al. (2015).
- kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer).
- recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer).
- bias_regularizer: Regularizer function applied to the bias vector (see regularizer).
- activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see regularizer).
- kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints).
- recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints).
- bias_constraint: Constraint function applied to the bias vector (see constraints).
- return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
- go_backwards: Boolean (default False). If True, process the input sequence backwards.
- stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
- dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.
- recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.
Input shape
- if data_format='channels_first': 5D tensor with shape (samples, time, channels, rows, cols)
- if data_format='channels_last': 5D tensor with shape (samples, time, rows, cols, channels)
Output shape
- if return_sequences:
    - if data_format='channels_first': 5D tensor with shape (samples, time, filters, output_row, output_col)
    - if data_format='channels_last': 5D tensor with shape (samples, time, output_row, output_col, filters)
- else:
    - if data_format='channels_first': 4D tensor with shape (samples, filters, output_row, output_col)
    - if data_format='channels_last': 4D tensor with shape (samples, output_row, output_col, filters)

where output_row and output_col depend on the shape of the filter and the padding.
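The dependence of the output rows and columns on the filter shape and padding can be sketched with the usual convolution arithmetic. The helper conv_output_length below is a hypothetical standalone function written for this illustration; it assumes the standard "valid"/"same" rules and is applied once per spatial dimension.

```python
def conv_output_length(input_length, kernel_size, padding="valid",
                       stride=1, dilation=1):
    """Hypothetical helper: output size along one spatial dimension."""
    padding = padding.lower()  # padding names are case-insensitive
    dilated_kernel = (kernel_size - 1) * dilation + 1
    if padding == "same":
        length = input_length
    elif padding == "valid":
        length = input_length - dilated_kernel + 1
    else:
        raise ValueError("padding must be 'valid' or 'same'")
    # Ceiling division by the stride.
    return (length + stride - 1) // stride


# 40x40 frames with a 3x3 kernel:
print(conv_output_length(40, 3, padding="valid"))            # 38
print(conv_output_length(40, 3, padding="same"))             # 40
print(conv_output_length(40, 3, padding="valid", stride=2))  # 19
```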
Raises
- ValueError: in case of invalid constructor arguments.
References
- Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting

The current implementation does not include the feedback loop on the cell's output.
ConvLSTM2DCell
keras.layers.ConvLSTM2DCell(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1), activation='tanh', recurrent_activation='hard_sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0)
Cell class for the ConvLSTM2D layer.
Arguments
- filters: Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).
- kernel_size: An integer or tuple/list of n integers, specifying the dimensions of the convolution window.
- strides: An integer or tuple/list of n integers, specifying the strides of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
- padding: One of "valid" or "same" (case-insensitive).
- data_format: A string, one of "channels_last" (default) or "channels_first". It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".
- dilation_rate: An integer or tuple/list of n integers, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1.
- activation: Activation function to use (see activations).
- recurrent_activation: Activation function to use for the recurrent step (see activations).
- use_bias: Boolean, whether the layer uses a bias vector.
- kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers).
- recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers).
- bias_initializer: Initializer for the bias vector (see initializers).
- unit_forget_bias: Boolean. If True, add 1 to the bias of the forget gate at initialization. Use in combination with bias_initializer="zeros". This is recommended in Jozefowicz et al. (2015).
- kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer).
- recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer).
- bias_regularizer: Regularizer function applied to the bias vector (see regularizer).
- kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints).
- recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints).
- bias_constraint: Constraint function applied to the bias vector (see constraints).
- dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.
- recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.
SimpleRNNCell
keras.layers.SimpleRNNCell(units, activation='tanh', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0)
Cell class for SimpleRNN.
Arguments
- units: Positive integer, dimensionality of the output space.
- activation: Activation function to use (see activations). Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
- use_bias: Boolean, whether the layer uses a bias vector.
- kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers).
- recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers).
- bias_initializer: Initializer for the bias vector (see initializers).
- kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer).
- recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer).
- bias_regularizer: Regularizer function applied to the bias vector (see regularizer).
- kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints).
- recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints).
- bias_constraint: Constraint function applied to the bias vector (see constraints).
- dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.
- recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.
GRUCell
keras.layers.GRUCell(units, activation='tanh', recurrent_activation='sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, implementation=2, reset_after=False)
Cell class for the GRU layer.
Arguments
- units: Positive integer, dimensionality of the output space.
- activation: Activation function to use (see activations). Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
- recurrent_activation: Activation function to use for the recurrent step (see activations). Default: sigmoid (sigmoid). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
- use_bias: Boolean, whether the layer uses a bias vector.
- kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers).
- recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers).
- bias_initializer: Initializer for the bias vector (see initializers).
- kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer).
- recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer).
- bias_regularizer: Regularizer function applied to the bias vector (see regularizer).
- kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints).
- recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints).
- bias_constraint: Constraint function applied to the bias vector (see constraints).
- dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.
- recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.
- implementation: Implementation mode, either 1 or 2. Mode 1 will structure its operations as a larger number of smaller dot products and additions, whereas mode 2 will batch them into fewer, larger operations. These modes will have different performance profiles on different hardware and for different applications.
- reset_after: GRU convention (whether to apply reset gate after or before matrix multiplication). False = "before" (default), True = "after" (CuDNN compatible).
LSTMCell
keras.layers.LSTMCell(units, activation='tanh', recurrent_activation='sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, implementation=2)
Cell class for the LSTM layer.
Arguments
- units: Positive integer, dimensionality of the output space.
- activation: Activation function to use (see activations). Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
- recurrent_activation: Activation function to use for the recurrent step (see activations). Default: sigmoid (sigmoid). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
- use_bias: Boolean, whether the layer uses a bias vector.
- kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers).
- recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers).
- bias_initializer: Initializer for the bias vector (see initializers).
- unit_forget_bias: Boolean. If True, add 1 to the bias of the forget gate at initialization. Setting it to true will also force bias_initializer="zeros". This is recommended in Jozefowicz et al. (2015).
- kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer).
- recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer).
- bias_regularizer: Regularizer function applied to the bias vector (see regularizer).
- kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints).
- recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints).
- bias_constraint: Constraint function applied to the bias vector (see constraints).
- dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.
- recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.
- implementation: Implementation mode, either 1 or 2. Mode 1 will structure its operations as a larger number of smaller dot products and additions, whereas mode 2 will batch them into fewer, larger operations. These modes will have different performance profiles on different hardware and for different applications.
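To make the implementation and unit_forget_bias arguments concrete, here is a minimal NumPy sketch of a single LSTM step. It is not the Keras source: it assumes the four gates are stored side by side in the order input, forget, cell candidate, output, it ignores dropout, and the helper names init_bias, lstm_step_mode1 and lstm_step_mode2 are hypothetical. The point is only that the two modes organize the same arithmetic differently and should agree numerically.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_bias(units, unit_forget_bias=True):
    # Zeros everywhere; ones in the forget-gate slice when requested.
    # Assumed gate layout: [input | forget | candidate | output].
    bias = np.zeros(4 * units)
    if unit_forget_bias:
        bias[units:2 * units] = 1.0
    return bias


def lstm_step_mode1(x, h, c, kernel, recurrent_kernel, bias):
    # Mode 1 style: one small dot product per gate.
    k = np.split(kernel, 4, axis=1)
    r = np.split(recurrent_kernel, 4, axis=1)
    b = np.split(bias, 4)
    i = sigmoid(x @ k[0] + h @ r[0] + b[0])
    f = sigmoid(x @ k[1] + h @ r[1] + b[1])
    g = np.tanh(x @ k[2] + h @ r[2] + b[2])
    o = sigmoid(x @ k[3] + h @ r[3] + b[3])
    c_next = f * c + i * g
    return o * np.tanh(c_next), c_next


def lstm_step_mode2(x, h, c, kernel, recurrent_kernel, bias):
    # Mode 2 style: one large dot product, then slice out the gates.
    z = x @ kernel + h @ recurrent_kernel + bias
    zi, zf, zg, zo = np.split(z, 4, axis=1)
    c_next = sigmoid(zf) * c + sigmoid(zi) * np.tanh(zg)
    return sigmoid(zo) * np.tanh(c_next), c_next


rng = np.random.default_rng(0)
batch, input_dim, units = 2, 5, 3
x = rng.normal(size=(batch, input_dim))
h = rng.normal(size=(batch, units))
c = rng.normal(size=(batch, units))
kernel = rng.normal(size=(input_dim, 4 * units))
recurrent_kernel = rng.normal(size=(units, 4 * units))
bias = init_bias(units)

h1, c1 = lstm_step_mode1(x, h, c, kernel, recurrent_kernel, bias)
h2, c2 = lstm_step_mode2(x, h, c, kernel, recurrent_kernel, bias)
print(np.allclose(h1, h2), np.allclose(c1, c2))
```

Which mode runs faster depends on hardware and problem size, so it is worth timing both on the target workload rather than assuming one is better.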
CuDNNGRU
keras.layers.CuDNNGRU(units, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, return_sequences=False, return_state=False, stateful=False)
Fast GRU implementation backed by CuDNN.
Can only be run on GPU, with the TensorFlow backend.
Arguments
- units: Positive integer, dimensionality of the output space.
- kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers).
- recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers).
- bias_initializer: Initializer for the bias vector (see initializers).
- kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer).
- recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer).
- bias_regularizer: Regularizer function applied to the bias vector (see regularizer).
- activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see regularizer).
- kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints).
- recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints).
- bias_constraint: Constraint function applied to the bias vector (see constraints).
- return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
- return_state: Boolean. Whether to return the last state in addition to the output.
- stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
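The effect of return_sequences and return_state on output shapes can be illustrated with a small NumPy sketch. Note that this uses a plain tanh recurrence as a stand-in, not a GRU and not the CuDNN kernel; the function run_recurrent and its weights are hypothetical and only mimic how the two flags select what is returned.

```python
import numpy as np


def run_recurrent(inputs, kernel, recurrent_kernel,
                  return_sequences=False, return_state=False):
    """Toy tanh recurrence over inputs of shape (batch, timesteps, features)."""
    batch, timesteps, _ = inputs.shape
    units = recurrent_kernel.shape[0]
    state = np.zeros((batch, units))
    outputs = []
    for t in range(timesteps):
        state = np.tanh(inputs[:, t, :] @ kernel + state @ recurrent_kernel)
        outputs.append(state)
    # Full sequence: (batch, timesteps, units); last output: (batch, units).
    output = np.stack(outputs, axis=1) if return_sequences else outputs[-1]
    return (output, state) if return_state else output


rng = np.random.default_rng(0)
inputs = rng.normal(size=(4, 7, 3))        # batch=4, timesteps=7, features=3
kernel = rng.normal(size=(3, 5))           # units=5
recurrent_kernel = rng.normal(size=(5, 5))

last = run_recurrent(inputs, kernel, recurrent_kernel)
seq = run_recurrent(inputs, kernel, recurrent_kernel, return_sequences=True)
seq2, state = run_recurrent(inputs, kernel, recurrent_kernel,
                            return_sequences=True, return_state=True)

print(last.shape)   # (4, 5)
print(seq.shape)    # (4, 7, 5)
print(state.shape)  # (4, 5)
```

In this sketch the returned state equals the last timestep of the full sequence, which is why return_state is mostly useful when the state must be handed to another layer or to a later call.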
CuDNNLSTM
keras.layers.CuDNNLSTM(units, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, return_sequences=False, return_state=False, stateful=False)
Fast LSTM implementation with CuDNN.
Can only be run on GPU, with the TensorFlow backend.
Arguments
- units: Positive integer, dimensionality of the output space.
- kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers).
- recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers).
- bias_initializer: Initializer for the bias vector (see initializers).
- unit_forget_bias: Boolean. If True, add 1 to the bias of the forget gate at initialization. Setting it to True will also force bias_initializer="zeros". This is recommended in Jozefowicz et al. (2015).
- kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer).
- recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer).
- bias_regularizer: Regularizer function applied to the bias vector (see regularizer).
- activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see regularizer).
- kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints).
- recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints).
- bias_constraint: Constraint function applied to the bias vector (see constraints).
- return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
- return_state: Boolean. Whether to return the last state in addition to the output.
- stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
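The stateful option can be pictured as carrying the final state of one batch into the next. The sketch below is a plain NumPy LSTM, not the CuDNN implementation; the function lstm_run, the gate order (input, forget, candidate, output) and the random weights are assumptions made for illustration. It checks that processing a long sequence in two consecutive chunks, with the state carried across, matches processing it in one pass.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_run(inputs, kernel, recurrent_kernel, bias, state=None):
    """Run an LSTM over (batch, timesteps, features); return final (h, c)."""
    batch = inputs.shape[0]
    units = recurrent_kernel.shape[0]
    if state is None:
        # Fresh zero state, as for a non-stateful layer.
        h = np.zeros((batch, units))
        c = np.zeros((batch, units))
    else:
        h, c = state
    for t in range(inputs.shape[1]):
        z = inputs[:, t, :] @ kernel + h @ recurrent_kernel + bias
        zi, zf, zg, zo = np.split(z, 4, axis=1)
        c = sigmoid(zf) * c + sigmoid(zi) * np.tanh(zg)
        h = sigmoid(zo) * np.tanh(c)
    return h, c


rng = np.random.default_rng(0)
batch, timesteps, features, units = 2, 6, 3, 4
kernel = rng.normal(size=(features, 4 * units))
recurrent_kernel = rng.normal(size=(units, 4 * units))
bias = np.zeros(4 * units)
bias[units:2 * units] = 1.0  # forget-gate bias of 1, as with unit_forget_bias

long_inputs = rng.normal(size=(batch, timesteps, features))

# One pass over the whole sequence.
h_full, c_full = lstm_run(long_inputs, kernel, recurrent_kernel, bias)

# Two consecutive batches; sample i in the second batch continues sample i
# of the first, so its initial state is the first batch's final state.
first, second = long_inputs[:, :3, :], long_inputs[:, 3:, :]
carried = lstm_run(first, kernel, recurrent_kernel, bias)
h_chunked, c_chunked = lstm_run(second, kernel, recurrent_kernel, bias,
                                state=carried)

print(np.allclose(h_full, h_chunked), np.allclose(c_full, c_chunked))
```

This only holds when sample i of each batch really is the continuation of sample i of the previous batch, which is why stateful training needs a fixed batch size and unshuffled, consistently ordered batches.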