6-1 Three Ways of Modeling

There are three ways to build a model: using Sequential to construct a model as an ordered stack of layers; using the functional API to construct a model with an arbitrary structure; or writing a child class that inherits from the base class Model.

For models with a simple stacked structure, the Sequential method should be given the highest priority.

For models with non-sequential structures such as multiple inputs/outputs, shared weights, or residual connections, modeling with the functional API is recommended.

Modeling through a child class of Model should be AVOIDED unless there are special requirements. This method is flexible, but also error-prone.
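
To make the contrast concrete, here is a minimal sketch (not part of the IMDB example below) of the same one-layer binary classifier written in all three styles; the name TinyModel is illustrative:

    import tensorflow as tf
    from tensorflow.keras import layers, models

    # Style 1: Sequential -- an ordered stack of layers
    seq_model = models.Sequential([
        layers.Dense(1, activation="sigmoid", input_shape=(16,))
    ])

    # Style 2: functional API -- explicit input/output tensors, allows arbitrary graphs
    inputs = layers.Input(shape=(16,))
    outputs = layers.Dense(1, activation="sigmoid")(inputs)
    fn_model = models.Model(inputs=inputs, outputs=outputs)

    # Style 3: child class of Model -- the most flexible, and the easiest to get wrong
    class TinyModel(models.Model):
        def __init__(self):
            super(TinyModel, self).__init__()
            self.dense = layers.Dense(1, activation="sigmoid")

        def call(self, x):
            return self.dense(x)

    sub_model = TinyModel()
    sub_model.build(input_shape=(None, 16))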

Here are examples of modeling an IMDB movie-review classifier with the three methods mentioned above.

    import numpy as np
    import pandas as pd
    import tensorflow as tf
    from tqdm import tqdm
    from tensorflow.keras import *

    train_token_path = "../data/imdb/train_token.csv"
    test_token_path = "../data/imdb/test_token.csv"

    MAX_WORDS = 10000  # We will only consider the top 10,000 words in the dataset
    MAX_LEN = 200      # We will cut reviews after 200 words
    BATCH_SIZE = 20

    # Constructing the data pipeline
    def parse_line(line):
        t = tf.strings.split(line, "\t")
        label = tf.reshape(tf.cast(tf.strings.to_number(t[0]), tf.int32), (-1,))
        features = tf.cast(tf.strings.to_number(tf.strings.split(t[1], " ")), tf.int32)
        return (features, label)

    ds_train = tf.data.TextLineDataset(filenames=[train_token_path]) \
        .map(parse_line, num_parallel_calls=tf.data.experimental.AUTOTUNE) \
        .shuffle(buffer_size=1000).batch(BATCH_SIZE) \
        .prefetch(tf.data.experimental.AUTOTUNE)

    ds_test = tf.data.TextLineDataset(filenames=[test_token_path]) \
        .map(parse_line, num_parallel_calls=tf.data.experimental.AUTOTUNE) \
        .shuffle(buffer_size=1000).batch(BATCH_SIZE) \
        .prefetch(tf.data.experimental.AUTOTUNE)
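
Judging from the logic of parse_line, each line of the token files holds a label, a tab, and space-separated word indices. A quick sanity check on a synthetic line (the sample values are made up):

    sample = tf.constant("1\t12 7 865 14")  # hypothetical line: label, tab, token ids
    features, label = parse_line(sample)
    print(features)  # tf.Tensor([ 12   7 865  14], shape=(4,), dtype=int32)
    print(label)     # tf.Tensor([1], shape=(1,), dtype=int32)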

1. Modeling Using Sequential

    tf.keras.backend.clear_session()

    model = models.Sequential()
    model.add(layers.Embedding(MAX_WORDS, 7, input_length=MAX_LEN))
    model.add(layers.Conv1D(filters=64, kernel_size=5, activation="relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters=32, kernel_size=3, activation="relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(1, activation="sigmoid"))

    model.compile(optimizer='Nadam',
                  loss='binary_crossentropy',
                  metrics=['accuracy', "AUC"])

    model.summary()

6-1 Three Ways of Modeling - Figure 1

    import datetime

    baselogger = callbacks.BaseLogger(stateful_metrics=["AUC"])
    logdir = "../data/keras_model/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
    history = model.fit(ds_train, validation_data=ds_test,
                        epochs=6, callbacks=[baselogger, tensorboard_callback])
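
As an aside, the logs written by the TensorBoard callback can be viewed directly inside Jupyter with the notebook magics below (assuming the tensorboard package is installed; the path must match logdir):

    %load_ext tensorboard
    %tensorboard --logdir ../data/keras_model/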

    %matplotlib inline
    %config InlineBackend.figure_format = 'svg'

    import matplotlib.pyplot as plt

    def plot_metric(history, metric):
        train_metrics = history.history[metric]
        val_metrics = history.history['val_' + metric]
        epochs = range(1, len(train_metrics) + 1)
        plt.plot(epochs, train_metrics, 'bo--')
        plt.plot(epochs, val_metrics, 'ro-')
        plt.title('Training and validation ' + metric)
        plt.xlabel("Epochs")
        plt.ylabel(metric)
        plt.legend(["train_" + metric, 'val_' + metric])
        plt.show()

    plot_metric(history, "AUC")

6-1 Three Ways of Modeling - Figure 2

2. Modeling Using Functional API

    tf.keras.backend.clear_session()

    inputs = layers.Input(shape=[MAX_LEN])
    x = layers.Embedding(MAX_WORDS, 7)(inputs)

    branch1 = layers.SeparableConv1D(64, 3, activation="relu")(x)
    branch1 = layers.MaxPool1D(3)(branch1)
    branch1 = layers.SeparableConv1D(32, 3, activation="relu")(branch1)
    branch1 = layers.GlobalMaxPool1D()(branch1)

    branch2 = layers.SeparableConv1D(64, 5, activation="relu")(x)
    branch2 = layers.MaxPool1D(5)(branch2)
    branch2 = layers.SeparableConv1D(32, 5, activation="relu")(branch2)
    branch2 = layers.GlobalMaxPool1D()(branch2)

    branch3 = layers.SeparableConv1D(64, 7, activation="relu")(x)
    branch3 = layers.MaxPool1D(7)(branch3)
    branch3 = layers.SeparableConv1D(32, 7, activation="relu")(branch3)
    branch3 = layers.GlobalMaxPool1D()(branch3)

    concat = layers.Concatenate()([branch1, branch2, branch3])
    outputs = layers.Dense(1, activation="sigmoid")(concat)

    model = models.Model(inputs=inputs, outputs=outputs)

    model.compile(optimizer='Nadam',
                  loss='binary_crossentropy',
                  metrics=['accuracy', "AUC"])

    model.summary()
    Model: "model"
    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to
    ==================================================================================================
    input_1 (InputLayer)            [(None, 200)]        0
    __________________________________________________________________________________________________
    embedding (Embedding)           (None, 200, 7)       70000       input_1[0][0]
    __________________________________________________________________________________________________
    separable_conv1d (SeparableConv (None, 198, 64)      533         embedding[0][0]
    __________________________________________________________________________________________________
    separable_conv1d_2 (SeparableCo (None, 196, 64)      547         embedding[0][0]
    __________________________________________________________________________________________________
    separable_conv1d_4 (SeparableCo (None, 194, 64)      561         embedding[0][0]
    __________________________________________________________________________________________________
    max_pooling1d (MaxPooling1D)    (None, 66, 64)       0           separable_conv1d[0][0]
    __________________________________________________________________________________________________
    max_pooling1d_1 (MaxPooling1D)  (None, 39, 64)       0           separable_conv1d_2[0][0]
    __________________________________________________________________________________________________
    max_pooling1d_2 (MaxPooling1D)  (None, 27, 64)       0           separable_conv1d_4[0][0]
    __________________________________________________________________________________________________
    separable_conv1d_1 (SeparableCo (None, 64, 32)       2272        max_pooling1d[0][0]
    __________________________________________________________________________________________________
    separable_conv1d_3 (SeparableCo (None, 35, 32)       2400        max_pooling1d_1[0][0]
    __________________________________________________________________________________________________
    separable_conv1d_5 (SeparableCo (None, 21, 32)       2528        max_pooling1d_2[0][0]
    __________________________________________________________________________________________________
    global_max_pooling1d (GlobalMax (None, 32)           0           separable_conv1d_1[0][0]
    __________________________________________________________________________________________________
    global_max_pooling1d_1 (GlobalM (None, 32)           0           separable_conv1d_3[0][0]
    __________________________________________________________________________________________________
    global_max_pooling1d_2 (GlobalM (None, 32)           0           separable_conv1d_5[0][0]
    __________________________________________________________________________________________________
    concatenate (Concatenate)       (None, 96)           0           global_max_pooling1d[0][0]
                                                                     global_max_pooling1d_1[0][0]
                                                                     global_max_pooling1d_2[0][0]
    __________________________________________________________________________________________________
    dense (Dense)                   (None, 1)            97          concatenate[0][0]
    ==================================================================================================
    Total params: 78,938
    Trainable params: 78,938
    Non-trainable params: 0
    __________________________________________________________________________________________________
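
For a multi-branch graph like this one, the structure is often easier to read as a diagram than as a summary table. A minimal sketch using tf.keras.utils.plot_model (assuming pydot and graphviz are installed; the file name is illustrative):

    # Render the three-branch structure as a diagram, with output shapes on each node
    tf.keras.utils.plot_model(model, to_file="model_structure.png", show_shapes=True)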

6-1 Three Ways of Modeling - Figure 3

    import datetime

    logdir = "../data/keras_model/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
    history = model.fit(ds_train, validation_data=ds_test, epochs=6, callbacks=[tensorboard_callback])

    Epoch 1/6
    1000/1000 [==============================] - 32s 32ms/step - loss: 0.5527 - accuracy: 0.6758 - AUC: 0.7731 - val_loss: 0.3646 - val_accuracy: 0.8426 - val_AUC: 0.9192
    Epoch 2/6
    1000/1000 [==============================] - 24s 24ms/step - loss: 0.3024 - accuracy: 0.8737 - AUC: 0.9444 - val_loss: 0.3281 - val_accuracy: 0.8644 - val_AUC: 0.9350
    Epoch 3/6
    1000/1000 [==============================] - 24s 24ms/step - loss: 0.2158 - accuracy: 0.9159 - AUC: 0.9715 - val_loss: 0.3461 - val_accuracy: 0.8666 - val_AUC: 0.9363
    Epoch 4/6
    1000/1000 [==============================] - 24s 24ms/step - loss: 0.1492 - accuracy: 0.9464 - AUC: 0.9859 - val_loss: 0.4017 - val_accuracy: 0.8568 - val_AUC: 0.9311
    Epoch 5/6
    1000/1000 [==============================] - 24s 24ms/step - loss: 0.0944 - accuracy: 0.9696 - AUC: 0.9939 - val_loss: 0.4998 - val_accuracy: 0.8550 - val_AUC: 0.9233
    Epoch 6/6
    1000/1000 [==============================] - 26s 26ms/step - loss: 0.0526 - accuracy: 0.9865 - AUC: 0.9977 - val_loss: 0.6463 - val_accuracy: 0.8462 - val_AUC: 0.9138
    plot_metric(history, "AUC")

6-1 Three Ways of Modeling - Figure 4

3. Customized Modeling Using a Child Class of Model

    # Define a customized residual module as a Layer
    class ResBlock(layers.Layer):
        def __init__(self, kernel_size, **kwargs):
            super(ResBlock, self).__init__(**kwargs)
            self.kernel_size = kernel_size

        def build(self, input_shape):
            self.conv1 = layers.Conv1D(filters=64, kernel_size=self.kernel_size,
                                       activation="relu", padding="same")
            self.conv2 = layers.Conv1D(filters=32, kernel_size=self.kernel_size,
                                       activation="relu", padding="same")
            # conv3 restores the channel count to that of the input,
            # so the residual addition below is shape-compatible
            self.conv3 = layers.Conv1D(filters=input_shape[-1],
                                       kernel_size=self.kernel_size,
                                       activation="relu", padding="same")
            self.maxpool = layers.MaxPool1D(2)
            super(ResBlock, self).build(input_shape)  # Identical to self.built = True

        def call(self, inputs):
            x = self.conv1(inputs)
            x = self.conv2(x)
            x = self.conv3(x)
            x = layers.Add()([inputs, x])
            x = self.maxpool(x)
            return x

        # get_config needs to be defined so that a model built from this
        # customized layer with the functional API can be serialized (saved).
        def get_config(self):
            config = super(ResBlock, self).get_config()
            config.update({'kernel_size': self.kernel_size})
            return config
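
As the comment on get_config notes, defining it is what makes a model built from ResBlock with the functional API serializable. A minimal sketch of the save/reload round trip (the demo model and file path here are illustrative, not part of the IMDB example):

    demo_inputs = layers.Input(shape=(200, 7))
    demo_outputs = layers.GlobalMaxPool1D()(ResBlock(kernel_size=3)(demo_inputs))
    demo_model = models.Model(inputs=demo_inputs, outputs=demo_outputs)
    demo_model.save("../data/resblock_demo.h5")  # hypothetical path
    # custom_objects maps the saved class name back to the Python class
    demo_model_loaded = models.load_model("../data/resblock_demo.h5",
                                          custom_objects={"ResBlock": ResBlock})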

    # Test ResBlock
    resblock = ResBlock(kernel_size=3)
    resblock.build(input_shape=(None, 200, 7))
    # MaxPool1D(2) halves the sequence length (200 -> 100),
    # while conv3 keeps the channel count at 7
    resblock.compute_output_shape(input_shape=(None, 200, 7))
    TensorShape([None, 100, 7])
    # Customized model, which could also be implemented with Sequential or the functional API
    class ImdbModel(models.Model):
        def __init__(self):
            super(ImdbModel, self).__init__()

        def build(self, input_shape):
            self.embedding = layers.Embedding(MAX_WORDS, 7)
            self.block1 = ResBlock(7)
            self.block2 = ResBlock(5)
            self.dense = layers.Dense(1, activation="sigmoid")
            super(ImdbModel, self).build(input_shape)

        def call(self, x):
            x = self.embedding(x)
            x = self.block1(x)
            x = self.block2(x)
            x = layers.Flatten()(x)  # Flatten has no weights, so creating it in call is harmless
            x = self.dense(x)
            return x

    tf.keras.backend.clear_session()

    model = ImdbModel()
    model.build(input_shape=(None, 200))
    model.summary()

    model.compile(optimizer='Nadam',
                  loss='binary_crossentropy',
                  metrics=['accuracy', "AUC"])
    Model: "imdb_model"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    embedding (Embedding)        multiple                  70000
    _________________________________________________________________
    res_block (ResBlock)         multiple                  19143
    _________________________________________________________________
    res_block_1 (ResBlock)       multiple                  13703
    _________________________________________________________________
    dense (Dense)                multiple                  351
    =================================================================
    Total params: 103,197
    Trainable params: 103,197
    Non-trainable params: 0
    _________________________________________________________________

6-1 Three Ways of Modeling - Figure 5

    import datetime

    logdir = "../tflogs/keras_model/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
    history = model.fit(ds_train, validation_data=ds_test,
                        epochs=6, callbacks=[tensorboard_callback])

    Epoch 1/6
    1000/1000 [==============================] - 47s 47ms/step - loss: 0.5629 - accuracy: 0.6618 - AUC: 0.7548 - val_loss: 0.3422 - val_accuracy: 0.8510 - val_AUC: 0.9286
    Epoch 2/6
    1000/1000 [==============================] - 43s 43ms/step - loss: 0.2648 - accuracy: 0.8903 - AUC: 0.9576 - val_loss: 0.3276 - val_accuracy: 0.8650 - val_AUC: 0.9410
    Epoch 3/6
    1000/1000 [==============================] - 42s 42ms/step - loss: 0.1573 - accuracy: 0.9439 - AUC: 0.9846 - val_loss: 0.3861 - val_accuracy: 0.8682 - val_AUC: 0.9390
    Epoch 4/6
    1000/1000 [==============================] - 42s 42ms/step - loss: 0.0849 - accuracy: 0.9706 - AUC: 0.9950 - val_loss: 0.5324 - val_accuracy: 0.8616 - val_AUC: 0.9292
    Epoch 5/6
    1000/1000 [==============================] - 43s 43ms/step - loss: 0.0393 - accuracy: 0.9876 - AUC: 0.9986 - val_loss: 0.7693 - val_accuracy: 0.8566 - val_AUC: 0.9132
    Epoch 6/6
    1000/1000 [==============================] - 44s 44ms/step - loss: 0.0222 - accuracy: 0.9926 - AUC: 0.9994 - val_loss: 0.9328 - val_accuracy: 0.8584 - val_AUC: 0.9052
    plot_metric(history, "AUC")

6-1 Three Ways of Modeling - Figure 6
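
Because a subclassed model does not carry a static graph of layers, it is typically saved in the TensorFlow SavedModel format rather than HDF5. A minimal sketch (the path is illustrative):

    # Save in SavedModel format, then reload the trained model
    model.save("../data/imdb_model_savedmodel", save_format="tf")
    model_loaded = models.load_model("../data/imdb_model_savedmodel")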

Please leave comments in the WeChat official account "Python与算法之美" (Elegance of Python and Algorithms) if you want to communicate with the author about the content. The author will try his best to reply, given the limited time available.

You are also welcome to join the group chat with the other readers by replying 加群 (join group) in the WeChat official account.
