Deep learning for beginners (Part 8): Improving our tuning script & using the Keras tuner

This is the eighth and final part in the Deep Learning for Beginners series. The full series is:

  1. Deep learning for beginners (Part 1): neurons & activation functions
  2. Deep learning for beginners (Part 2): some key terminology and concepts of neural networks
  3. Deep learning for beginners (Part 3): implementing our first Multi Layer Perceptron (MLP) model
  4. Deep learning for beginners (Part 4): inspecting our Multi Layer Perceptron (MLP) model
  5. Deep learning for beginners (Part 5): our first foray into Keras
  6. Deep learning for beginners (Part 6): more terminology to optimise our Keras model
  7. Deep learning for beginners (Part 7): neural network design (layers & neurons)
  8. Deep learning for beginners (Part 8): Improving our tuning script & using the Keras tuner

In part 7, we talked about how we could iterate over different configurations to find the optimal setting for the network. We did this with a function we wrote ourselves, which tested different numbers of neurons, activation functions, etc. to determine the best configuration. That function only handled a neural network with a single hidden layer, so I’ve updated it below to support multiple hidden layers:

import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split
import random

df = pd.read_csv('/home/Datasets/creditcard.csv')

output = df['Class']
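features = df.drop(columns='Class')

# NOTE: df (not features) is split here, so the 'Class' label is also passed to the model as an input column.
# That is why input_shape is (31,) further down. To avoid the leak, split `features` instead and use input_shape=(30,).
train_features, test_features, train_labels, test_labels = train_test_split(df, output, test_size = 0.2, random_state = 42)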

# ds_te = tf.data.Dataset.from_tensor_slices((dict(test_features), test_labels))
# ds_tr = tf.data.Dataset.from_tensor_slices((dict(train_features), train_labels))
train_features = tf.convert_to_tensor(train_features)
test_features = tf.convert_to_tensor(test_features)
train_labels = tf.convert_to_tensor(train_labels)
test_labels = tf.convert_to_tensor(test_labels)

out = []

num_nodes = [[4, 5, 6], [2, 3, 4]] # each nested list represents the options for one layer
act_functions = [tf.nn.relu]
optimizers = ['SGD']
loss_functions = ['binary_crossentropy'] # single sigmoid output, so binary (not categorical) cross-entropy
epochs_count = ['10']
batch_sizes = ['500']

rounds = 1

while rounds <=4:
    model = tf.keras.Sequential()
    act = random.choice(act_functions)
    opt = random.choice(optimizers)
    ep = random.choice(epochs_count)
    batch = random.choice(batch_sizes)
    loss = random.choice(loss_functions)     
    
    model.add(tf.keras.layers.Dense(31, activation = act, input_shape=(31,)))  

    # pick one node count per hidden layer and remember all of them
    counts = []
    for options in num_nodes:
        count = random.choice(options)
        counts.append(count)
        model.add(tf.keras.layers.Dense(count, activation = act))
        
    model.add(tf.keras.layers.Dense(1, activation = 'sigmoid'))
    model.compile(loss = loss,
             optimizer = opt,
             metrics = ['accuracy'])

    epochs = int(ep)
    batch_size = int(batch)
    history = model.fit(train_features, train_labels, epochs=epochs, batch_size=batch_size)
    acc = history.history['accuracy']
    epoch_loss = history.history['loss']

    # store the node counts of every hidden layer, not just the last one
    out.append([counts, act, acc, epoch_loss, ep, opt, batch])
    rounds = rounds + 1
                        
columns = ['num_nodes', 'activation_func', 'accuracy_per_epoch', 'loss', 'epochs', 'opt', 'batch']
df = pd.DataFrame(out)
df.columns = columns

def max_epoch_accuracy(row):
    # highest training accuracy reached in any epoch of this run
    return max(float(item) for item in row['accuracy_per_epoch'])

df['max_epoch_acc'] = df.apply(max_epoch_accuracy, axis=1)
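
The per-layer sampling used in the loop can also be pulled out into a small helper, which makes it easier to see what is being chosen for each hidden layer. This is only a sketch: `sample_layer_widths` is a hypothetical name that does not appear in the script, and the seed of 42 is an arbitrary choice so that a run can be repeated.

```python
import random

def sample_layer_widths(options_per_layer, rng=random):
    """Pick one node count for each hidden layer from that layer's options."""
    return [rng.choice(options) for options in options_per_layer]

num_nodes = [[4, 5, 6], [2, 3, 4]]  # same search space as the script
rng = random.Random(42)             # assumption: seeded so the choice is repeatable

widths = sample_layer_widths(num_nodes, rng)
print(widths)  # a list with one node count per hidden layer
```

Adding a third nested list to `num_nodes` would give a third hidden layer without any other change to the helper.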

In the script above, I have defined two hidden layers with num_nodes = [[4, 5, 6], [2, 3, 4]]: the first hidden layer gets 4, 5 or 6 nodes and the second gets 2, 3 or 4. On each round, the script picks one value per layer at random, so we can find out which combination of node counts works best. When we look at the model summary, we can see that we now do indeed have 4 layers (input, 2 x hidden layers, output).

model.summary()

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_14 (Dense)             (None, 31)                992       
_________________________________________________________________
dense_15 (Dense)             (None, 6)                 192       
_________________________________________________________________
dense_16 (Dense)             (None, 3)                 21        
_________________________________________________________________
dense_17 (Dense)             (None, 1)                 4         
=================================================================
Total params: 1,209
Trainable params: 1,209
Non-trainable params: 0
_________________________________________________________________
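
As a quick check on the Param # column, the figures can be reproduced by hand. A Dense layer has one weight per input-unit pair plus one bias per unit, so its parameter count is inputs × units + units. The sketch below applies that to the layer sizes in the summary; `dense_params` is a hypothetical helper name used only for illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# input width, followed by the units of each Dense layer in the summary
layer_sizes = [31, 31, 6, 3, 1]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [992, 192, 21, 4]
print(total)      # 1209
```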

There is also an ‘out of the box’ tool called the Keras tuner. As shown below, it’s quite simple to use, although in my opinion it’s less complete and flexible than the option above.

import keras_tuner as kt
import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split

df = pd.read_csv('/home/Datasets/creditcard.csv')

output = df['Class']
features = df.drop(columns='Class')

# split the features only, so the 'Class' label is not fed to the model as an input
train_features, test_features, train_labels, test_labels = train_test_split(features, output, test_size = 0.2, random_state = 42)

train_features = train_features.to_numpy()
test_features = test_features.to_numpy()
train_labels = train_labels.to_numpy()
test_labels = test_labels.to_numpy()
 
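def build_model(hp):
  model = keras.Sequential()
  model.add(keras.layers.Dense(
      units=hp.Choice('units', [8, 16, 32]),
      activation=hp.Choice('activation', ['relu', 'sigmoid'])))
  # sigmoid output + binary cross-entropy for a binary label, matching the first script
  model.add(keras.layers.Dense(1, activation='sigmoid'))
  model.compile(loss='binary_crossentropy', metrics=['accuracy'])
  return model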

#INITIALIZE TUNER - OBJECTIVE TO FIND BEST MODEL. TRIALS = MODELS TO TRY
tuner = kt.RandomSearch(
    build_model,
    objective='val_loss',
    max_trials=5)

#RUN THE TUNER & RETURN THE BEST MODEL
tuner.search(train_features, train_labels, epochs=5, validation_data=(test_features, test_labels))
best_model = tuner.get_best_models()[0]

# CITING KERAS
# @misc{omalley2019kerastuner,
#     title        = {KerasTuner},
#     author       = {O'Malley, Tom and Bursztein, Elie and Long, James and Chollet, Fran\c{c}ois and Jin, Haifeng and Invernizzi, Luca and others},
#     year         = 2019,
#     howpublished = {\url{https://github.com/keras-team/keras-tuner}}
# }

As it searches, the tuner prints a summary for each trial, which is quite visually pleasing & helpful, but again, it doesn’t give as much depth as the dataframe our script above produces.

So it’s quite nice and easy to tune our model. Tuning can be quite computationally intensive, because every trial trains a model from scratch, but it’s a more favorable option than manually checking all the parameters!
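
To get a feel for why tuning gets expensive, it helps to count how many distinct configurations a search space contains: the total is the product of the number of options for each setting. The sketch below does that for an illustrative search space. The two node-count lists come from the script earlier in this post, while the extra activation and batch size options are made up for the example.

```python
from itertools import product
from math import prod

# Illustrative search space; 'activation' and 'batch_size' values are assumptions
search_space = {
    'layer_1_nodes': [4, 5, 6],
    'layer_2_nodes': [2, 3, 4],
    'activation': ['relu', 'sigmoid'],
    'batch_size': [250, 500],
}

# number of distinct configurations = product of the option counts
grid_size = prod(len(options) for options in search_space.values())

# every configuration, as a dict of setting name -> chosen value
all_configs = [dict(zip(search_space, combo))
               for combo in product(*search_space.values())]

print(grid_size)       # 36
print(all_configs[0])  # first configuration in the grid
```

Every extra option multiplies the total, which is why both the script above and the tuner's random search only try a handful of configurations rather than all of them.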

Kodey