This is the eighth and final part in the Deep Learning for Beginners series. The other parts of the series are:
- Deep learning for beginners (Part 1): neurons & activation functions
- Deep learning for beginners (Part 2): some key terminology and concepts of neural networks
- Deep learning for beginners (Part 3): implementing our first Multi Layer Perceptron (MLP) model
- Deep learning for beginners (Part 4): inspecting our Multi Layer Perceptron (MLP) model
- Deep learning for beginners (Part 5): our first foray into Keras
- Deep learning for beginners (Part 6): more terminology to optimise our Keras model
- Deep learning for beginners (Part 7): neural network design (layers & neurons)
- Deep learning for beginners (Part 8): improving our tuning script & using the Keras tuner
In part 7, we talked about how we could iterate over different configurations to find the optimal settings for the network. That was a function we wrote ourselves to test different numbers of neurons, activation functions and so on, but it only covered a neural network with a single hidden layer. So, I've updated the script below to handle multiple hidden layers:
import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split
import random
df = pd.read_csv('/home/Datasets/creditcard.csv')
output = df['Class']
features = df.drop('Class', axis=1)
train_features, test_features, train_labels, test_labels = train_test_split(features, output, test_size=0.2, random_state=42)
train_features = tf.convert_to_tensor(train_features)
test_features = tf.convert_to_tensor(test_features)
train_labels = tf.convert_to_tensor(train_labels)
test_labels = tf.convert_to_tensor(test_labels)
out = []
num_nodes = [[4, 5, 6], [2, 3, 4]]  # each nested list holds the node-count options for one hidden layer
act_functions = [tf.nn.relu]
optimizers = ['SGD']
loss_functions = ['binary_crossentropy']  # single sigmoid output, so we use a binary loss
epochs_count = [10]
batch_sizes = [500]
rounds = 1
while rounds <= 4:
    model = tf.keras.Sequential()
    # randomly pick one option for each hyperparameter
    act = random.choice(act_functions)
    opt = random.choice(optimizers)
    ep = random.choice(epochs_count)
    batch = random.choice(batch_sizes)
    loss = random.choice(loss_functions)
    # input layer: creditcard.csv has 30 feature columns once 'Class' is dropped
    model.add(tf.keras.layers.Dense(31, activation=act, input_shape=(30,)))
    # add one hidden layer per nested list, picking a node count at random for each
    counts = []
    for options in num_nodes:
        count = random.choice(options)
        counts.append(count)
        model.add(tf.keras.layers.Dense(count, activation=act))
    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
    model.compile(loss=loss,
                  optimizer=opt,
                  metrics=['accuracy'])
    model.fit(train_features, train_labels, epochs=ep, batch_size=batch)
    acc = model.history.history['accuracy']
    loss_history = model.history.history['loss']
    out.append([counts, act, acc, loss_history, ep, opt, batch])
    rounds = rounds + 1
columns = ['num_nodes', 'activation_func', 'accuracy_per_epoch', 'loss', 'epochs', 'opt', 'batch']
results = pd.DataFrame(out, columns=columns)
def max_epoch_accuracy(row):
    # take the best accuracy reached across the epochs of a run
    accuracies = [float(item) for item in row['accuracy_per_epoch']]
    return max(accuracies)
results['max_epoch_acc'] = results.apply(max_epoch_accuracy, axis=1)
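Once the loop has finished, the quickest way to see which configuration won is to sort the results dataframe. This isn't part of the original script, just a small addition on top of the dataframe built above:
# sort the tuning runs so the configuration with the highest accuracy comes first
best_runs = results.sort_values('max_epoch_acc', ascending=False)
print(best_runs[['num_nodes', 'epochs', 'batch', 'max_epoch_acc']].head())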
In the code above, I have defined two hidden layers with [[4, 5, 6], [2, 3, 4]], so we want to iterate through and find out which combination of node counts in each layer is best (random sampling won't necessarily try them all; an exhaustive alternative is sketched after the summary below). When we look at the model summary, we can see that we now do indeed have four layers (input, two hidden layers and the output).
model.summary()
Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_14 (Dense)             (None, 31)                961
_________________________________________________________________
dense_15 (Dense)             (None, 6)                 192
_________________________________________________________________
dense_16 (Dense)             (None, 3)                 21
_________________________________________________________________
dense_17 (Dense)             (None, 1)                 4
=================================================================
Total params: 1,178
Trainable params: 1,178
Non-trainable params: 0
_________________________________________________________________
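Because the script above picks node counts at random on each round, it may not visit every combination. If we wanted to be exhaustive, we could enumerate every combination of the nested lists with itertools.product and build a model for each. This isn't part of the original script, just a sketch of how the same pieces could be wired together:
import itertools

# every (layer 1, layer 2) node-count combination from [[4, 5, 6], [2, 3, 4]] - nine in total
for combo in itertools.product(*num_nodes):
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(31, activation=tf.nn.relu, input_shape=(30,)))
    for count in combo:
        model.add(tf.keras.layers.Dense(count, activation=tf.nn.relu))
    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='SGD', metrics=['accuracy'])
    history = model.fit(train_features, train_labels, epochs=10, batch_size=500)
    print(combo, max(history.history['accuracy']))
With two layers of three options each this trains nine models, so the cost grows quickly as you add options, which is exactly why random sampling (or the Keras Tuner below) is attractive.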
There is also an ‘out of the box’ tool called the Keras Tuner. As you can see below, it's quite simple to use, although in my opinion it's less complete and flexible than the script we built above.
import keras_tuner as kt
import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split
df = pd.read_csv('/home/Datasets/creditcard.csv')
output = df['Class']
features = df.drop('Class', axis=1)
train_features, test_features, train_labels, test_labels = train_test_split(features, output, test_size=0.2, random_state=42)
train_features = train_features.to_numpy()
test_features = test_features.to_numpy()
train_labels = train_labels.to_numpy()
test_labels = test_labels.to_numpy()
def build_model(hp):
    model = keras.Sequential()
    # let the tuner choose the width and the activation of the hidden layer
    model.add(keras.layers.Dense(
        hp.Choice('units', [8, 16, 32]),
        activation=hp.Choice('activation', ['relu', 'sigmoid'])))
    # single sigmoid output for the binary fraud label
    model.add(keras.layers.Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', metrics=['accuracy'])
    return model
#INITIALIZE TUNER - OBJECTIVE TO FIND BEST MODEL. TRIALS = MODELS TO TRY
tuner = kt.RandomSearch(
    build_model,
    objective='val_loss',
    max_trials=5)
#RUN THE TUNER & RETURN THE BEST MODEL
tuner.search(train_features, train_labels, epochs=5, validation_data=(test_features, test_labels))
best_model = tuner.get_best_models()[0]
# CITING KERAS
# @misc{omalley2019kerastuner,
# title = {KerasTuner},
# author = {O'Malley, Tom and Bursztein, Elie and Long, James and Chollet, Fran\c{c}ois and Jin, Haifeng and Invernizzi, Luca and others},
# year = 2019,
# howpublished = {\url{https://github.com/keras-team/keras-tuner}}
# }
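One nice thing about this approach is that build_model can be extended to search over more than just the layer width. As a rough sketch (the learning-rate range and the use of Adam here are my own illustrative choices, not from the original script), we could also let the tuner pick the learning rate:
def build_model(hp):
    model = keras.Sequential()
    model.add(keras.layers.Dense(
        hp.Int('units', min_value=8, max_value=64, step=8),
        activation=hp.Choice('activation', ['relu', 'sigmoid'])))
    model.add(keras.layers.Dense(1, activation='sigmoid'))
    # also let the tuner pick the learning rate on a log scale
    lr = hp.Float('learning_rate', min_value=1e-4, max_value=1e-2, sampling='log')
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=lr),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model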
While the search runs, the tuner prints a summary of each trial's hyperparameters and scores, which is quite visually pleasing and helpful, but again it doesn't give as much depth as the dataframe we output from our own script.

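If you do want more detail back out of the tuner afterwards, the standard KerasTuner calls below should do it (a quick sketch, assuming tuner.search has already finished):
# print every trial's hyperparameters and scores, best first
tuner.results_summary()

# the winning hyperparameter values as a plain dict
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hps.values)

# check the best model against the held-out data
best_model.evaluate(test_features, test_labels)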
So it's quite easy to tune our model. It can be computationally intensive, but it's a far better option than checking every parameter combination by hand!