{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Moderne Methoden der Datenanalyse SS2023\n", "# Practical Exercise 10: Deep Learning" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The classification of handwritten digits is a standard problem in the field of image classification. In this exercise, we will process labeled images of handwritten digits from the MNIST dataset in order to train and test (deep) neural networks on this task. The goal of this exercise is for you to dive into a state-of-the-art software package for large-scale deep learning, to experiment with your own neural-network designs, and to compare your results with modern setups.\n", "\n", "TensorFlow is one of the most popular and powerful tools in the machine learning community and can be used to build, train, and execute large-scale machine-learning models. The core concept of TensorFlow is the representation of the information flow as tensors in a graph. In this exercise, the wrapper Keras is used, which hides this concept to a large extent and makes the library much easier to use.\n", "\n", "The exercise is shipped with a script for the download of relevant data and one notebook. All needed software such as TensorFlow (www.tensorflow.org) and Keras (www.keras.io) is already installed on the Jupyter Machine. \n", "\n", "
\"MNIST_example\"
\n", "
Fig.1: Example images from the MNIST dataset
\n", "
\n", "\n", "\n", "The MNIST dataset (https://yann.lecun.com/exdb/mnist) contains a total of 70000 images of handwritten digits. The images are in greyscale with 28 x 28 pixels each (see Fig.1 for some examples). Execute the script `download_dataset.py` to download and extract the dataset as binary. The script also converts some example images from the binary dataset as `png` files (greyscale-inverted images of Fig.1). Have a look at the example images and at the code.\n", "\n", "In preparation of the training ([Exercise 10.1](#exercise101_TNT)), read the code in the function [`train()`](#train_me) in the Jupyter notebook and identify the part, where the machine learning model is defined. The model contains many components of modern architectures, e.g., convolutional, dense, and maximum pooling layers. Also specified in the code are the loss function, the optimizer, and the validation metrics used for the training. The full documentation is available on the Keras webpage (www.keras.io)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Exercise 10.1: Training and Testing (obligatory)\n", "Now we set up a model, train it, and test its performance.\n", "\n", "- Explain the meaning and function of the various parts of the example model and understand how the total number of trainable parameters, 1765, comes about. \n", " *Hint*: Look at the output of the code line `model.summary()` and keep in mind that there are bias terms. \n", "\n", " What would be the number of parameters for a convolutional (dense) neural network with one hidden layer of n nodes and 10 outputs?\n", "\n", "- Now, modify the code below ([`train()`](#train_me)) and try to achieve the best global accuracy. *Hint*: You will need to increase the model capacity, e.g., larger number of convolution filters or additional dense layers, and the number of epochs. With increasing model capacity, you will quickly understand why GPUs play such a big role in machine learning.\n", "\n", " What would be a good method to evaluate the optimal number of training epochs? Plot the accuracy of the model on training and validation sets as a function of training epochs.\n", " \n", " Can you explain the worse training accuracy compared to the validation accuracy?\n", " \n", "- For your trained model, produce an estimate of your achieved accuracy by running [`apply()`](#apply_and_test) manually on about twenty images, i.e., `example_input_*.png`. \n", "\n", " How does your estimate compare with the result on the test dataset computed with the script [`test()`](#apply_and_test) ? \n", " Do the results match?\n", " \n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Importing usual python packages\n", "import sys, glob, png\n", "import numpy as np\n", "import struct\n", "import matplotlib\n", "#matplotlib.use('Agg') # use this line to change the matplotlib backend\n", "from matplotlib import pyplot as plt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# %%capture output\n", "\n", "# The following imports might cause a warning message which informs you\n", "# that the machine on which the notebook is running does not provide GPUs.\n", "# As these are not necessary, you can ignore this warning. If you want to\n", "# suppress this warning, you cen comment out the line \"%%capture outpute\"\n", "# above.\n", "\n", "# Importing ML related packages. 
\n", "import tensorflow\n", "from tensorflow.keras.models import Sequential, load_model\n", "from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten\n", "from tensorflow.keras.layers import Conv2D, MaxPooling2D\n", "from tensorflow.keras.optimizers import Adam\n", "from keras.utils import np_utils" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from download_dataset import load_data, binary_to_png" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "! python3 download_dataset.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# unpack some images from binary format\n", "binary_to_png({'images': 'train_images.bin', 'labels': 'train_labels.bin'}, 20)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# check the tensorflow version we are using. Should be 2.8.0\n", "print(tensorflow.__file__)\n", "print(tensorflow.__version__)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def train():\n", " # Load training data\n", " images, labels = load_data('train_images.bin', 'train_labels.bin')\n", "\n", " # Convert labels from integers to one-hot vectors\n", " labels = np_utils.to_categorical(labels, 10)\n", "\n", " # Set up model\n", " model = Sequential()\n", " model.add(Conv2D(2, (2, 2), kernel_initializer='glorot_normal', input_shape=(28,28,1)))\n", " model.add(Activation('relu'))\n", " model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))\n", " model.add(Flatten())\n", " model.add(Dense(5, kernel_initializer='glorot_normal'))\n", " model.add(Activation('relu'))\n", " model.add(Dropout(0.5))\n", " model.add(Dense(10, kernel_initializer='glorot_uniform'))\n", " model.add(Activation('softmax'))\n", " \n", " # Define loss function, optimizer algorithm and validation metrics\n", " model.compile(\n", " loss='categorical_crossentropy',\n", " optimizer=Adam(),\n", " metrics=['categorical_accuracy'])\n", "\n", " # Print summary of the model\n", " model.summary()\n", "\n", " # Train model\n", " history = model.fit(images, labels, batch_size=128, epochs=10, validation_split=0.25)\n", " #history = model.fit(images, labels, batch_size=128, epochs=20, validation_split=0.25)\n", "\n", " # Get training and validation loss/accuracy values from history\n", " loss_training = history.history['loss']\n", " loss_validation = history.history['val_loss']\n", " accuracy_training = history.history['categorical_accuracy']\n", " accuracy_validation = history.history['val_categorical_accuracy']\n", "\n", " # TODO: Plot the training and validation loss/accuracy vs the number of epochs\n", " #plt.plot(...)\n", " #plt.savefig('loss_vs_epochs.png') \n", " \n", " # Save model to file\n", " model.save('model.hd5')\n", " \n", " return" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# check if the output model exists\n", "!date\n", "!ls -haltr ./*.hd5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following code blocks are provided to apply the model manually to a list of files, or to test it on the test dataset.\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def apply(png_list):\n", " if len(png_list) < 1:\n", " raise Exception('Please specify at 
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Check that the output model file exists\n", "!date\n", "!ls -haltr ./*.hd5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following code blocks are provided to apply the model manually to a list of files, or to test it on the test dataset.\n", "<a id='apply_and_test'></a>" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def apply(png_list):\n", "    if len(png_list) < 1:\n", "        raise Exception('Please specify at least one PNG image as argument.')\n", "\n", "    # Load trained keras model\n", "    model = load_model('model.hd5')\n", "\n", "    # Get image names from the argument list\n", "    print('Load images:')\n", "    filename_images = []\n", "    for arg in png_list:\n", "        print('    {}'.format(arg))\n", "        filename_images.append(arg)\n", "\n", "    # Load images from files; asDirect() returns (width, height, rows, info),\n", "    # so pngdata[2] iterates over the pixel rows of each image\n", "    images = np.zeros((len(filename_images), 28, 28, 1))\n", "    for i_file, file_ in enumerate(filename_images):\n", "        pngdata = png.Reader(open(file_, 'rb')).asDirect()\n", "        for i_row, row in enumerate(pngdata[2]):\n", "            images[i_file, i_row, :, 0] = row\n", "\n", "    # Predict labels for images\n", "    labels = model.predict(images)\n", "    numbers = np.argmax(labels, axis=1)\n", "    print('Predict labels for images:')\n", "    for file_, number in zip(filename_images, numbers):\n", "        print('    {} : {}'.format(file_, number))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "example_inputs = glob.glob('example_input_*.png')\n", "apply(example_inputs)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def test():\n", "    # Load trained keras model\n", "    model = load_model('model.hd5')\n", "\n", "    # Load test data\n", "    images, labels = load_data('test_images.bin', 'test_labels.bin')\n", "\n", "    # Predict written numbers in images\n", "    labels_predicted = model.predict(images)\n", "\n", "    # Decode the one-hot vectors to labels\n", "    labels_decoded = np.argmax(labels_predicted, axis=1)\n", "\n", "    # Calculate accuracy of prediction\n", "    num_correct = np.sum(labels_decoded == labels)\n", "    print('Accuracy on test dataset: {}'.format(float(num_correct)/len(labels)))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "test()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 10.2 (voluntary)\n", "\n", "Use GIMP (or any other graphics editor) to create images of your own handwritten digits and evaluate whether the performance you experience with your own example images matches the performance achieved during training on the MNIST dataset. You need to create greyscale png images with 28 x 28 pixels; the background of the image has to be black and the digits have to be written in white. The file `your_own_digit.xcf` can be used as a template. Does the model classify your images correctly? If not, what are possible reasons why it does not work as expected?\n", "\n", "Have a look at the website http://yann.lecun.com/exdb/mnist, which holds a leaderboard for the test dataset with the performance of modern (deep) machine-learning models. You might want to compare your model with popular computer vision models such as `LeNet-5`, `VGG-16`, `AlexNet` or `Inception-v4` to get an idea of the complexity of modern neural network architectures." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.10" } }, "nbformat": 4, "nbformat_minor": 4 }