Microsoft Cognitive Toolkit (CNTK) – Getting Started

Here, we will understand the installation of CNTK on Windows and on Linux. Moreover, the chapter explains installing the CNTK package, steps to install Anaconda, CNTK files, directory structure and CNTK library organisation.

Prerequisites

In order to install CNTK, we must have Python installed on our computers. You can go to the link and select the latest version for your OS, i.e. Windows or Linux/Unix. For a basic tutorial on Python, you can refer to the link .

CNTK is supported on Windows as well as Linux, so we will walk through both of them.

Installing on Windows

In order to run CNTK on Windows, we will be using the Anaconda version of Python. We know that Anaconda is a redistribution of Python. It includes additional packages like SciPy and Scikit-learn which are used by CNTK to perform various useful calculations.

So, first let us see the steps to install Anaconda on your machine −

Step 1 − First download the setup files from the public website .

Step 2 − Once you have downloaded the setup files, start the installation and follow the instructions from the link .

Step 3 − Once installed, Anaconda will also install some other utilities, which will automatically include all the Anaconda executables in your computer's PATH variable. We can manage our Python environment from this prompt, install packages and run Python scripts.

Installing CNTK package

Once the Anaconda installation is done, you can use the most common way to install the CNTK package, through the pip executable, by using the following command −

pip install cntk

There are various other methods to install Cognitive Toolkit on your machine. Microsoft has a neat set of documentation that explains the other installation methods in detail. Please follow the link .

Installing on Linux

Installation of CNTK on Linux is a bit different from its installation on Windows. Here, for Linux we are going to use Anaconda to install CNTK, but instead of a graphical installer for Anaconda, we will be using a terminal-based installer on Linux. Although the installer will work with almost all Linux distributions, we limit the description to Ubuntu.

So, first let us see the steps to install Anaconda on your machine −

Steps to install Anaconda

Step 1 − Before installing Anaconda, make sure that the system is fully up to date. To check, first execute the following two commands inside a terminal −

sudo apt update
sudo apt upgrade

Step 2 − Once the computer is updated, get the URL from the public website for the latest Anaconda installation files.

Step 3 − Once the URL is copied, open a terminal window and execute the following command −

wget -O anaconda-installer.sh url

Replace the url placeholder with the URL copied from the Anaconda website.

Step 4 − Next, with the help of the following command, we can install Anaconda −

sh ./anaconda-installer.sh

The above command will by default install Anaconda3 inside our home directory.

Installing CNTK package

Once the Anaconda installation is done, you can use the most common way to install the CNTK package, through the pip executable, by using the following command −

pip install cntk

Examining CNTK files & directory structure

Once CNTK is installed as a Python package, we can examine its file and directory structure. It is at C:\Users\<username>\Anaconda3\Lib\site-packages\cntk, as shown below in the screenshot.

Verifying CNTK installation

Once CNTK is installed as a Python package, you should verify that CNTK has been installed correctly.
From the Anaconda command shell, start the Python interpreter by entering ipython. Then, import CNTK by entering the following command −

import cntk as c

Once imported, check its version with the help of the following command −

print(c.__version__)

The interpreter will respond with the installed CNTK version. If it does not respond, there is a problem with the installation.

The CNTK library organisation

Technically a Python package, CNTK is organised into 13 high-level sub-packages and 8 smaller sub-packages. The following table consists of the 10 most frequently used packages −

Sr.No − Package Name & Description

1 − cntk.io − Contains functions for reading data. For example: next_minibatch()

2 − cntk.layers − Contains high-level functions for creating neural networks. For example: Dense()

3 − cntk.learners − Contains functions for training. For example: sgd()

4 − cntk.losses − Contains functions to measure training error. For example: squared_error()

5 − cntk.metrics − Contains functions to measure model error. For example: classification_error()

6 − cntk.ops − Contains low-level functions for creating neural networks. For example: tanh()

7 − cntk.random − Contains functions to generate random numbers. For example: normal()

8 − cntk.train − Contains training functions. For example: train_minibatch()

9 − cntk.initializer − Contains model parameter initializers. For example: normal() and uniform()

10 − cntk.variables − Contains low-level constructs. For example: Parameter() and Variable()
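To see how a few of these sub-packages work together, the following is a minimal sketch (the layer size and input values are arbitrary assumptions, not part of the original tutorial) that builds a single Dense layer from cntk.layers with a tanh activation from cntk.ops and evaluates it on one example −

import numpy as np
import cntk as C

# A 4-dimensional input variable (one static axis of length 4)
features = C.input_variable(4)

# A fully-connected layer with 3 outputs and a tanh activation
model = C.layers.Dense(3, activation=C.ops.tanh)(features)

# Evaluate the (still untrained) layer on a single example
sample = np.array([[1.0, 2.0, 3.0, 4.0]], dtype=np.float32)
print(model.eval({features: sample}))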
CNTK – Sequence Classification

In this chapter, we will learn in detail about sequences in CNTK and their classification.

Tensors

The concept on which CNTK works is the tensor. Basically, CNTK inputs, outputs as well as parameters are organized as tensors, and a tensor is often thought of as a generalised matrix. Every tensor has a rank −

Tensor of rank 0 is a scalar.
Tensor of rank 1 is a vector.
Tensor of rank 2 is a matrix.

Here, these different dimensions are referred to as axes.

Static axes and Dynamic axes

As the name implies, static axes have the same length throughout the network's life. On the other hand, the length of dynamic axes can vary from instance to instance. In fact, their length is typically not known before each minibatch is presented. Dynamic axes are like static axes in that they also define a meaningful grouping of the numbers contained in the tensor.

Example

To make it clearer, let us see how a minibatch of short video clips is represented in CNTK. Suppose that the resolution of the video clips is 640 * 480 throughout. And, also the clips are shot in color, which is typically encoded with three channels. It further means that our minibatch has the following −

3 static axes of length 640, 480 and 3 respectively.
Two dynamic axes; the length of the video and the minibatch axes.

It means that a minibatch of 16 videos, each of which is 240 frames long, would be represented as a 16*240*3*640*480 tensor.

Working with sequences in CNTK

Let us understand sequences in CNTK by first learning about the Long-Short Term Memory Network.

Long-Short Term Memory Network (LSTM)

Long-short term memory (LSTM) networks were introduced by Hochreiter & Schmidhuber. They solved the problem of getting a basic recurrent layer to remember things for a long time. The architecture of an LSTM is given in the diagram above. As we can see, it has input neurons, memory cells, and output neurons. In order to combat the vanishing gradient problem, long-short term memory networks use an explicit memory cell (which stores the previous values) and the following gates −

Forget gate − As the name implies, it tells the memory cell to forget the previous values. The memory cell stores the values until the gate, i.e. the 'forget gate', tells it to forget them.

Input gate − As the name implies, it adds new stuff to the cell.

Output gate − As the name implies, the output gate decides when to pass along the vectors from the cell to the next hidden state.

It is very easy to work with sequences in CNTK.
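Before walking through the full training example below, the following minimal sketch (an illustrative assumption, not part of the original text) shows how a sequence input with a dynamic sequence axis is declared; the dimension 2000 simply matches the vocabulary size used in the example that follows −

import cntk as C

# A sparse sequence input: one static axis of length 2000, plus two dynamic
# axes - the implicit batch axis and the default sequence axis, whose length
# may vary from one sequence to the next.
features = C.sequence.input_variable(shape=2000, is_sparse=True)

# Inspect the dynamic axes attached to this variable
print(features.dynamic_axes)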
Let us see this with the help of the following example −

import sys
import os
from cntk import Trainer, Axis
from cntk.io import MinibatchSource, CTFDeserializer, StreamDef, StreamDefs, INFINITELY_REPEAT
from cntk.learners import sgd, learning_parameter_schedule_per_sample
from cntk import input_variable, cross_entropy_with_softmax, classification_error, sequence
from cntk.logging import ProgressPrinter
from cntk.layers import Sequential, Embedding, Recurrence, LSTM, Dense

def create_reader(path, is_training, input_dim, label_dim):
    return MinibatchSource(CTFDeserializer(path, StreamDefs(
        features=StreamDef(field="x", shape=input_dim, is_sparse=True),
        labels=StreamDef(field="y", shape=label_dim, is_sparse=False)
    )), randomize=is_training,
        max_sweeps=INFINITELY_REPEAT if is_training else 1)

def LSTM_sequence_classifier_net(input, num_output_classes, embedding_dim, LSTM_dim, cell_dim):
    lstm_classifier = Sequential([Embedding(embedding_dim),
                                  Recurrence(LSTM(LSTM_dim, cell_dim)),
                                  sequence.last,
                                  Dense(num_output_classes)])
    return lstm_classifier(input)

def train_sequence_classifier():
    input_dim = 2000
    cell_dim = 25
    hidden_dim = 25
    embedding_dim = 50
    num_output_classes = 5

    features = sequence.input_variable(shape=input_dim, is_sparse=True)
    label = input_variable(num_output_classes)

    classifier_output = LSTM_sequence_classifier_net(
        features, num_output_classes, embedding_dim, hidden_dim, cell_dim)

    ce = cross_entropy_with_softmax(classifier_output, label)
    pe = classification_error(classifier_output, label)

    rel_path = ("../../../Tests/EndToEndTests/Text/" +
                "SequenceClassification/Data/Train.ctf")
    path = os.path.join(os.path.dirname(os.path.abspath(__file__)), rel_path)

    reader = create_reader(path, True, input_dim, num_output_classes)

    input_map = {
        features: reader.streams.features,
        label: reader.streams.labels
    }

    lr_per_sample = learning_parameter_schedule_per_sample(0.0005)

    progress_printer = ProgressPrinter(0)
    trainer = Trainer(classifier_output, (ce, pe),
                      sgd(classifier_output.parameters, lr=lr_per_sample),
                      progress_printer)

    minibatch_size = 200

    for i in range(255):
        mb = reader.next_minibatch(minibatch_size, input_map=input_map)
        trainer.train_minibatch(mb)

    evaluation_average = float(trainer.previous_minibatch_evaluation_average)
    loss_average = float(trainer.previous_minibatch_loss_average)

    return evaluation_average, loss_average

if __name__ == "__main__":
    error, _ = train_sequence_classifier()
    print(" error: %f" % error)

Output −

 average    since    average    since    examples
 loss       last     metric     last
 ------------------------------------------------------
 1.61       1.61     0.886      0.886    44
 1.61       1.6      0.714      0.629    133
 1.6        1.59     0.56       0.448    316
 1.57       1.55     0.479      0.41     682
 1.53       1.5      0.464      0.449    1379
 1.46       1.4      0.453      0.441    2813
 1.37       1.28     0.45       0.447    5679
 1.3        1.23     0.448      0.447    11365
 error: 0.333333

The detailed explanation of the above program will be covered in the next sections, especially when we construct Recurrent Neural Networks.
Microsoft Cognitive Toolkit – Useful Resources

The following resources contain additional information on Microsoft Cognitive Toolkit. Please use them to get more in-depth knowledge on this.

Useful Links on Microsoft Cognitive Toolkit − Microsoft Cognitive Toolkit, its history and various other terms have been explained in simple language.
Time Series – Data Processing and Visualization

A Time Series is a sequence of observations indexed in equi-spaced time intervals. Hence, the order and continuity should be maintained in any time series.

The dataset we will be using is a multi-variate time series having hourly data for approximately one year, for air quality in a significantly polluted Italian city. The dataset can be downloaded from the link given below − .

It is necessary to make sure that −

The time series is equally spaced, and
There are no redundant values or gaps in it.

In case the time series is not continuous, we can upsample or downsample it.

Showing df.head()

In [122]:

import pandas

In [123]:

df = pandas.read_csv("AirQualityUCI.csv", sep = ";", decimal = ",")
df = df.iloc[ : , 0:14]

In [124]:

len(df)

Out[124]: 9471

In [125]:

df.head()

Out[125]:

For preprocessing the time series, we make sure there are no NaN (NULL) values in the dataset; if there are, we can replace them with either 0, the average, or the preceding or succeeding values. Replacing is preferred over dropping so that the continuity of the time series is maintained. However, in our dataset the last few values seem to be NULL and hence dropping will not affect the continuity.

Dropping NaN (Not-a-Number)

In [126]:

df.isna().sum()

Out[126]:

Date             114
Time             114
CO(GT)           114
PT08.S1(CO)      114
NMHC(GT)         114
C6H6(GT)         114
PT08.S2(NMHC)    114
NOx(GT)          114
PT08.S3(NOx)     114
NO2(GT)          114
PT08.S4(NO2)     114
PT08.S5(O3)      114
T                114
RH               114
dtype: int64

In [127]:

df = df[df["Date"].notnull()]

In [128]:

df.isna().sum()

Out[128]:

Date             0
Time             0
CO(GT)           0
PT08.S1(CO)      0
NMHC(GT)         0
C6H6(GT)         0
PT08.S2(NMHC)    0
NOx(GT)          0
PT08.S3(NOx)     0
NO2(GT)          0
PT08.S4(NO2)     0
PT08.S5(O3)      0
T                0
RH               0
dtype: int64

Time Series are usually plotted as line graphs against time. For that we will now combine the date and time columns and convert them from strings into a datetime object. This can be accomplished using the datetime library.

Converting to datetime object

In [129]:

df["DateTime"] = (df.Date) + " " + (df.Time)
print (type(df.DateTime[0]))

<class 'str'>

In [130]:

import datetime

df.DateTime = df.DateTime.apply(lambda x: datetime.datetime.strptime(x, "%d/%m/%Y %H.%M.%S"))
print (type(df.DateTime[0]))

<class 'pandas._libs.tslibs.timestamps.Timestamp'>

Let us see how some variables, like temperature, change with time.

Showing plots

In [131]:

df.index = df.DateTime

In [132]:

import matplotlib.pyplot as plt
plt.plot(df["T"])

Out[132]: [<matplotlib.lines.Line2D at 0x1eaad67f780>]

In [208]:

plt.plot(df["C6H6(GT)"])

Out[208]: [<matplotlib.lines.Line2D at 0x1eaaeedff28>]

Box plots are another useful kind of graph that allows you to condense a lot of information about a dataset into a single figure. A box plot shows the median, the 25% and 75% quartiles, and the outliers of one or multiple variables. In the case when the outliers are few and very distant from the rest of the data, we can eliminate them by setting them to the mean value or the 75% quartile value.
Showing Boxplots

In [134]:

plt.boxplot(df[["T","C6H6(GT)"]].values)

Out[134]:

{'whiskers': [<matplotlib.lines.Line2D at 0x1eaac16de80>,
   <matplotlib.lines.Line2D at 0x1eaac16d908>,
   <matplotlib.lines.Line2D at 0x1eaac177a58>,
   <matplotlib.lines.Line2D at 0x1eaac177cf8>],
 'caps': [<matplotlib.lines.Line2D at 0x1eaac16d2b0>,
   <matplotlib.lines.Line2D at 0x1eaac16d588>,
   <matplotlib.lines.Line2D at 0x1eaac1a69e8>,
   <matplotlib.lines.Line2D at 0x1eaac1a64a8>],
 'boxes': [<matplotlib.lines.Line2D at 0x1eaac16dc50>,
   <matplotlib.lines.Line2D at 0x1eaac1779b0>],
 'medians': [<matplotlib.lines.Line2D at 0x1eaac16d4a8>,
   <matplotlib.lines.Line2D at 0x1eaac1a6c50>],
 'fliers': [<matplotlib.lines.Line2D at 0x1eaac177dd8>,
   <matplotlib.lines.Line2D at 0x1eaac1a6c18>],
 'means': []}
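Building on the outlier discussion above, the following is a minimal sketch (not part of the original notebook; the column choice and the 1.5 * IQR rule are illustrative assumptions) of how extreme values could be replaced with the 75% quartile using pandas −

# Flag extreme outliers with the usual 1.5 * IQR rule (the same rule the
# box-plot whiskers use by default), then replace them with the 75% quartile.
q1, q3 = df["T"].quantile(0.25), df["T"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["T"] < q1 - 1.5 * iqr) | (df["T"] > q3 + 1.5 * iqr)

# Work on a copy so the original frame used later remains unchanged
df_capped = df.copy()
df_capped.loc[is_outlier, "T"] = q3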
Time Series – Variations of ARIMA

In the previous chapter, we saw how the ARIMA model works, and its limitation that it cannot handle seasonal data or multivariate time series; hence, new models were introduced to include these features. A glimpse of these new models is given here −

Vector Auto-Regression (VAR)

It is a generalized version of the auto-regression model for multivariate stationary time series. It is characterized by the 'p' parameter.

Vector Moving Average (VMA)

It is a generalized version of the moving-average model for multivariate stationary time series. It is characterized by the 'q' parameter.

Vector Auto Regression Moving Average (VARMA)

It is the combination of VAR and VMA and a generalized version of the ARMA model for multivariate stationary time series. It is characterized by the 'p' and 'q' parameters. Much like ARMA is capable of acting like an AR model by setting the 'q' parameter to 0 and as an MA model by setting the 'p' parameter to 0, VARMA is also capable of acting like a VAR model by setting the 'q' parameter to 0 and as a VMA model by setting the 'p' parameter to 0.

In [209]:

df_multi = df[["T", "C6H6(GT)"]]
split = len(df) - int(0.2*len(df))
train_multi, test_multi = df_multi[0:split], df_multi[split:]

In [211]:

from statsmodels.tsa.statespace.varmax import VARMAX

model = VARMAX(train_multi, order = (2,1))
model_fit = model.fit()

c:\users\naveksha\appdata\local\programs\python\python37\lib\site-packages\statsmodels\tsa\statespace\varmax.py:152:
   EstimationWarning: Estimation of VARMA(p,q) models is not generically robust, due especially to identification issues.
   EstimationWarning)
c:\users\naveksha\appdata\local\programs\python\python37\lib\site-packages\statsmodels\tsa\base\tsa_model.py:171:
   ValueWarning: No frequency information was provided, so inferred frequency H will be used.
   % freq, ValueWarning)
c:\users\naveksha\appdata\local\programs\python\python37\lib\site-packages\statsmodels\base\model.py:508:
   ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
   "Check mle_retvals", ConvergenceWarning)

In [213]:

predictions_multi = model_fit.forecast(steps=len(test_multi))

c:\users\naveksha\appdata\local\programs\python\python37\lib\site-packages\statsmodels\tsa\base\tsa_model.py:320:
   FutureWarning: Creating a DatetimeIndex by passing range endpoints is deprecated. Use `pandas.date_range` instead.
   freq = base_index.freq)
c:\users\naveksha\appdata\local\programs\python\python37\lib\site-packages\statsmodels\tsa\statespace\varmax.py:152:
   EstimationWarning: Estimation of VARMA(p,q) models is not generically robust, due especially to identification issues.
   EstimationWarning)

In [231]:

plt.plot(train_multi["T"])
plt.plot(test_multi["T"])
plt.plot(predictions_multi.iloc[:,0:1], "--")
plt.show()

plt.plot(train_multi["C6H6(GT)"])
plt.plot(test_multi["C6H6(GT)"])
plt.plot(predictions_multi.iloc[:,1:2], "--")
plt.show()

The above code shows how the VARMA model can be used to model a multivariate time series, although this model may not be best suited to our data.

VARMA with Exogenous Variables (VARMAX)

It is an extension of the VARMA model where extra variables called covariates are used to model the primary variable we are interested in.

Seasonal Auto Regressive Integrated Moving Average (SARIMA)

This is the extension of the ARIMA model to deal with seasonal data. It divides the data into seasonal and non-seasonal components and models them in a similar fashion.
It is characterized by 7 parameters: for the non-seasonal part the (p,d,q) parameters, the same as for the ARIMA model, and for the seasonal part the (P,D,Q,m) parameters, where 'm' is the number of seasonal periods and P,D,Q are similar to the parameters of the ARIMA model. These parameters can be calibrated using grid search or a genetic algorithm (a small grid-search sketch is given at the end of this chapter).

SARIMA with Exogenous Variables (SARIMAX)

This is the extension of the SARIMA model to include exogenous variables which help us to model the variable we are interested in. It may be useful to do a correlation analysis on variables before putting them in as exogenous variables.

In [251]:

from scipy.stats.stats import pearsonr

x = train_multi["T"].values
y = train_multi["C6H6(GT)"].values
corr, p = pearsonr(x, y)
print ("Corelation Coefficient =", corr, "\nP-Value =", p)

Corelation Coefficient = 0.9701173437269858
P-Value = 0.0

Pearson's Correlation shows a linear relation between 2 variables. To interpret the results, we first look at the p-value: if it is less than 0.05 then the value of the coefficient is significant, else it is not significant. For a significant p-value, a positive value of the correlation coefficient indicates positive correlation, and a negative value indicates a negative correlation.

Hence, for our data, 'temperature' and 'C6H6' seem to have a highly positive correlation. Therefore, we will use 'C6H6(GT)' as an exogenous variable while modelling the temperature 'T'.

In [297]:

from statsmodels.tsa.statespace.sarimax import SARIMAX

model = SARIMAX(x, exog = y, order = (2, 0, 2), seasonal_order = (2, 0, 1, 1),
   enforce_stationarity=False, enforce_invertibility = False)
model_fit = model.fit(disp = False)

c:\users\naveksha\appdata\local\programs\python\python37\lib\site-packages\statsmodels\base\model.py:508:
   ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
   "Check mle_retvals", ConvergenceWarning)

In [298]:

y_ = test_multi["C6H6(GT)"].values
predicted = model_fit.predict(exog=y_)
test_multi_ = pandas.DataFrame(test_multi)
test_multi_["predictions"] = predicted[0:1871]

In [299]:

plt.plot(train_multi["T"])
plt.plot(test_multi_["T"])
plt.plot(test_multi_.predictions, "--")

Out[299]:

[<matplotlib.lines.Line2D at 0x1eab0191c18>]

The predictions here seem to show larger variations, as opposed to univariate ARIMA modelling.

Needless to say, SARIMAX can be used as an ARX, MAX, ARMAX or ARIMAX model by setting only the corresponding parameters to non-zero values.

Fractional Auto Regressive Integrated Moving Average (FARIMA)

At times, it may happen that our series is not stationary, yet differencing with the 'd' parameter taking the value 1 may over-difference it. So, we need to difference the time series using a fractional value.

In the world of data science, there is no one superior model; the model that works on your data depends greatly on your dataset. Knowledge of various models allows us to choose one that works on our data and to experiment with that model to achieve the best results. Results should be examined as plots as well as error metrics; at times a small error may also be bad, hence, plotting and visualizing the results is essential.

In the next chapter, we will be looking at another statistical model, exponential smoothing.
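Before moving on, here is the small grid-search sketch referred to earlier for calibrating SARIMA parameters. It is an illustrative assumption, not part of the original notebook: the parameter ranges, the seasonal period m = 24 (one day of hourly data) and the AIC selection criterion are arbitrary choices −

from itertools import product
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Try a few (p,d,q)(P,D,Q,m) combinations and keep the one with the lowest AIC.
best_aic, best_order, best_seasonal = float("inf"), None, None
for p, q, P, Q in product(range(3), range(3), range(2), range(2)):
   try:
      fit = SARIMAX(x, exog = y, order = (p, 0, q),
                    seasonal_order = (P, 0, Q, 24),
                    enforce_stationarity = False,
                    enforce_invertibility = False).fit(disp = False)
   except Exception:
      continue
   if fit.aic < best_aic:
      best_aic, best_order, best_seasonal = fit.aic, (p, 0, q), (P, 0, Q, 24)

print(best_order, best_seasonal, best_aic)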
Time Series – Introduction

A time series is a sequence of observations over a certain period. A univariate time series consists of the values taken by a single variable at periodic time instances over a period, while a multivariate time series consists of the values taken by multiple variables at the same periodic time instances. The simplest example of a time series that all of us come across on a day-to-day basis is the change in temperature throughout the day, week, month or year.

The analysis of temporal data can give us useful insights into how a variable changes over time, or how it depends on the change in the values of other variables. This dependence of a variable on its previous values and/or on other variables can be analyzed for time series forecasting, and it has numerous applications in artificial intelligence.
Theano Tutorial

Theano is a Python library that lets you define the mathematical expressions used in Machine Learning, optimize these expressions and evaluate them very efficiently, making decisive use of GPUs in critical areas. It can rival typical full C implementations in most cases.

Audience

This tutorial is designed to help all those learners who are aiming to develop Deep Learning projects.

Prerequisites

Before you proceed with this tutorial, prior exposure to Python, NumPy, Neural Networks, and Deep Learning is necessary.
Theano – Introduction

Have you developed Machine Learning models in Python? Then, obviously you know the intricacies of developing these models. The development is typically a slow process taking hours and days of computational power.

Machine Learning model development requires a lot of mathematical computation. These computations generally involve arithmetic on large matrices of multiple dimensions. These days we use Neural Networks rather than the traditional statistical techniques for developing Machine Learning applications. Neural Networks need to be trained over a huge amount of data. The training is done in batches of data of reasonable size, so the learning process is iterative. Thus, if the computations are not done efficiently, training the network can take several hours or even days. Hence, optimization of the executable code is highly desired. And that is exactly what Theano provides.

Theano is a Python library that lets you define the mathematical expressions used in Machine Learning, optimize these expressions and evaluate them very efficiently, making decisive use of GPUs in critical areas. It can rival typical full C implementations in most cases.

Theano was written at the LISA lab with the intention of providing rapid development of efficient machine learning algorithms. It is released under a BSD license.

In this tutorial, you will learn to use the Theano library.
Theano – A Trivial Theano Expression

Let us begin our journey of Theano by defining and evaluating a trivial expression in Theano. Consider the following trivial expression that adds two scalars −

c = a + b

where a, b are variables and c is the expression output. In Theano, defining and evaluating even this trivial expression takes several steps. Let us understand the steps to evaluate the above expression.

Importing Theano

First, we need to import the Theano library in our program, which we do using the following statement −

from theano import *

Rather than importing the individual packages, we have used * in the above statement to include all packages from the Theano library.

Declaring Variables

Next, we will declare a variable called a using the following statement −

a = tensor.dscalar()

The dscalar method declares a double-precision (float64) scalar variable. The execution of the above statement creates a variable called a in your program code. Likewise, we will create variable b using the following statement −

b = tensor.dscalar()

Defining Expression

Next, we will define our expression that operates on these two variables a and b.

c = a + b

In Theano, the execution of the above statement does not perform the scalar addition of the two variables a and b; it only builds a symbolic expression that is evaluated later.

Defining Theano Function

To evaluate the above expression, we need to define a function in Theano as follows −

f = theano.function([a,b], c)

The function constructor takes two arguments: the first is a list of inputs to the function and the second is its output. The above declaration states that the inputs are the two variables a and b, and that the output is the scalar c. This function will be referenced with the variable name f in our further code.

Invoking Theano Function

The call to the function f is made using the following statement −

d = f(3.5, 5.5)

The inputs to the function are the two scalars 3.5 and 5.5. The output of execution is assigned to the variable d. To print the contents of d, we will use the print statement −

print (d)

The execution would cause the value of d to be printed on the console, which is 9.0 in this case.

Full Program Listing

The complete program listing is given here for your quick reference −

from theano import *
a = tensor.dscalar()
b = tensor.dscalar()
c = a + b
f = theano.function([a,b], c)
d = f(3.5, 5.5)
print (d)

Execute the above code and you will see the output as 9.0. The screenshot is shown here −

Now, let us discuss a slightly more complex example that computes the multiplication of two matrices.
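As a preview, the following listing is a minimal sketch of what such a matrix-multiplication expression might look like; it is an illustrative example added here, not taken from the original tutorial −

import numpy as np
from theano import tensor, function

# Declare two double-precision matrix variables
x = tensor.dmatrix('x')
y = tensor.dmatrix('y')

# Define the symbolic matrix product
z = tensor.dot(x, y)

# Compile the expression into a callable function and evaluate it
f = function([x, y], z)
print(f(np.array([[1.0, 2.0], [3.0, 4.0]]),
        np.array([[5.0, 6.0], [7.0, 8.0]])))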
Image Recognition using TensorFlow

TensorFlow includes a special feature for image recognition, and the images used here are stored in a specific folder. With relatively similar images, it will be easy to implement this logic for security purposes.

The folder structure of the image recognition code implementation is as shown below −

The dataset_image folder includes the related images, which need to be loaded. We will focus on image recognition with our logo defined in it. The images are loaded with the "load_data.py" script, which helps in keeping track of the various image recognition modules within them.

import pickle
from sklearn.model_selection import train_test_split
from scipy import misc
import numpy as np
import os

label = os.listdir("dataset_image")
label = label[1:]
dataset = []

for image_label in label:
   images = os.listdir("dataset_image/"+image_label)

   for image in images:
      img = misc.imread("dataset_image/"+image_label+"/"+image)
      img = misc.imresize(img, (64, 64))
      dataset.append((img, image_label))

X = []
Y = []

for input, image_label in dataset:
   X.append(input)
   Y.append(label.index(image_label))

X = np.array(X)
Y = np.array(Y)

X_train, y_train = X, Y

data_set = (X_train, y_train)

save_label = open("int_to_word_out.pickle", "wb")
pickle.dump(label, save_label)
save_label.close()

The training of the images helps in storing the recognizable patterns within the specified folder.

import numpy
import matplotlib.pyplot as plt
from keras.layers import Dropout
from keras.layers import Flatten
from keras.constraints import maxnorm
from keras.optimizers import SGD
from keras.layers import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
import load_data
from keras.models import Sequential
from keras.layers import Dense
import keras

K.set_image_dim_ordering("tf")

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)

# load data
(X_train, y_train) = load_data.data_set

# normalize inputs from 0-255 to 0.0-1.0
X_train = X_train.astype("float32")
#X_test = X_test.astype("float32")

X_train = X_train / 255.0
#X_test = X_test / 255.0

# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
#y_test = np_utils.to_categorical(y_test)
num_classes = y_train.shape[1]

# Create the model
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3),
   padding = "same", activation = "relu", kernel_constraint = maxnorm(3)))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3, 3), activation = "relu", padding = "same",
   kernel_constraint = maxnorm(3)))
model.add(MaxPooling2D(pool_size = (2, 2)))
model.add(Flatten())
model.add(Dense(512, activation = "relu", kernel_constraint = maxnorm(3)))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation = "softmax"))

# Compile model
epochs = 10
lrate = 0.01
decay = lrate/epochs
sgd = SGD(lr = lrate, momentum = 0.9, decay = decay, nesterov = False)
model.compile(loss = "categorical_crossentropy", optimizer = sgd, metrics = ["accuracy"])
print(model.summary())

#callbacks = [keras.callbacks.EarlyStopping(monitor = "val_loss",
#   min_delta = 0, patience = 0, verbose = 0, mode = "auto")]
callbacks = [keras.callbacks.TensorBoard(log_dir = "./logs", histogram_freq = 0,
   batch_size = 32, write_graph = True, write_grads = False, write_images = True,
   embeddings_freq = 0, embeddings_layer_names = None, embeddings_metadata = None)]

# Fit the model
model.fit(X_train, y_train, epochs = epochs,
   batch_size = 32, shuffle = True, callbacks = callbacks)

# Final evaluation of the model
scores = model.evaluate(X_train, y_train, verbose = 0)
print("Accuracy: %.2f%%" % (scores[1]*100))

# serialize model to JSON
model_json = model.to_json()
with open("model_face.json", "w") as json_file:
   json_file.write(model_json)

# serialize weights to HDF5
model.save_weights("model_face.h5")
print("Saved model to disk")

The above lines of code generate an output as shown below −
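Once the architecture and the weights have been serialized as above, the saved model can later be restored. The following listing is a minimal sketch of how that reload might look (an illustrative assumption, not part of the original example) −

from keras.models import model_from_json

# Recreate the architecture from the saved JSON description
with open("model_face.json", "r") as json_file:
   loaded_model = model_from_json(json_file.read())

# Restore the trained weights and compile before evaluating or predicting
loaded_model.load_weights("model_face.h5")
loaded_model.compile(loss = "categorical_crossentropy", optimizer = "sgd", metrics = ["accuracy"])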