Learn CNTK – Monitoring the Model

CNTK – Monitoring the Model

In this chapter, we will learn how to monitor a model in CNTK.

Introduction

In previous sections, we validated our NN models. But is it also necessary, and possible, to monitor a model during training? Yes. We have already used the ProgressPrinter class to monitor our model, and there are several other ways to do so. Before looking at them in detail, let's see how monitoring works in CNTK and how we can use it to detect problems in our NN model.

Callbacks in CNTK

During training and validation, CNTK allows us to specify callbacks in several places in the API. First, let's take a closer look at when CNTK invokes callbacks.

When does CNTK invoke callbacks?

CNTK invokes the callbacks at the following moments during training and testing −

A minibatch is completed during training.
A full sweep over the dataset is completed during training.
A minibatch of testing is completed.
A full sweep over the dataset is completed during testing.

Specifying callbacks

While working with CNTK, we can specify callbacks in several places in the API. For example −

When calling train on a loss function

Here, when we call train on a loss function, we can specify a set of callbacks through the callbacks argument as follows −

    training_summary = loss.train((x_train, y_train), parameter_learners=[learner], callbacks=[progress_writer], minibatch_size=16, max_epochs=15)

When working with minibatch sources or using a manual minibatch loop

In this case, we can specify callbacks for monitoring purposes while creating the Trainer as follows −

    from cntk.train import Trainer
    from cntk.logging import ProgressPrinter
    callbacks = [ProgressPrinter(0)]
    trainer = Trainer(z, (loss, metric), learner, callbacks)

Various monitoring tools

Let us study the different monitoring tools.

ProgressPrinter

ProgressPrinter is the most used monitoring tool in this tutorial.
Some of the characteristics of the ProgressPrinter monitoring tool are −

The ProgressPrinter class implements basic console-based logging to monitor our model.
It can also log to disk if we want it to.
It is especially useful in a distributed training scenario.
It is also very useful in a scenario where we can't log in on the console to see the output of our Python program.

With the following code, we can create an instance of ProgressPrinter −

    ProgressPrinter(0, log_to_file='test.txt')

The output looks like what we have seen in the earlier sections −

Test.txt

    CNTKCommandTrainInfo: train : 300
    CNTKCommandTrainInfo: CNTKNoMoreCommands_Total : 300
    CNTKCommandTrainBegin: train
    -------------------------------------------------------------------
     average      since    average      since      examples
        loss       last     metric       last
    ------------------------------------------------------
    Learning rate per minibatch: 0.1
        1.45       1.45     -0.189     -0.189            16
        1.24       1.13    -0.0382     0.0371            48
    [………]

TensorBoard

One of the disadvantages of ProgressPrinter is that it is hard to get a good view of how the loss and metric progress over time. TensorBoardProgressWriter is a great alternative to the ProgressPrinter class in CNTK.

Before using it, we first need to install TensorBoard with the following command −

    pip install tensorboard

Now, in order to use TensorBoard, we need to set up TensorBoardProgressWriter in our training code as follows −

    import time
    from cntk.logging import TensorBoardProgressWriter
    tensorbrd_writer = TensorBoardProgressWriter(log_dir='logs/{}'.format(time.time()), freq=1, model=z)

It is good practice to call the close method on the TensorBoardProgressWriter instance once the training of the NN model is done.

We can visualise the TensorBoard logging data with the following command −

    tensorboard --logdir logs
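
The four callback moments listed earlier in this chapter can be illustrated with a small sketch in plain Python. It does not use CNTK: the CountingWriter class, its on_minibatch and on_sweep methods and the train function are all hypothetical names invented for this illustration, and the "loss" is simply the mean of each minibatch. It only shows when a training loop would typically notify its callbacks.

```python
class CountingWriter:
    """Hypothetical progress callback (not a CNTK class)."""

    def __init__(self):
        self.minibatches = 0
        self.sweeps = 0
        self.history = []

    def on_minibatch(self, loss):
        # Called each time a minibatch is completed.
        self.minibatches += 1
        self.history.append(loss)

    def on_sweep(self, epoch):
        # Called each time a full sweep over the dataset is completed.
        self.sweeps += 1


def train(samples, minibatch_size, max_epochs, callbacks):
    """Toy training loop; the 'loss' is just the mean of the minibatch."""
    for epoch in range(max_epochs):
        for start in range(0, len(samples), minibatch_size):
            batch = samples[start:start + minibatch_size]
            loss = sum(batch) / len(batch)
            for callback in callbacks:
                callback.on_minibatch(loss)
        for callback in callbacks:
            callback.on_sweep(epoch)


writer = CountingWriter()
train([1.0, 2.0, 3.0, 4.0, 5.0], minibatch_size=2, max_epochs=3, callbacks=[writer])
print(writer.minibatches, writer.sweeps)
```

With five samples and a minibatch size of 2, every sweep consists of three minibatches, so the callback is notified after each minibatch and once more at the end of every sweep.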

Learn CNTK – Convolutional Neural Network

CNTK – Convolutional Neural Network

In this chapter, let us study how to construct a Convolutional Neural Network (CNN) in CNTK.

Introduction

Convolutional neural networks (CNNs) are also made up of neurons that have learnable weights and biases. In this respect, they are like ordinary neural networks (NNs).

If we recall how ordinary NNs work, every neuron receives one or more inputs, takes a weighted sum and passes it through an activation function to produce the final output. So, if CNNs and ordinary NNs have so many similarities, what makes the two networks different from each other?

The difference lies in the treatment of the input data and in the types of layers. An ordinary NN ignores the structure of the input data: all the data is converted into a 1-D array before it is fed into the network.

A CNN architecture, on the other hand, can take the 2-D structure of images into account, process them and extract the properties that are specific to images. Moreover, CNNs have one or more convolutional layers and pooling layers, which are their main building blocks.

These layers are followed by one or more fully connected layers, as in standard multilayer NNs. So, we can think of a CNN as a special case of fully connected networks.

Convolutional Neural Network (CNN) architecture

The architecture of a CNN is basically a list of layers that transforms a 3-dimensional input volume (the width, height and depth of the image) into a 3-dimensional output volume. One important point to note is that every neuron in the current layer is connected to a small patch of the output from the previous layer, which is like overlaying an N*N filter on the input image.

It uses M filters, which are basically feature extractors that extract features like edges, corners and so on.
Following are the layers [INPUT-CONV-RELU-POOL-FC] that are used to construct convolutional neural networks (CNNs) −

INPUT − As the name implies, this layer holds the raw pixel values, that is, the data of the image as it is. For example, INPUT [64×64×3] is a 3-channel RGB image of width 64, height 64 and depth 3.

CONV − This layer is one of the building blocks of CNNs, as most of the computation is done in this layer. For example, if we use 6 filters on the above-mentioned INPUT [64×64×3], this may result in the volume [64×64×6].

RELU − Also called the rectified linear unit layer, it applies an activation function to the output of the previous layer. In other words, RELU adds a non-linearity to the network.

POOL − The pooling layer is another building block of CNNs. Its main task is down-sampling, which means it operates independently on every slice of the input and resizes it spatially.

FC − This is the fully connected layer or, more specifically, the output layer. It is used to compute the output class scores, and the resulting output is a volume of size 1*1*L, where L is the number of class scores.

[Diagram: typical architecture of a CNN]

Creating CNN structure

We have seen the architecture and the basics of CNNs. Now we are going to build a convolutional network using CNTK. Here, we will first see how to put together the structure of the CNN and then look at how to train its parameters.

Finally, we'll see how we can improve the neural network by changing its structure with different layer setups. We are going to use the MNIST image dataset.

So, first let's create a CNN structure. Generally, when we build a CNN for recognising patterns in images, we do the following −

We use a combination of convolution and pooling layers.
We add one or more hidden layers at the end of the network.
Finally, we finish the network with a softmax layer for classification.
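
The volume sizes quoted for the CONV and POOL layers can be checked with a little arithmetic. The sketch below is plain Python, not CNTK; the helper names spatial_size, conv_layer and pool_layer are made up for this illustration, and the 5×5 filter and the 2×2 pooling window with stride 2 are assumed values. With padding enabled and a stride of 1, a convolution keeps the width and height and sets the depth to the number of filters; pooling shrinks the width and height and keeps the depth.

```python
def spatial_size(size, filter_size, stride, pad):
    """Output width or height of a convolution or pooling window."""
    if pad:
        # With padding, the output is ceil(size / stride).
        return -(-size // stride)
    # Without padding, the window has to fit entirely inside the input.
    return (size - filter_size) // stride + 1


def conv_layer(shape, num_filters, filter_size, stride=1, pad=True):
    """Shape (width, height, depth) produced by a convolution layer."""
    width, height, _depth = shape
    return (spatial_size(width, filter_size, stride, pad),
            spatial_size(height, filter_size, stride, pad),
            num_filters)


def pool_layer(shape, filter_size, stride):
    """Shape produced by a pooling layer; the depth is unchanged."""
    width, height, depth = shape
    return (spatial_size(width, filter_size, stride, pad=False),
            spatial_size(height, filter_size, stride, pad=False),
            depth)


input_shape = (64, 64, 3)                      # INPUT [64x64x3]
conv_shape = conv_layer(input_shape, num_filters=6, filter_size=5)
pool_shape = pool_layer(conv_shape, filter_size=2, stride=2)
print(conv_shape, pool_shape)
```

Under these assumptions, the convolution produces the [64×64×6] volume mentioned above, and the pooling layer halves the width and height.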
With the help of the following steps, we can build the network structure −

Step 1 − First, we need to import the required layers for the CNN.

    from cntk.layers import Convolution2D, Sequential, Dense, MaxPooling

Step 2 − Next, we need to import the activation functions for the CNN.

    from cntk.ops import log_softmax, relu

Step 3 − After that, in order to initialise the convolutional layers later, we need to import the glorot_uniform initializer as follows −

    from cntk.initializer import glorot_uniform

Step 4 − Next, to create input variables, import the input_variable function. Also import the default_options function, to make the configuration of the NN a bit easier.

    from cntk import input_variable, default_options

Step 5 − Now, to store the input images, create a new input_variable. It will contain three channels, namely red, green and blue, and have a size of 28 by 28 pixels.

    features = input_variable((3,28,28))

Step 6 − Next, we need to create another input_variable to store the labels to predict.

    labels = input_variable(10)

Step 7 − Now, we need to create the default_options for the NN, using glorot_uniform as the initialisation function.

    with default_options(initialization=glorot_uniform, activation=relu):

Step 8 − Next, in order to set the structure of the NN, we need to create a new Sequential layer set.

Step 9 − Now, within the Sequential layer set, we need to add a Convolution2D layer with a filter_shape of 5 and a strides setting of 1. Also, enable padding, so that the image is padded to retain the original dimensions.

        model = Sequential([
            Convolution2D(filter_shape=(5,5), strides=(1,1), num_filters=8, pad=True),

Step 10 − Now it's time to add a MaxPooling layer with a filter_shape of 2 and a strides setting of 2, to compress the image by half.

            MaxPooling(filter_shape=(2,2), strides=(2,2)),

Step 11 − Now, as we did in step 9, we need to add another Convolution2D layer with a filter_shape of 5 and a strides setting of 1, this time with 16 filters.
Also, enable padding, so that the size of the image produced by the previous pooling layer is retained.

            Convolution2D(filter_shape=(5,5), strides=(1,1), num_filters=16, pad=True),

Step 12 − Now, as we did in step 10, add another MaxPooling layer with a filter_shape of 3 and a

Learn CNTK – Recurrent Neural Network

CNTK – Recurrent Neural Network

Now, let us understand how to construct a Recurrent Neural Network (RNN) in CNTK.

Introduction

We learned how to classify images with a neural network, which is one of the iconic jobs in deep learning. Another area where neural networks excel, and where a lot of research is happening, is recurrent neural networks (RNNs). Here, we are going to learn what an RNN is and how it can be used in scenarios where we need to deal with time-series data.

What is Recurrent Neural Network?

Recurrent neural networks (RNNs) may be defined as a special breed of NNs that are capable of reasoning over time. RNNs are mainly used in scenarios where we need to deal with values that change over time, i.e. time-series data. To understand this better, let's make a small comparison between regular neural networks and recurrent neural networks −

In a regular neural network, we can provide only one input, which limits it to a single prediction. For example, we cannot do a text translation job with a regular neural network.

In a recurrent neural network, on the other hand, we can provide a sequence of samples that results in a single prediction. Going further, with RNNs we can also predict an output sequence based on an input sequence. For example, there have been quite a few successful experiments with RNNs in translation tasks.

Uses of Recurrent Neural Network

RNNs can be used in several ways. Some of them are as follows −

Predicting a single output

Before going into the steps by which an RNN can predict a single output based on a sequence, let's see what a basic RNN looks like −

[Diagram: a basic RNN with a loopback connection]

As we can see in the diagram, an RNN contains a loopback connection to the input, and whenever we feed it a sequence of values, it processes each element in the sequence as a time step.

Moreover, because of the loopback connection, the RNN can combine the generated output with the input for the next element in the sequence.
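
The loopback connection described above can be sketched in a few lines of plain Python. This is not CNTK code: the function names rnn_step and run_rnn, the scalar weights and the choice of tanh as the activation are all assumptions made for the illustration. The point is only that the hidden state produced at one time step is fed back in, together with the next element of the sequence.

```python
import math


def rnn_step(hidden, x, w_hidden, w_input):
    """Combine the previous hidden state with the current input element."""
    return math.tanh(w_hidden * hidden + w_input * x)


def run_rnn(sequence, w_hidden=0.5, w_input=1.0, w_output=2.0):
    """Process a sequence one time step at a time and predict one output."""
    hidden = 0.0  # initial hidden state (assumed to start at zero)
    for x in sequence:
        # The loopback: the new state depends on the old state and on x.
        hidden = rnn_step(hidden, x, w_hidden, w_input)
    # A third weight maps the final hidden state to the prediction.
    return w_output * hidden, hidden


prediction_a, _ = run_rnn([1.0, 0.0])
prediction_b, _ = run_rnn([0.0, 1.0])
print(prediction_a, prediction_b)
```

The two sequences contain the same values in a different order, yet they give different predictions, because the hidden state carries information about what came earlier in the sequence.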
In this way, the RNN builds a memory over the whole sequence, which can be used to make a prediction. To make a prediction with an RNN, we can perform the following steps −

First, to create an initial hidden state, we need to feed in the first element of the input sequence.

After that, to produce an updated hidden state, we need to take the initial hidden state and combine it with the second element in the input sequence.

Finally, to produce the final hidden state and to predict the output of the RNN, we need to take the final element in the input sequence.

In this way, with the help of the loopback connection, we can teach an RNN to recognise patterns that happen over time.

Predicting a sequence

The basic RNN model discussed above can be extended to other use cases as well. For example, we can use it to predict a sequence of values based on a single input. In this scenario, to make a prediction with an RNN, we can perform the following steps −

First, to create an initial hidden state and predict the first element in the output sequence, we need to feed an input sample into the neural network.

After that, to produce an updated hidden state and the second element in the output sequence, we need to combine the initial hidden state with the same sample.

Finally, to update the hidden state one more time and predict the final element in the output sequence, we feed in the sample another time.

Predicting sequences

We have seen how to predict a single value based on a sequence and how to predict a sequence based on a single value. Now let's see how we can predict sequences from sequences. In this scenario, to make a prediction with an RNN, we can perform the following steps −

First, to create an initial hidden state and predict the first element in the output sequence, we need to take the first element in the input sequence.

After that, to update the hidden state and predict the second element in the output sequence, we need to take the initial hidden state and combine it with the second element in the input sequence.
Finally, to predict the final element in the output sequence, we need to take the updated hidden state and the final element in the input sequence.

Working of RNN

To understand how recurrent neural networks (RNNs) work, we first need to understand how the recurrent layers in the network work. So, let's first discuss how we can predict the output with a standard recurrent layer.

Predicting output with standard RNN layer

As discussed earlier, a basic layer in an RNN is quite different from a regular layer in a neural network. In the previous section, we also showed the basic architecture of an RNN in a diagram. To update the hidden state for the first time step in the sequence, we can use the following formula −

[Formula: hidden state update for the first time step]

In the above equation, we calculate the new hidden state by calculating the dot product between the initial hidden state and a set of weights.

For the next step, the hidden state of the current time step is used as the initial hidden state for the next time step in the sequence. That's why, to update the hidden state for the second time step, we can repeat the calculations performed in the first time step as follows −

[Formula: hidden state update for the second time step]

Next, we can repeat the process of updating the hidden state for the third and final step in the sequence as below −

[Formula: hidden state update for the third time step]

When we have processed all the above steps in the sequence, we can calculate the output as follows −

[Formula: output calculation]

For the above formula, we have used a third set of weights and the hidden state from the final time step.

Advanced Recurrent Units

The main issue with the basic recurrent layer is the vanishing gradient problem, and due to this

Learn Microsoft Cognitive Toolkit – Quick Guide

Microsoft Cognitive Toolkit – Quick Guide

Microsoft Cognitive Toolkit (CNTK) – Introduction

In this chapter, we will learn what CNTK is, its features, the differences between its versions 1.0 and 2.0 and the important highlights of version 2.7.

What is Microsoft Cognitive Toolkit (CNTK)?

Microsoft Cognitive Toolkit (CNTK), formerly known as Computational Network Toolkit, is a free, easy-to-use, open-source, commercial-grade toolkit that enables us to train deep learning algorithms to learn like the human brain. It enables us to create some popular deep learning systems like feed-forward neural network time series prediction systems and Convolutional neural network (CNN) image classifiers.

For optimal performance, its framework functions are written in C++. Although we can call its functions using C++, the most commonly used approach is to use a Python program.

CNTK's Features

Following are some of the features and capabilities offered in the latest version of Microsoft CNTK:

Built-in components

CNTK has highly optimised built-in components that can handle multi-dimensional dense or sparse data from Python, C++ or BrainScript.
We can implement CNN, FNN, RNN, Batch Normalisation and Sequence-to-Sequence with attention.
It provides the functionality to add new user-defined core components on the GPU from Python.
It also provides automatic hyperparameter tuning.
We can implement Reinforcement learning, Generative Adversarial Networks (GANs), and Supervised as well as Unsupervised learning.
For massive datasets, CNTK has built-in optimised readers.

Usage of resources efficiently

CNTK provides parallelism with high accuracy on multiple GPUs/machines via 1-bit SGD.
To fit the largest models in GPU memory, it provides memory sharing and other built-in methods.

Express our own networks easily

CNTK has full APIs for defining your own network, learners, readers, training and evaluation from Python, C++, and BrainScript.
Using CNTK, we can easily evaluate models with Python, C++, C# or BrainScript.
It provides both high-level and low-level APIs.
Based on our data, it can automatically shape the inference.
It has fully optimised symbolic Recurrent Neural Network (RNN) loops.

Measuring model performance

CNTK provides various components to measure the performance of the neural networks you build.
It generates log data from your model and the associated optimiser, which we can use to monitor the training process.

Version 1.0 vs Version 2.0

The following table compares CNTK Version 1.0 and 2.0:

| Version 1.0 | Version 2.0 |
| It was released in 2016. | It is a significant rewrite of Version 1.0 and was released in June 2017. |
| It used a proprietary scripting language called BrainScript. | Its framework functions can be called using C++ and Python. We can easily load our modules in C# or Java. BrainScript is also supported by Version 2.0. |
| It runs on both Windows and Linux systems but not directly on Mac OS. | It also runs on both Windows (Win 8.1, Win 10, Server 2012 R2 and later) and Linux systems but not directly on Mac OS. |

Important Highlights of Version 2.7

Version 2.7 is the last main released version of Microsoft Cognitive Toolkit. Following are some important highlights of this last released version of CNTK.

Full support for ONNX 1.4.1.
Support for CUDA 10 for both Windows and Linux systems.
It supports advanced Recurrent Neural Network (RNN) loops in ONNX export.
It can export models larger than 2GB in ONNX format.
It supports FP16 in the training action of the BrainScript scripting language.

Microsoft Cognitive Toolkit (CNTK) – Getting Started

Here, we will look at the installation of CNTK on Windows and on Linux. Moreover, the chapter explains installing the CNTK package, the steps to install Anaconda, CNTK files, the directory structure and the CNTK library organisation.

Prerequisites

In order to install CNTK, we must have Python installed on our computers.
You can go to the link and select the latest version for your OS, i.e. Windows or Linux/Unix. For a basic tutorial on Python, you can refer to the link.

CNTK is supported on Windows as well as Linux, so we will walk through both of them.

Installing on Windows

In order to run CNTK on Windows, we will be using the Anaconda version of Python. Anaconda is a redistribution of Python. It includes additional packages like SciPy and scikit-learn, which are used by CNTK to perform various useful calculations.

So, first let's see the steps to install Anaconda on your machine −

Step 1 − First, download the setup files from the public website.

Step 2 − Once you have downloaded the setup files, start the installation and follow the instructions from the link.

Step 3 − Once installed, Anaconda will also install some other utilities, among them a prompt that automatically includes all the Anaconda executables in your computer's PATH variable. From this prompt, we can manage our Python environment, install packages and run Python scripts.

Installing CNTK package

Once the Anaconda installation is done, the most common way to install the CNTK package is through the pip executable, using the following command −

    pip install cntk

There are various other methods to install Cognitive Toolkit on your machine. Microsoft has a neat set of documentation that explains the other installation methods in detail. Please follow the link.

Installing on Linux

Installation of CNTK on Linux is a bit different from its installation on Windows. For Linux, we are also going to use Anaconda to install CNTK, but instead of a graphical installer for Anaconda, we will be using a terminal-based installer. Although the installer will work with almost all Linux distributions, we limit the description to Ubuntu.

So, first let's see the steps to install Anaconda on your machine −

Steps to install Anaconda

Step 1 − Before installing Anaconda, make sure that the system is fully up to date.
To do so, first execute the following two commands inside a terminal −

    sudo apt update
    sudo apt upgrade

Step 2 − Once the computer is updated, get the URL of the latest Anaconda installation files from the public website.

Step 3 − Once the URL is copied, open a terminal window and execute the following command −

    wget -O anaconda-installer.sh

Learn CNTK – Regression Model

CNTK – Regression Model

Here, we will study how to measure the performance of a regression model.

Basics of validating a regression model

Regression models are different from classification models, in the sense that there is no binary measure of right or wrong for individual samples. In regression models, we want to measure how close the prediction is to the actual value. The closer the predicted value is to the expected output, the better the model performs.

Here, we are going to measure the performance of an NN used for regression using different error-rate functions.

Calculating error margin

As discussed earlier, while validating a regression model, we can't say whether a prediction is right or wrong. We want our prediction to be as close as possible to the real value, but a small error margin is acceptable.

The formula for calculating the error margin is as follows −

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$$

Here,

Predicted value = $\hat{y}$ (y with a hat)
Real value = $y$

First, we calculate the distance between the predicted and the real value and square it. Then, to get an overall error rate, we sum these squared distances and calculate the average. This is called the mean squared error function.

But, if we want performance figures that express an error margin, we need a formula that expresses the absolute error. The formula for the mean absolute error function is as follows −

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$

The above formula takes the absolute distance between the predicted and the real value and averages it over all samples.

Using CNTK to measure regression performance

Here, we will look at how to use the metrics we discussed in combination with CNTK. We will use a regression model that predicts miles per gallon for cars, using the steps given below.
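
Before moving to the CNTK steps, the two error functions described above can be worked through on a handful of numbers. The sketch below is plain Python rather than CNTK, and the predicted and actual miles-per-gallon values are made up for the illustration.

```python
def mean_squared_error(predicted, actual):
    """Average of the squared distances between prediction and real value."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sum(squared) / len(squared)


def mean_absolute_error(predicted, actual):
    """Average of the absolute distances between prediction and real value."""
    absolute = [abs(p - a) for p, a in zip(predicted, actual)]
    return sum(absolute) / len(absolute)


# Made-up example values (miles per gallon).
predicted = [22.0, 30.0, 18.0]
actual = [24.0, 29.0, 15.0]

mse = mean_squared_error(predicted, actual)
mae = mean_absolute_error(predicted, actual)
print(mse, mae)
```

The distances are -2, 1 and 3, so the mean squared error is (4 + 1 + 9) / 3 and the mean absolute error is (2 + 1 + 3) / 3 = 2. The absolute error is expressed in the same unit as the target, which is why it reads more naturally as an error margin.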
Implementation steps −

Step 1 − First, we need to import the required components from the cntk package as follows −

    from cntk import default_options, input_variable
    from cntk.layers import Dense, Sequential
    from cntk.ops import relu

Step 2 − Next, we need to define a default activation function using the default_options function. Then, create a new Sequential layer set and provide two Dense layers with 64 neurons each. Finally, we add an additional Dense layer (which will act as the output layer) to the Sequential layer set and give it 1 neuron without an activation as follows −

    with default_options(activation=relu):
        model = Sequential([Dense(64), Dense(64), Dense(1, activation=None)])

Step 3 − Once the network has been created, we need to create an input feature. We need to make sure that it has the same shape as the features that we are going to use for training. Here, X is the scaled feature matrix that is prepared in step 7.

    features = input_variable(X.shape[1])

Step 4 − Now, we need to create another input_variable with size 1. It will be used to store the expected value for the NN.

    target = input_variable(1)
    z = model(features)

Now, we need to train the model. In order to do so, we are going to split the dataset and perform preprocessing using the following implementation steps −

Step 5 − First, import StandardScaler from sklearn.preprocessing. It standardises every feature to zero mean and unit variance, so that most values fall roughly between -1 and +1. This helps us against exploding gradient problems in the NN.

    from sklearn.preprocessing import StandardScaler

Step 6 − Next, import train_test_split from sklearn.model_selection as follows −

    from sklearn.model_selection import train_test_split

Step 7 − Drop the mpg column from the dataset by using the drop method.
Finally, split the dataset into a training and a validation set using the train_test_split function as follows −

    import numpy as np
    x = df_cars.drop(columns=['mpg']).values.astype(np.float32)
    y = df_cars.iloc[:, 0].values.reshape(-1, 1).astype(np.float32)
    scaler = StandardScaler()
    X = scaler.fit_transform(x)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

We have split and preprocessed the data. Now we need to train the NN. As we did in previous sections while creating the regression model, we need to define a combination of a loss and a metric function to train the model.

    import cntk
    from cntk.losses import squared_error

    def absolute_error(output, target):
        return cntk.ops.reduce_mean(cntk.ops.abs(output - target))

    @cntk.Function
    def criterion_factory(output, target):
        loss = squared_error(output, target)
        metric = absolute_error(output, target)
        return loss, metric

Now, let's have a look at how to train the model. For our model, we will use criterion_factory as the loss and metric combination.
    from cntk.losses import squared_error
    from cntk.learners import sgd
    from cntk.logging import ProgressPrinter
    progress_printer = ProgressPrinter(0)
    loss = criterion_factory(z, target)
    learner = sgd(z.parameters, 0.001)
    training_summary = loss.train((X_train, y_train), parameter_learners=[learner], callbacks=[progress_printer], minibatch_size=16, max_epochs=10)

Complete implementation example

    import numpy as np
    import cntk
    from cntk import default_options, input_variable
    from cntk.layers import Dense, Sequential
    from cntk.ops import relu
    from cntk.losses import squared_error
    from cntk.learners import sgd
    from cntk.logging import ProgressPrinter
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import train_test_split

    # df_cars is assumed to be the cars DataFrame, loaded beforehand
    x = df_cars.drop(columns=['mpg']).values.astype(np.float32)
    y = df_cars.iloc[:, 0].values.reshape(-1, 1).astype(np.float32)
    scaler = StandardScaler()
    X = scaler.fit_transform(x)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    # Build the network once the number of features is known
    with default_options(activation=relu):
        model = Sequential([Dense(64), Dense(64), Dense(1, activation=None)])
    features = input_variable(X.shape[1])
    target = input_variable(1)
    z = model(features)

    def absolute_error(output, target):
        return cntk.ops.reduce_mean(cntk.ops.abs(output - target))

    @cntk.Function
    def criterion_factory(output, target):
        loss = squared_error(output, target)
        metric = absolute_error(output, target)
        return loss, metric

    progress_printer = ProgressPrinter(0)
    loss = criterion_factory(z, target)
    learner = sgd(z.parameters, 0.001)
    training_summary = loss.train((X_train, y_train), parameter_learners=[learner], callbacks=[progress_printer], minibatch_size=16, max_epochs=10)

Output

    -------------------------------------------------------------------
     average      since    average      since      examples
        loss       last     metric       last
    ------------------------------------------------------
    Learning rate per minibatch: 0.001
         690        690       24.9       24.9            16
         654        636       24.1       23.7            48
    [………]

In order to validate our regression model, we need to make sure that the model handles new data just as well as it does the training data. For this, we need to invoke the test method on the loss and metric combination with the test data as follows −

    loss.test([X_test, y_test])

Output −

    {'metric': 1.89679785619, 'samples': 79}

Learn CNTK – Out-of-Memory Datasets

CNTK – Out-of-Memory Datasets

This chapter explains how to measure the performance of a model trained on out-of-memory datasets.

In previous sections, we discussed various methods to validate the performance of our NN, but those methods deal with datasets that fit in memory.

What about out-of-memory datasets? In a production scenario, we need a lot of data to train an NN. In this section, we are going to discuss how to measure performance when working with minibatch sources and with a manual minibatch loop.

Minibatch sources

While working with an out-of-memory dataset, i.e. with minibatch sources, we need a slightly different setup for the loss and the metric than the one we used for small, in-memory datasets. First, we will see how to set up a way to feed data to the trainer of the NN model.

Following are the implementation steps −

Step 1 − First, from the cntk.io module, import the components for creating the minibatch source as follows −

    from cntk.io import StreamDef, StreamDefs, MinibatchSource, CTFDeserializer, INFINITELY_REPEAT

Step 2 − Next, create a new function named, say, create_datasource. This function has two parameters, namely filename and limit, with a default value of INFINITELY_REPEAT for limit.

    def create_datasource(filename, limit=INFINITELY_REPEAT):

Step 3 − Now, within the function, use the StreamDef class to create a stream definition for the labels. It reads from the labels field, which has three values. We also need to set is_sparse to False as follows −

        labels_stream = StreamDef(field='labels', shape=3, is_sparse=False)

Step 4 − Next, to read the features field from the input file, create another instance of StreamDef as follows.

        features_stream = StreamDef(field='features', shape=4, is_sparse=False)

Step 5 − Now, initialise the CTFDeserializer instance. Specify the filename and the streams that we need to deserialize as follows −

        deserializer = CTFDeserializer(filename, StreamDefs(labels=labels_stream, features=features_stream))

Step 6 − Next, we need to create an instance of MinibatchSource by using the deserializer as follows −

        minibatch_source = MinibatchSource(deserializer, randomize=True, max_sweeps=limit)
        return minibatch_source

Step 7 − Finally, we need to provide the training and testing sources, which we also created in previous sections. We are using the iris flower dataset.

    training_source = create_datasource('Iris_train.ctf')
    test_source = create_datasource('Iris_test.ctf', limit=1)

Once we have created the MinibatchSource instances, we need to train the model with them. We can use the same training logic as when we worked with small in-memory datasets. Here, we use the MinibatchSource instance as the input for the training session as follows −

Following are the implementation steps −

Step 1 − In order to log the output of the training session, first import ProgressPrinter from the cntk.logging module as follows −

    from cntk.logging import ProgressPrinter

Step 2 − Next, to set up the training session, import Trainer and training_session from the cntk.train module as follows −

    from cntk.train import Trainer, training_session

Step 3 − Now, we need to define a set of constants like minibatch_size, samples_per_epoch and num_epochs as follows −

    minibatch_size = 16
    samples_per_epoch = 150
    num_epochs = 30
    max_samples = samples_per_epoch * num_epochs

Step 4 − Next, so that CNTK knows how to read the data during training, we need to define a mapping between the input variables of the network and the streams in the minibatch source.

    input_map = {
        features: training_source.streams.features,
        labels: training_source.streams.labels
    }

Step 5 − Next, to log the output of the training process, initialise the progress_writer variable with a new ProgressPrinter instance.
Also, initialize the trainer and provide it with the model as follows:

    progress_writer = ProgressPrinter(0)
    trainer = Trainer(z, loss, learner, progress_writer)

Here z, loss and learner are the model, the criterion and the learner of the Iris model. Their definitions are shown under Manual minibatch loop below.

Step 6: At last, to start the training process, we need to invoke the training_session function as follows:

    session = training_session(trainer,
        mb_source=training_source,
        mb_size=minibatch_size,
        model_inputs_to_streams=input_map,
        max_samples=max_samples,
        test_config=test_config)
    session.train()

To add validation to this setup, we use a TestConfig object and assign it to the test_config keyword argument of the training_session function, as done above. The object has to exist before training_session is called.

Following are the implementation steps:

Step 1: First, we need to import the TestConfig class from the cntk.train module as follows:

    from cntk.train import TestConfig

Step 2: Now, we need to create a new instance of TestConfig with test_source as input:

    test_config = TestConfig(test_source)

Complete Example

    from cntk.io import StreamDef, StreamDefs, MinibatchSource, CTFDeserializer, INFINITELY_REPEAT

    def create_datasource(filename, limit=INFINITELY_REPEAT):
        labels_stream = StreamDef(field='labels', shape=3, is_sparse=False)
        features_stream = StreamDef(field='features', shape=4, is_sparse=False)
        deserializer = CTFDeserializer(filename, StreamDefs(labels=labels_stream, features=features_stream))
        minibatch_source = MinibatchSource(deserializer, randomize=True, max_sweeps=limit)
        return minibatch_source

    training_source = create_datasource('Iris_train.ctf')
    test_source = create_datasource('Iris_test.ctf', limit=1)

    from cntk.logging import ProgressPrinter
    from cntk.train import Trainer, training_session, TestConfig

    minibatch_size = 16
    samples_per_epoch = 150
    num_epochs = 30
    max_samples = samples_per_epoch * num_epochs

    # features, labels, z, loss and learner come from the model definition
    # shown under "Manual minibatch loop" below.
    input_map = {
        features: training_source.streams.features,
        labels: training_source.streams.labels
    }
    progress_writer = ProgressPrinter(0)
    trainer = Trainer(z, loss, learner, progress_writer)

    # The test configuration must be created before the session that uses it.
    test_config = TestConfig(test_source)

    session = training_session(trainer,
        mb_source=training_source,
        mb_size=minibatch_size,
        model_inputs_to_streams=input_map,
        max_samples=max_samples,
        test_config=test_config)
    session.train()

Output

    -------------------------------------------------------------------
       average      since    average      since      examples
          loss       last     metric       last
    ------------------------------------------------------
    Learning rate per minibatch: 0.1
          1.57       1.57      0.214      0.214            16
          1.38       1.28      0.264      0.289            48
    [………]
    Finished Evaluation [1]: Minibatch[1-1]:metric = 69.65*30;

Manual minibatch loop

As we saw above, it is easy to measure the performance of our NN model during and after training by using the metrics when training with the regular APIs in CNTK. Things are not that easy, however, when working with a manual minibatch loop.

Here, we are using the model given below, with 4 inputs and 3 outputs for the Iris flower dataset, which we also created in previous sections:

    from cntk import default_options, input_variable
    from cntk.layers import Dense, Sequential
    from cntk.ops import log_softmax, relu, sigmoid
    from cntk.learners import sgd

    model = Sequential([
        Dense(4, activation=sigmoid),
        Dense(3, activation=log_softmax)
    ])
    features = input_variable(4)
    labels = input_variable(3)
    z = model(features)

Next, the criterion for the model is defined as the combination of the cross-entropy loss function and the F-measure metric, as used in previous sections. We are going to use the criterion_factory utility to create this as a CNTK function object, as shown below:

    import cntk
    from cntk.losses import cross_entropy_with_softmax, fmeasure

    @cntk.Function
    def criterion_factory(outputs, targets):
        loss = cross_entropy_with_softmax(outputs, targets)
        metric = fmeasure(outputs, targets, beta=1)
        return loss, metric

    loss = criterion_factory(z, labels)
    learner = sgd(z.parameters, 0.1)
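With a manual minibatch loop, nothing aggregates the metric for us, so the per-minibatch values have to be combined by hand. The following is a minimal sketch of that bookkeeping in plain Python, not CNTK code: iter_minibatches, RunningAverage and toy_accuracy are hypothetical helpers, and the sample data is made up. In a real CNTK loop, the per-minibatch value and sample count would presumably come from the trainer (for example its previous_minibatch_evaluation_average and previous_minibatch_sample_count properties) after each train_minibatch call.

```python
def iter_minibatches(samples, minibatch_size):
    """Yield consecutive slices of `samples`, each at most `minibatch_size` long."""
    for start in range(0, len(samples), minibatch_size):
        yield samples[start:start + minibatch_size]


class RunningAverage:
    """Sample-weighted running average of a per-minibatch value."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, minibatch_average, num_samples):
        # Weight each minibatch by its size so a short last batch does not skew the result.
        self.total += minibatch_average * num_samples
        self.count += num_samples

    @property
    def value(self):
        return self.total / self.count if self.count else 0.0


def toy_accuracy(minibatch):
    """Stand-in metric: fraction of (predicted, expected) pairs that match."""
    correct = sum(1 for predicted, expected in minibatch if predicted == expected)
    return correct / len(minibatch)


# 150 made-up (predicted, expected) pairs; every fifth prediction is wrong.
samples = [(0 if i % 5 else 1, 0) for i in range(150)]

average = RunningAverage()
for minibatch in iter_minibatches(samples, minibatch_size=16):
    average.update(toy_accuracy(minibatch), len(minibatch))

print("samples seen: %d, overall metric: %.3f" % (average.count, average.value))
```

Weighting by the sample count matters here because 150 samples do not divide evenly into minibatches of 16, so the last minibatch is smaller than the others.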


CNTK – Neural Network Regression

This chapter will help you understand neural network regression with CNTK.

Introduction

We use regression to predict a numeric value from one or more predictor variables. As an example, take predicting the median value of a house in each of, say, 100 towns. To do so, we have data that includes:

A crime statistic for each town.
The age of the houses in each town.
A measure of the distance from each town to a prime location.
The student-to-teacher ratio in each town.
A racial demographic statistic for each town.
The median house value in each town.

Based on the first five variables, the predictors, we would like to predict the median house value. For this we can create a linear regression model along the lines of:

    Y = a0 + a1(crime) + a2(house-age) + a3(distance) + a4(ratio) + a5(racial)

In the above equation:

Y is the predicted median value.
a0 is a constant.
a1 through a5 are constants associated with the five predictors discussed above.

An alternative approach is to use a neural network, which can produce a more accurate prediction model. Here, we will create a neural network regression model by using CNTK.

Loading Dataset

To implement neural network regression using CNTK, we will use the Boston area house values dataset. The dataset can be downloaded from the UCI Machine Learning Repository. It has 14 variables and 506 instances in total. For our implementation, however, we are going to use six of the 14 variables and 100 instances. Of the six variables, five serve as predictors and one as the value to predict. Of the 100 instances, we will use 80 for training and 20 for testing. The value we want to predict is the median house price in a town. Let's see the five predictors we will be using:

Crime per capita in the town: we would expect towns with smaller values of this predictor to have a higher median house value.
Proportion of owner-occupied units built before 1940: we would expect smaller values of this predictor to go with a higher median house value, because a larger value means older houses.
Weighted distance of the town to five Boston employment centers.
Area school pupil-to-teacher ratio.
An indirect metric of the proportion of black residents in the town.

Preparing training & test files

As we did before, we first need to convert the raw data into CNTK format. We are going to use the first 80 data items for training, so the tab-delimited CNTK format is as follows:

    |predictors 1.612820 96.90 3.76 21.00 248.31 |medval 13.50
    |predictors 0.064170 68.20 3.36 19.20 396.90 |medval 18.90
    |predictors 0.097440 61.40 3.38 19.20 377.56 |medval 20.00
    . . .

The next 20 items, also converted into CNTK format, will be used for testing.

Constructing Regression model

First, we need to process the data files in CNTK format. For that, we are going to use the helper function named create_reader as follows:

    import numpy as np
    import cntk as C

    def create_reader(path, input_dim, output_dim, rnd_order, sweeps):
        x_strm = C.io.StreamDef(field="predictors", shape=input_dim, is_sparse=False)
        y_strm = C.io.StreamDef(field="medval", shape=output_dim, is_sparse=False)
        streams = C.io.StreamDefs(x_src=x_strm, y_src=y_strm)
        deserial = C.io.CTFDeserializer(path, streams)
        mb_src = C.io.MinibatchSource(deserial, randomize=rnd_order, max_sweeps=sweeps)
        return mb_src

Next, we need to create a helper function that accepts a CNTK minibatch object and computes a custom accuracy metric. A prediction counts as correct when it lies within delta of the actual value.

    def mb_accuracy(mb, x_var, y_var, model, delta):
        num_correct = 0
        num_wrong = 0
        x_mat = mb[x_var].asarray()
        y_mat = mb[y_var].asarray()
        for i in range(mb[x_var].shape[0]):
            v = model.eval(x_mat[i])
            y = y_mat[i]
            if np.abs(v[0,0] - y[0,0]) < delta:
                num_correct += 1
            else:
                num_wrong += 1
        return (num_correct * 100.0) / (num_correct + num_wrong)

Now, we need to set the architecture arguments for our NN and also provide the location of the data files.
It can be done with the help of the following Python code:

    def main():
        print("Using CNTK version = " + str(C.__version__) + "\n")
        input_dim = 5
        hidden_dim = 20
        output_dim = 1
        train_file = ".\\…\\"  # provide the name of the training file (80 data items)
        test_file = ".\\…\\"   # provide the name of the test file (20 data items)

The snippets that follow use these local variables, so they belong to the body of main(). With the help of the following code, our program creates the untrained NN:

        X = C.ops.input_variable(input_dim, np.float32)
        Y = C.ops.input_variable(output_dim, np.float32)
        with C.layers.default_options(init=C.initializer.uniform(scale=0.01, seed=1)):
            hLayer = C.layers.Dense(hidden_dim, activation=C.ops.tanh, name="hidLayer")(X)
            oLayer = C.layers.Dense(output_dim, activation=None, name="outLayer")(hLayer)
        model = C.ops.alias(oLayer)

Once we have created the untrained model, we need to set up a learner algorithm object. We are going to use the SGD learner and the squared_error loss function:

        tr_loss = C.squared_error(model, Y)
        max_iter = 3000
        batch_size = 5
        base_learn_rate = 0.02
        sch = C.learning_parameter_schedule([base_learn_rate, base_learn_rate/2],
            minibatch_size=batch_size, epoch_size=int((max_iter*batch_size)/2))
        learner = C.sgd(model.parameters, sch)
        trainer = C.Trainer(model, (tr_loss), [learner])

Once we have finished with the learning algorithm object, we need to create a reader to read the training data:

        rdr = create_reader(train_file, input_dim, output_dim, rnd_order=True, sweeps=C.io.INFINITELY_REPEAT)
        boston_input_map = { X : rdr.streams.x_src, Y : rdr.streams.y_src }

Now, it's time to train our NN model:

        for i in range(0, max_iter):
            curr_batch = rdr.next_minibatch(batch_size, input_map=boston_input_map)
            trainer.train_minibatch(curr_batch)
            if i % int(max_iter/10) == 0:
                mcee = trainer.previous_minibatch_loss_average
                acc = mb_accuracy(curr_batch, X, Y, model, delta=3.00)
                print("batch %4d: mean squared error = %8.4f, accuracy = %5.2f%% " % (i, mcee, acc))

Once training is done, let's evaluate the model using the test data items:

        print("\nEvaluating test data \n")
        rdr = create_reader(test_file, input_dim, output_dim, rnd_order=False, sweeps=1)
        boston_input_map = { X : rdr.streams.x_src, Y : rdr.streams.y_src }
        num_test = 20
        all_test = rdr.next_minibatch(num_test, input_map=boston_input_map)
        acc = mb_accuracy(all_test, X, Y, model, delta=3.00)
        print("Prediction accuracy = %0.2f%%" % acc)

After evaluating the accuracy of our trained NN model, we will use it to make a prediction on unseen data:

        np.set_printoptions(precision=2, suppress=True)
        unknown = np.array([[0.09, 50.00, 4.5, 17.00, 350.00]], dtype=np.float32)
        print("\nPredicting median home value for feature/predictor values: ")
        print(unknown[0])
        pred_value = model.eval({X: unknown})
        print("\nPredicted value is: ")
        print("$%0.2f (x1000)" % pred_value[0,0])

Complete Regression Model

    import numpy as np
    import cntk as C

    def create_reader(path, input_dim, output_dim, rnd_order, sweeps):
        x_strm = C.io.StreamDef(field="predictors", shape=input_dim, is_sparse=False)
        y_strm = C.io.StreamDef(field="medval", shape=output_dim, is_sparse=False)
        streams = C.io.StreamDefs(x_src=x_strm, y_src=y_strm)
        deserial


CNTK – Classification Model

This chapter will help you understand how to measure the performance of a classification model in CNTK. Let us begin with the confusion matrix.

Confusion matrix

A confusion matrix, a table of the predicted output versus the expected output, is the easiest way to measure performance on a classification problem where the output can be of two or more classes.

To understand how it works, we are going to create a confusion matrix for a binary classification model that predicts whether a credit card transaction was normal or a fraud. It is shown as follows:

                        Actual fraud      Actual normal
    Predicted fraud     True positive     False positive
    Predicted normal    False negative    True negative

As we can see, the sample confusion matrix above contains 2 columns, one for the class fraud and one for the class normal. In the same way, it has 2 rows, one for the class fraud and one for the class normal. Following is the explanation of the terms associated with the confusion matrix:

True Positives (TP): both the actual class and the predicted class of the data point are 1.
True Negatives (TN): both the actual class and the predicted class of the data point are 0.
False Positives (FP): the actual class of the data point is 0 and the predicted class is 1.
False Negatives (FN): the actual class of the data point is 1 and the predicted class is 0.

Let's see how we can calculate a number of different metrics from the confusion matrix:

Accuracy: It is the fraction of the predictions made by our ML classification model that are correct. It can be calculated with the following formula:

    Accuracy = (TP + TN) / (TP + FP + FN + TN)

Precision: It tells us how many of the samples we predicted as positive were predicted correctly. It can be calculated with the following formula:

    Precision = TP / (TP + FP)

Recall or Sensitivity: Recall is the fraction of the actual positives returned by our ML classification model. In other words, it tells us how many of the fraud cases in the dataset were actually detected by the model.
It can be calculated with the following formula, where TP is the number of true positives and FN the number of false negatives:

    Recall = TP / (TP + FN)

Specificity: The counterpart of recall, it gives the fraction of the actual negatives returned by our ML classification model. It can be calculated with the following formula, where TN is the number of true negatives and FP the number of false positives:

    Specificity = TN / (TN + FP)

F-measure

We can use the F-measure as an alternative to the confusion matrix. The main reason is that we can't maximize recall and precision at the same time. There is a very strong relationship between these metrics, which can be understood with the help of the following example.

Suppose we want to use a DL model to classify cell samples as cancerous or normal. To reach maximum precision, we would reduce the number of positive predictions to 1. This can give us around 100 percent precision, but recall becomes really low. On the other hand, to reach maximum recall, we would make as many positive predictions as possible. This can give us around 100 percent recall, but precision becomes really low.

In practice, we need to find a balance between precision and recall. The F-measure metric allows us to do so, as it expresses a harmonic average of precision and recall:

    F = (1 + B^2) * Precision * Recall / (B^2 * Precision + Recall)

With the extra term B set to 1, this formula is called the F1-measure, which gives equal weight to precision and recall. To emphasize recall, we can set the factor B to 2. To emphasize precision, we can set the factor B to 0.5.

Using CNTK to measure classification performance

In a previous section, we created a classification model using the Iris flower dataset. Here, we will measure its performance by using the confusion matrix and the F-measure metric.

Creating Confusion matrix

We have already created the model, so we can start the validation process, which includes the confusion matrix, on it. First, we are going to create the confusion matrix with the help of the confusion_matrix function from scikit-learn.
For this, we need the real labels of our test samples and the predicted labels for the same test samples.

Let's calculate the confusion matrix by using the following Python code:

    import numpy as np
    from sklearn.metrics import confusion_matrix

    y_true = np.argmax(y_test, axis=1)
    y_pred = np.argmax(z(X_test), axis=1)
    matrix = confusion_matrix(y_true=y_true, y_pred=y_pred)
    print(matrix)

Output

    [[10  0  0]
     [ 0  1  9]
     [ 0  0 10]]

We can also use the heatmap function to visualise the confusion matrix as follows:

    import seaborn as sns
    import matplotlib.pyplot as plt

    # label_encoder is assumed to have been created when the dataset was prepared.
    g = sns.heatmap(matrix,
        annot=True,
        xticklabels=label_encoder.classes_.tolist(),
        yticklabels=label_encoder.classes_.tolist(),
        cmap="Blues")
    g.set_yticklabels(g.get_yticklabels(), rotation=0)
    plt.show()

We should also have a single performance number that we can use to compare models. For this, we calculate the classification error by using the classification_error function from the metrics package in CNTK, as we did while creating the classification model.

To calculate the classification error, execute the test method on the loss function with a dataset. CNTK takes the samples we provide as input for this function and makes predictions based on the input features X_test.

    loss.test([X_test, y_test])

Output

    {'metric': 0.36666666666, 'samples': 30}

Implementing F-Measures

For implementing F-measures, CNTK also includes a function called fmeasure. We can use this function while training the NN by replacing the call to cntk.metrics.classification_error with a call to cntk.losses.fmeasure when defining the criterion factory function, as follows:

    import cntk

    @cntk.Function
    def criterion_factory(output, target):
        loss = cntk.losses.cross_entropy_with_softmax(output, target)
        metric = cntk.losses.fmeasure(output, target)
        return loss, metric

After switching to the cntk.losses.fmeasure function, we get a different output for the loss.test method call, as follows:

    loss.test([X_test, y_test])

Output

    {'metric': 0.83101488749, 'samples': 30}
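The metrics described in this chapter can also be computed directly from the four cells of a binary confusion matrix. The following is a small sketch in plain Python, independent of CNTK and scikit-learn. The function name classification_metrics and the example counts are made up for illustration, and the formulas are the standard textbook definitions rather than anything specific to this tutorial.

```python
def classification_metrics(tp, fp, fn, tn, beta=1.0):
    """Compute common metrics from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # F-measure: weighted harmonic mean of precision and recall.
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    f_beta = (1 + beta_sq) * precision * recall / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_beta": f_beta,
    }


# Hypothetical fraud detector: 8 frauds caught, 2 false alarms,
# 4 frauds missed, 86 normal transactions correctly passed.
report = classification_metrics(tp=8, fp=2, fn=4, tn=86)
for name, value in report.items():
    print("%-12s %.4f" % (name, value))

# beta=2 weights recall more heavily, beta=0.5 weights precision more heavily.
recall_heavy = classification_metrics(tp=8, fp=2, fn=4, tn=86, beta=2.0)["f_beta"]
precision_heavy = classification_metrics(tp=8, fp=2, fn=4, tn=86, beta=0.5)["f_beta"]
```

In this example, precision (0.8) is higher than recall (about 0.67), so the score with beta=2 comes out lower than the score with beta=0.5.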


CNTK – Measuring Performance

This chapter explains how to measure model performance in CNTK.

Strategy to validate model performance

After building an ML model, we train it on a set of data samples. Through this training, the model learns and derives some general rules. The performance of the model matters when we feed it new samples, i.e. samples different from those provided at training time. The model behaves differently in that case, and it may be worse at making good predictions on those new samples.

The model must work well for new samples too, because in a production environment we will get different input than the sample data we used for training. That's the reason we should validate the ML model with a set of samples different from the samples we used for training. Here, we are going to discuss two different techniques for creating a dataset for validating a NN.

Hold-out dataset

This is one of the easiest methods for creating a dataset to validate a NN. As the name implies, in this method we hold back one set of samples from training (say 20%) and use it to test the performance of our ML model. The following diagram shows the ratio between training and validation samples.

The hold-out approach ensures that we have enough data to train our ML model and, at the same time, a reasonable number of samples to get a good measurement of the model's performance.

It is good practice to choose the samples for the training set and the test set at random from the main dataset. This ensures an even distribution between the training and test sets.

Following is an example in which we produce our own hold-out dataset by using the train_test_split function from the scikit-learn library.
Example

    from sklearn.datasets import load_iris
    iris = load_iris()
    X = iris.data
    y = iris.target

    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
    # test_size=0.2 means that 20% of the data is held out as test data.

    from sklearn.neighbors import KNeighborsClassifier
    from sklearn import metrics
    classifier_knn = KNeighborsClassifier(n_neighbors=3)
    classifier_knn.fit(X_train, y_train)
    y_pred = classifier_knn.predict(X_test)

    # Provide sample data; the model makes a prediction for each sample.
    sample = [[5, 5, 3, 2], [2, 4, 3, 5]]
    preds = classifier_knn.predict(sample)
    pred_species = [iris.target_names[p] for p in preds]
    print("Predictions:", pred_species)

Output

    Predictions: ['versicolor', 'virginica']

While using CNTK, we need to randomise the order of our dataset each time we train our model, because:

Deep learning algorithms are highly influenced by the random-number generators.
The order in which we provide the samples to the NN during training greatly affects its performance.

The major downside of the hold-out dataset technique is that it is unreliable: sometimes we get very good results and sometimes we get bad results.

K-fold cross validation

To make the measurement of our ML model more reliable, there is a technique called K-fold cross validation. In nature, the K-fold cross validation technique is the same as the previous technique, but it repeats the process several times, usually about 5 to 10 times. The following diagram represents its concept.

Working of K-fold cross validation

The working of K-fold cross validation can be understood with the help of the following steps:

Step 1: As in the hold-out dataset technique, we first need to split the dataset into a training set and a test set. Ideally, the ratio is 80-20, i.e. 80% for the training set and 20% for the test set.

Step 2: Next, we need to train our model using the training set.
Step 3: At last, we use the test set to measure the performance of our model.

The only difference between the hold-out dataset technique and the k-fold cross validation technique is that the above process is repeated, usually 5 to 10 times, and at the end the average is calculated over all the performance metrics. That average is the final performance metric.

Let us see an example with a small dataset:

Example

    from numpy import array
    from sklearn.model_selection import KFold

    data = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
    kfold = KFold(n_splits=5, shuffle=True, random_state=1)
    for train, test in kfold.split(data):
        print('train: %s, test: %s' % (data[train], data[test]))

Output

    train: [0.1 0.2 0.4 0.5 0.6 0.7 0.8 0.9], test: [0.3 1. ]
    train: [0.1 0.2 0.3 0.4 0.6 0.8 0.9 1. ], test: [0.5 0.7]
    train: [0.2 0.3 0.5 0.6 0.7 0.8 0.9 1. ], test: [0.1 0.4]
    train: [0.1 0.3 0.4 0.5 0.6 0.7 0.9 1. ], test: [0.2 0.8]
    train: [0.1 0.2 0.3 0.4 0.5 0.7 0.8 1. ], test: [0.6 0.9]

As we see, because it uses a more realistic training and test scenario, the k-fold cross validation technique gives us a much more stable performance measurement. On the downside, it takes a lot of time when validating deep learning models.

CNTK has no built-in support for k-fold cross validation, hence we need to write our own script to do so.

Detecting underfitting and overfitting

Whether we use the hold-out dataset or the k-fold cross validation technique, we will discover that the values of the metrics differ between the dataset used for training and the dataset used for validation.

Detecting overfitting

Overfitting is a situation where our ML model models the training data exceptionally well but fails to perform well on the testing data, i.e. it is not able to predict the test data. It happens when an ML model learns specific patterns and noise from the training data to such an extent that it negatively impacts the model's ability to generalise from the training data to new, i.e. unseen, data.
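Coming back to the k-fold procedure described earlier in this chapter: since the splitting and averaging have to be scripted by hand, the following sketch shows one way to do it with only the Python standard library. The names k_fold_indices, cross_validate and mean_baseline_error are hypothetical, and the "model" is a trivial stand-in that predicts the training mean. In a real script, the train_and_score callback would build, train and evaluate a fresh CNTK model for each fold.

```python
import random


def k_fold_indices(num_samples, k, seed=1):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)       # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


def cross_validate(data, k, train_and_score):
    """Average the score returned by train_and_score over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(data), k):
        train = [data[i] for i in train_idx]
        test = [data[i] for i in test_idx]
        scores.append(train_and_score(train, test))
    return sum(scores) / len(scores)


def mean_baseline_error(train, test):
    # Stand-in for "train a model, then evaluate it":
    # predict the training mean and report the mean absolute error.
    prediction = sum(train) / len(train)
    return sum(abs(value - prediction) for value in test) / len(test)


data = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
average_error = cross_validate(data, k=5, train_and_score=mean_baseline_error)
print("average error over 5 folds: %.3f" % average_error)
```

Every sample lands in exactly one test fold, so each sample is used for validation once and for training in the other rounds.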
In this context, noise is the irrelevant information or randomness in a dataset.

Following are the two ways in which we can detect whether our model is overfit or not:

The overfit model will perform well on the same samples we used for training, but it will perform very


CNTK – Neural Network Classification

In this chapter, we will study how to build a neural network classifier by using CNTK.

Introduction

Classification may be defined as the process of predicting categorical output labels or responses for the given input data. The categorised output, which is based on what the model has learned in the training phase, can have a form such as "Black" or "White", or "spam" or "no spam".

Mathematically, on the other hand, it is the task of approximating a mapping function, say f, from input variables, say X, to output variables, say Y.

A classic example of a classification problem is spam detection in e-mails. There can be only two categories of output, "spam" and "no spam".

To implement such a classification, we first need to train the classifier, using "spam" and "no spam" emails as the training data. Once the classifier has been trained successfully, it can be used to classify an unknown email.

Here, we are going to create a 4-5-3 NN using the iris flower dataset, having the following:

4 input nodes (one for each predictor value).
5 hidden processing nodes.
3 output nodes (because there are three possible species in the iris dataset).

Loading Dataset

We will be using the iris flower dataset, in which we want to classify species of iris flowers based on the physical properties of sepal width and length, and petal width and length. The dataset describes the physical properties of different varieties of iris flowers:

Sepal length
Sepal width
Petal length
Petal width
Class, i.e. iris setosa, iris versicolor or iris virginica

We have the iris.CSV file, which we also used in previous chapters. It can be loaded with the help of the Pandas library. But before loading it for our classifier, we need to prepare the training and test files so that they can be used easily with CNTK.

Preparing training & test files

The Iris dataset is one of the most popular datasets for ML projects.
It has 150 data items, and the raw data looks as follows:

    5.1 3.5 1.4 0.2 setosa
    4.9 3.0 1.4 0.2 setosa
    …
    7.0 3.2 4.7 1.4 versicolor
    6.4 3.2 4.5 1.5 versicolor
    …
    6.3 3.3 6.0 2.5 virginica
    5.8 2.7 5.1 1.9 virginica

As mentioned earlier, the first four values on each line describe the physical properties of the flower, i.e. sepal length, sepal width, petal length and petal width.

However, we have to convert the data into a format that CNTK can use easily, and that format is a .ctf file (we also created one, iris.ctf, in a previous section). It looks as follows:

    |attribs 5.1 3.5 1.4 0.2|species 1 0 0
    |attribs 4.9 3.0 1.4 0.2|species 1 0 0
    …
    |attribs 7.0 3.2 4.7 1.4|species 0 1 0
    |attribs 6.4 3.2 4.5 1.5|species 0 1 0
    …
    |attribs 6.3 3.3 6.0 2.5|species 0 0 1
    |attribs 5.8 2.7 5.1 1.9|species 0 0 1

In the above data, the |attribs tag marks the start of the feature values and the |species tag marks the class label values. We can use any other tag names we wish, and we can even add an item ID as well. For example, look at the following data:

    |ID 001 |attribs 5.1 3.5 1.4 0.2|species 1 0 0 |#setosa
    |ID 002 |attribs 4.9 3.0 1.4 0.2|species 1 0 0 |#setosa
    …
    |ID 051 |attribs 7.0 3.2 4.7 1.4|species 0 1 0 |#versicolor
    |ID 052 |attribs 6.4 3.2 4.5 1.5|species 0 1 0 |#versicolor
    …

There are 150 data items in total in the iris dataset. For this example, we will use the 80-20 hold-out rule, i.e. 80% of the data items (120 items) for training and the remaining 20% (30 items) for testing.
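The conversion from raw rows to the tagged format can be sketched in a few lines of plain Python. This is an illustrative helper, not part of CNTK: the name to_ctf_line and the SPECIES ordering are assumptions chosen to match the one-hot labels shown above, and the tab between the two fields is a choice made for this sketch.

```python
SPECIES = ["setosa", "versicolor", "virginica"]


def to_ctf_line(measurements, species):
    """Format one iris sample as a tagged line with a one-hot species label."""
    if species not in SPECIES:
        raise ValueError("unknown species: %r" % species)
    features = " ".join(str(value) for value in measurements)
    one_hot = " ".join("1" if name == species else "0" for name in SPECIES)
    return "|attribs %s\t|species %s" % (features, one_hot)


# Three rows taken from the raw data listing above.
raw_rows = [
    ([5.1, 3.5, 1.4, 0.2], "setosa"),
    ([7.0, 3.2, 4.7, 1.4], "versicolor"),
    ([6.3, 3.3, 6.0, 2.5], "virginica"),
]
ctf_lines = [to_ctf_line(values, species) for values, species in raw_rows]
for line in ctf_lines:
    print(line)
```

Writing the first 120 converted lines to one file and the remaining 30 to another would then give the training and test files, provided the rows are shuffled first so that both files contain all three species.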
Constructing Classification model

First, we need to process the data files in CNTK format. For that, we are going to use the helper function named create_reader as follows:

    import numpy as np
    import cntk as C

    def create_reader(path, input_dim, output_dim, rnd_order, sweeps):
        x_strm = C.io.StreamDef(field="attribs", shape=input_dim, is_sparse=False)
        y_strm = C.io.StreamDef(field="species", shape=output_dim, is_sparse=False)
        streams = C.io.StreamDefs(x_src=x_strm, y_src=y_strm)
        deserial = C.io.CTFDeserializer(path, streams)
        mb_src = C.io.MinibatchSource(deserial, randomize=rnd_order, max_sweeps=sweeps)
        return mb_src

Now, we need to set the architecture arguments for our NN and also provide the location of the data files. It can be done with the help of the following Python code:

    def main():
        print("Using CNTK version = " + str(C.__version__) + "\n")
        input_dim = 4
        hidden_dim = 5
        output_dim = 3
        train_file = ".\\…\\"  # provide the name of the training file (120 data items)
        test_file = ".\\…\\"   # provide the name of the test file (30 data items)

The snippets that follow use these local variables, so they belong to the body of main(). With the help of the following code, our program creates the untrained NN:

        X = C.ops.input_variable(input_dim, np.float32)
        Y = C.ops.input_variable(output_dim, np.float32)
        with C.layers.default_options(init=C.initializer.uniform(scale=0.01, seed=1)):
            hLayer = C.layers.Dense(hidden_dim, activation=C.ops.tanh, name="hidLayer")(X)
            oLayer = C.layers.Dense(output_dim, activation=None, name="outLayer")(hLayer)
        nnet = oLayer
        model = C.ops.softmax(nnet)

Once we have created the two untrained model objects, nnet and model, we need to set up a learner algorithm object and afterwards use it to create a Trainer object.
We are going to use the SGD learner and the cross_entropy_with_softmax loss function:

        tr_loss = C.cross_entropy_with_softmax(nnet, Y)
        tr_clas = C.classification_error(nnet, Y)
        max_iter = 2000
        batch_size = 10
        learn_rate = 0.01
        learner = C.sgd(nnet.parameters, learn_rate)
        trainer = C.Trainer(nnet, (tr_loss, tr_clas), [learner])

The learning algorithm can also be coded with a learning rate schedule, a momentum schedule and the fsadagrad learner, as follows. If both snippets are run in sequence, these assignments replace the SGD learner and trainer created above:

        max_iter = 2000
        batch_size = 10
        lr_schedule = C.learning_parameter_schedule_per_sample([(1000, 0.05), (1, 0.01)])
        mom_sch = C.momentum_schedule([(100, 0.99), (0, 0.95)], batch_size)
        learner = C.fsadagrad(nnet.parameters, lr=lr_schedule, momentum=mom_sch)
        trainer = C.Trainer(nnet, (tr_loss, tr_clas), [learner])

Once we have finished with the Trainer object, we need to create a reader to read the training data:

        rdr = create_reader(train_file, input_dim, output_dim, rnd_order=True, sweeps=C.io.INFINITELY_REPEAT)
        iris_input_map = { X : rdr.streams.x_src, Y : rdr.streams.y_src }

Now it's time to train our NN model:

        for i in range(0, max_iter):
            curr_batch = rdr.next_minibatch(batch_size, input_map=iris_input_map)
            trainer.train_minibatch(curr_batch)
            if i % 500 == 0:
                mcee = trainer.previous_minibatch_loss_average
                macc = (1.0 - trainer.previous_minibatch_evaluation_average) * 100
                print("batch %4d: mean loss = %0.4f, accuracy = %0.2f%% " % (i, mcee,