Python Deep Learning – Implementations

In this implementation of deep learning, our objective is to predict customer attrition (churn) for a certain bank, that is, which customers are likely to leave the bank's service. The dataset used is relatively small and contains 10000 rows with 14 columns. We are using the Anaconda distribution and frameworks like Theano, TensorFlow and Keras. Keras is built on top of TensorFlow and Theano, which function as its backends.

    # Artificial Neural Network
    # Installing Theano
    pip install --upgrade theano
    # Installing Tensorflow
    pip install --upgrade tensorflow
    # Installing Keras
    pip install --upgrade keras

Step 1: Data preprocessing

    # Importing the libraries
    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd

    # Importing the dataset
    dataset = pd.read_csv("Churn_Modelling.csv")

Step 2

We create matrices of the features of the dataset and the target variable, which is column 14, labeled "Exited". The last line of the cell displays the initial look of the data.

    X = dataset.iloc[:, 3:13].values
    y = dataset.iloc[:, 13].values
    X

Step 3

    y

Output

    array([1, 0, 1, ..., 1, 1, 0], dtype = int64)

Step 4

We make the analysis simpler by encoding string variables. We use the scikit-learn class LabelEncoder to automatically encode the different labels in a column as values between 0 and n_classes-1.

    from sklearn.preprocessing import LabelEncoder, OneHotEncoder
    labelencoder_X_1 = LabelEncoder()
    X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
    labelencoder_X_2 = LabelEncoder()
    X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])
    X

In the resulting output, country names are replaced by 0, 1 and 2, while male and female are replaced by 0 and 1.

Step 5: Labelling Encoded Data

We use the same scikit-learn library and another class, OneHotEncoder, to which we pass the column number in order to create dummy variables.
    onehotencoder = OneHotEncoder(categorical_features = [1])
    X = onehotencoder.fit_transform(X).toarray()
    X = X[:, 1:]
    X

The categorical_features argument exists only in scikit-learn releases before 0.22; later releases expect a ColumnTransformer for the same job. Now, the first 2 columns represent the country and the 4th column represents the gender.

We always divide our data into a training part and a testing part. We train our model on the training data and then check its accuracy on the testing data, which helps in evaluating how well the model performs.

Step 6

We use scikit-learn's train_test_split function to split our data into a training set and a test set. We keep the train-to-test split ratio at 80:20.

    # Splitting the dataset into the Training set and the Test Set
    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)

Some variables have values in the thousands while others have values in the tens or ones. We scale the data so that all features are on a comparable range.

Step 7

In this code, we fit and transform the training data using StandardScaler. The scaler is fitted on the training data only, and the same fitted scaler is then used to transform the test data.

    # Feature Scaling
    from sklearn.preprocessing import StandardScaler
    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)

The data is now scaled properly. We are done with our data pre-processing and can start with our model.

Step 8

We import the required modules here. We need the Sequential class for initializing the neural network and the Dense class to add the hidden layers.

    # Importing the Keras libraries and packages
    import keras
    from keras.models import Sequential
    from keras.layers import Dense

Step 9

We name the model classifier, as our aim is to classify customer churn. Then we use the Sequential class for initialization.

    # Initializing Neural Network
    classifier = Sequential()

Step 10

We add the hidden layers one by one using the Dense class. In the code below, we will see many arguments.
Our first parameter is units (called output_dim in older Keras versions). It is the number of nodes we add to this layer. kernel_initializer (init in older Keras versions) sets how the weights are initialized before training with stochastic gradient descent. In a neural network we assign a weight to each connection into a node. At initialization, weights should be close to zero, and we initialize them randomly using the uniform function. The input_dim parameter is needed only for the first layer, as the model does not otherwise know the number of our input variables. Here the total number of input variables is 11. In the second layer, the model automatically knows the number of input variables from the first hidden layer.

Execute the following line of code to add the input layer and the first hidden layer:

    classifier.add(Dense(units = 6, kernel_initializer = "uniform", activation = "relu", input_dim = 11))

Execute the following line of code to add the second hidden layer:

    classifier.add(Dense(units = 6, kernel_initializer = "uniform", activation = "relu"))

Execute the following line of code to add the output layer:

    classifier.add(Dense(units = 1, kernel_initializer = "uniform", activation = "sigmoid"))

Step 11: Compiling the ANN

We have added multiple layers to our classifier until now. We will now compile them using the compile method. The arguments passed at compilation control how the whole neural network is trained, so we need to be careful in this step.

Here is a brief explanation of the arguments.

The first argument is optimizer. This is the algorithm used to find the optimal set of weights. The algorithm is stochastic gradient descent (SGD), of which there are several variants; here we use the one called the Adam optimizer.

SGD depends on the loss, so our second parameter is loss. If our dependent variable is binary, we use the logarithmic loss function called binary_crossentropy, and if our dependent variable has more than two categories in the output, we use categorical_crossentropy.

We want to improve the performance of our neural network based on accuracy, so we set metrics to accuracy.
    # Compiling Neural Network
    classifier.compile(optimizer = "adam", loss = "binary_crossentropy", metrics = ["accuracy"])

Step 12: Fitting the ANN to the Training Set

Several lines of code are executed in this step.

We now train our model on the training data, using the fit method. Training optimizes the weights to improve the model, which means the weights have to be updated repeatedly. Batch size is the number of observations after which we update the weights.
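To make the role of batch size concrete, here is a minimal sketch in plain Python that counts how many weight updates happen during training. The 8000 training rows follow from the 80:20 split of 10000 rows; the batch size of 10 and the 100 epochs are hypothetical values chosen only for illustration, and the helper names are not part of Keras.

```python
import math


def updates_per_epoch(n_samples, batch_size):
    """Number of weight updates in one full pass over the training data."""
    return math.ceil(n_samples / batch_size)


def iterate_batches(n_samples, batch_size):
    """Yield (start, stop) row indices, one pair per mini-batch."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)


n_train = 8000    # 80% of the 10000 rows in the dataset
batch_size = 10   # hypothetical value, not taken from the tutorial
n_epochs = 100    # hypothetical value, not taken from the tutorial

per_epoch = updates_per_epoch(n_train, batch_size)
print(per_epoch)             # updates in one epoch
print(per_epoch * n_epochs)  # updates over the whole training run
```

In Keras, values like these are typically passed to the fit method through its batch_size and epochs arguments. A smaller batch size means more frequent, noisier updates; a larger one means fewer, smoother updates per epoch.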
Environment
In this chapter, we will learn about the environment setup for Python deep learning. We have to install the following software to build deep learning algorithms:

Python 2.7+
SciPy with NumPy
Matplotlib
Theano
Keras
TensorFlow

It is strongly recommended that Python, NumPy, SciPy, and Matplotlib are installed through the Anaconda distribution, which comes with all of those packages.

We need to ensure that the different types of software are installed properly.

Let us go to our command line program and type in the following command:

    $ python
    Python 3.6.3 |Anaconda custom (32-bit)| (default, Oct 13 2017, 14:21:34)
    [GCC 7.2.0] on linux

Next, we can import the required libraries and print their versions:

    import numpy
    print(numpy.__version__)

Output

    1.14.2

Installation of Theano, TensorFlow and Keras

Before we begin with the installation of the packages Theano, TensorFlow and Keras, we need to confirm that pip is installed. pip is Python's package installer and is included with Anaconda (Anaconda's own package manager is conda).

To confirm the installation of pip, type the following in the command line:

    $ pip

Once the installation of pip is confirmed, we can install Theano, TensorFlow and Keras by executing the following commands:

    $ pip install theano
    $ pip install tensorflow
    $ pip install keras

Confirm the installation of Theano by executing the following line of code:

    $ python -c "import theano; print(theano.__version__)"

Output

    1.0.1

Confirm the installation of TensorFlow by executing the following line of code:

    $ python -c "import tensorflow; print(tensorflow.__version__)"

Output

    1.7.0

Confirm the installation of Keras by executing the following line of code:

    $ python -c "import keras; print(keras.__version__)"

Output

    Using TensorFlow backend
    2.1.5
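The individual checks above can also be gathered into one small script. The following is a minimal sketch for Python 3; the function name report_versions is hypothetical, and it assumes each package exposes a __version__ attribute, which is a common convention rather than a guarantee.

```python
import importlib


def report_versions(package_names):
    """Map each package name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in package_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is not installed in this environment.
            versions[name] = None
        else:
            # Not every package defines __version__, so fall back to a marker.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions
```

Calling report_versions(["numpy", "theano", "tensorflow", "keras"]) and printing the result would list every package from this chapter at once, with None marking any package that still needs to be installed.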
Applications
Deep learning has produced good results for applications such as computer vision, language translation, image captioning, audio transcription, molecular biology, speech recognition, natural language processing, self-driving cars, brain tumour detection, real-time speech translation, music composition, automatic game playing and so on.

Deep learning is the next big leap after machine learning, with a more advanced implementation. It is heading towards becoming an industry standard and promises to be a game changer when dealing with raw unstructured data.

Deep learning is currently one of the best solution providers for a wide range of real-world problems. Developers are building AI programs that, instead of using previously given rules, learn from examples to solve complicated tasks. With deep learning being used by many data scientists, deeper neural networks are delivering results that are ever more accurate.

The idea is to develop deep neural networks by increasing the number of layers in each network; the machine learns more about the data until it is as accurate as possible. Developers can use deep learning techniques to implement complex machine learning tasks, and train AI networks to have high levels of perceptual recognition.

Deep learning is especially popular in computer vision. One of the tasks achieved here is image classification, where input images are classified as cat, dog, etc., or with the class or label that best describes the image. We as humans learn how to do this task very early in our lives and have the skills of quickly recognizing patterns, generalizing from prior knowledge, and adapting to different image environments.
Python Deep Learning – Home
Python is a general-purpose, high-level programming language that is widely used in data science and for producing deep learning algorithms.

This brief tutorial introduces Python and its libraries like NumPy, SciPy, Pandas and Matplotlib, and frameworks like Theano, TensorFlow and Keras. The tutorial explains how the different libraries and frameworks can be applied to solve complex real-world problems.

Audience

This tutorial has been prepared for professionals aspiring to learn the basics of Python and develop applications involving deep learning techniques such as convolutional neural nets, recurrent nets, back propagation, etc.

Prerequisites

Before you proceed with this tutorial, we assume that you have prior exposure to Python, NumPy, Pandas, SciPy, Matplotlib, Windows or any Linux distribution, and basic knowledge of linear algebra, calculus, statistics and basic machine learning techniques.
Artificial Neural Networks
The artificial neural network, or just neural network for short, is not a new idea. It has been around for about 80 years.

It was not until around 2011 that deep neural networks became popular, thanks to new techniques, the availability of huge datasets, and powerful computers.

A neural network mimics a neuron, which has dendrites, a nucleus, an axon, and a terminal axon.

For a network, we need at least two neurons. These neurons transfer information via the synapse between the dendrites of one and the terminal axon of another.

[Figure: a probable model of an artificial neuron]

[Figure: a neural network]

In the network figure, the circles are neurons or nodes, which apply their functions to the data, and the lines or edges connecting them carry the weights and the information being passed along.

Each column is a layer. The first layer, which receives your data, is the input layer. All the layers between the input layer and the output layer are the hidden layers.

If you have one or a few hidden layers, then you have a shallow neural network. If you have many hidden layers, then you have a deep neural network.

In this model, you have input data, you weight it, and you pass it through the function in the neuron that is called the threshold function or activation function.

Basically, the function sums all of the weighted values and compares the sum with a certain value. If the neuron fires a signal, the result is (1) out; if nothing is fired, the result is (0). That output is then weighted and passed along to the next neuron, and the same sort of function is run.

We can have a sigmoid (s-shaped) function as the activation function.

As for the weights, they are just random to start with, and they are unique per input into the node/neuron.

In a typical "feed forward" network, the most basic type of neural network, your information passes straight through the network you created, and you compare the output to what you hoped the output would have been using your sample data.
From here, you need to adjust the weights to help you get your output to match your desired output.

Sending data straight through a neural network is called feed forward, and a network that works this way is a feed forward neural network.

Our data goes from the input, through the layers in order, and then to the output. When we go backwards and begin adjusting weights to minimize the loss/cost, this is called back propagation.

This is an optimization problem. With a neural network, in real practice, we have to deal with hundreds of thousands of variables, or millions, or more.

The first solution was to use stochastic gradient descent as the optimization method. Now there are options like AdaGrad, the Adam optimizer and so on. Either way, this is a massive computational operation. That is why neural networks were mostly left on the shelf for over half a century. It was only very recently that we had the power and architecture in our machines to even consider doing these operations, and the properly sized datasets to match.

For simple classification tasks, the neural network is relatively close in performance to other simple algorithms like K Nearest Neighbors. The real utility of neural networks is realized when we have much larger data and much more complex questions, where they outperform other machine learning models.
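The behaviour of a single neuron described in this chapter can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names, the bias term and the example numbers are assumptions, not part of the tutorial.

```python
import math


def sigmoid(z):
    """S-shaped activation that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def weighted_sum(inputs, weights):
    """Each input is multiplied by its own weight and the products are added."""
    return sum(x * w for x, w in zip(inputs, weights))


def step_neuron(inputs, weights, threshold):
    """Threshold activation: fire (1) if the weighted sum reaches the threshold, else 0."""
    return 1 if weighted_sum(inputs, weights) >= threshold else 0


def sigmoid_neuron(inputs, weights, bias=0.0):
    """Sigmoid activation: a smooth alternative to the hard threshold."""
    return sigmoid(weighted_sum(inputs, weights) + bias)


# Hypothetical inputs and weights, chosen only for illustration.
print(step_neuron([1, 1], [0.6, 0.6], threshold=1.0))
print(step_neuron([1, 0], [0.6, 0.6], threshold=1.0))
print(sigmoid_neuron([0.0, 0.0], [0.5, -0.5]))
```

The output of one such neuron becomes an input to the neurons in the next layer, which is all that "feed forward" means at the level of a single node.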
Basic Machine Learning
Artificial Intelligence (AI) is any code, algorithm or technique that enables a computer to mimic human cognitive behaviour or intelligence. Machine Learning (ML) is a subset of AI that uses statistical methods to enable machines to learn and improve with experience. Deep Learning is a subset of Machine Learning, which makes the computation of multi-layer neural networks feasible. Machine Learning is seen as shallow learning while Deep Learning is seen as hierarchical learning with abstraction.

Machine learning deals with a wide range of concepts, including supervised learning, unsupervised learning, reinforcement learning, linear regression, cost functions, overfitting, under-fitting, hyper-parameters, etc.

In supervised learning, we learn to predict values from labelled data. One ML technique that helps here is classification, where the target values are discrete; for example, cats and dogs. Another technique that can help is regression, where the target values are continuous; for example, stock market data can be analysed using regression.

In unsupervised learning, we make inferences from input data that is not labelled or structured. If we have a million medical records and we have to make sense of them, find the underlying structure, or detect outliers and anomalies, we use a clustering technique to divide the data into broad clusters.

Data sets are divided into training sets, testing sets, validation sets and so on.

A breakthrough in 2012 brought the concept of Deep Learning into prominence. An algorithm classified 1 million images into 1000 categories successfully using 2 GPUs and latest technologies like Big Data.

Relating Deep Learning and Traditional Machine Learning

One of the major challenges encountered in traditional machine learning models is a process called feature extraction.
The programmer needs to be specific and tell the computer which features to look out for. These features will help in making decisions.

Entering raw data into the algorithm rarely works, so feature extraction is a critical part of the traditional machine learning workflow.

This places a huge responsibility on the programmer, and the algorithm's efficiency relies heavily on how inventive the programmer is. For complex problems such as object recognition or handwriting recognition, this is a huge issue.

Deep learning, with its ability to learn multiple layers of representation, is one of the few methods that has helped us with automatic feature extraction. The lower layers can be assumed to be performing automatic feature extraction, requiring little or no guidance from the programmer.
Introduction
Deep structured learning, hierarchical learning, or deep learning for short, is part of the family of machine learning methods, which are themselves a subset of the broader field of Artificial Intelligence.

Deep learning is a class of machine learning algorithms that use several layers of nonlinear processing units for feature extraction and transformation. Each successive layer uses the output from the previous layer as input.

Deep neural networks, deep belief networks and recurrent neural networks have been applied to fields such as computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, and bioinformatics, where they have produced results comparable to, and in some cases better than, those of human experts.

Deep learning algorithms and networks are based on the unsupervised learning of multiple levels of features or representations of the data, where higher-level features are derived from lower-level features to form a hierarchical representation. They also use some form of gradient descent for training.
Training a Neural Network
We will now learn how to train a neural network. We will also learn about the back propagation algorithm and the backward pass in Python deep learning.

We have to find the optimal values of the weights of a neural network to get the desired output. To train a neural network, we use the iterative gradient descent method. We start with a random initialization of the weights. After random initialization, we make predictions on some subset of the data with the forward-propagation process, compute the corresponding cost function C, and update each weight w by an amount proportional to dC/dw, i.e., the derivative of the cost function with respect to the weight. The proportionality constant is known as the learning rate.

The gradients can be calculated efficiently using the back-propagation algorithm. The key observation of backward propagation, or backward prop, is that because of the chain rule of differentiation, the gradient at each neuron in the neural network can be calculated using the gradients at the neurons it has outgoing edges to. Hence, we calculate the gradients backwards, i.e., first calculate the gradients of the output layer, then the top-most hidden layer, followed by the preceding hidden layer, and so on, ending at the input layer.

The back-propagation algorithm is implemented mostly using the idea of a computational graph, where each neuron is expanded into many nodes in the computational graph, each performing a simple mathematical operation like addition or multiplication. The computational graph does not have any weights on the edges; all weights are assigned to the nodes, so the weights become their own nodes. The backward propagation algorithm is then run on the computational graph. Once the calculation is complete, only the gradients of the weight nodes are required for the update. The rest of the gradients can be discarded.
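The update rule described above, in which each weight changes by an amount proportional to dC/dw, can be illustrated with a single weight. The sketch below is a toy example under stated assumptions: the quadratic cost C(w) = (w - 3)^2, the starting weight, the learning rate and the number of steps are all hypothetical, and a fixed starting value stands in for random initialization so that the run is reproducible.

```python
def cost(w):
    """Toy cost function with its minimum at w = 3 (an assumption for illustration)."""
    return (w - 3.0) ** 2


def cost_gradient(w):
    """Derivative dC/dw of the toy cost function."""
    return 2.0 * (w - 3.0)


def gradient_descent(w, learning_rate, n_steps):
    """Repeatedly move w against the gradient, scaled by the learning rate."""
    for _ in range(n_steps):
        w = w - learning_rate * cost_gradient(w)
    return w


w_start = 0.0  # stands in for a random initial weight
w_final = gradient_descent(w_start, learning_rate=0.1, n_steps=100)
print(w_final, cost(w_final))
```

In a real network the same update is applied to every weight at once, with the gradients supplied by back propagation rather than by a hand-written derivative.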
Gradient Descent Optimization Technique

One commonly used optimization function that adjusts weights according to the error they caused is called "gradient descent."

Gradient is another name for slope, and slope, on an x-y graph, represents how two variables are related to each other: the rise over the run, the change in distance over the change in time, etc. In this case, the slope is the ratio between the network's error and a single weight; i.e., how does the error change as the weight is varied.

To put it more precisely, we want to find which weight produces the least error. We want to find the weight that correctly represents the signals contained in the input data, and translates them to a correct classification.

As a neural network learns, it slowly adjusts many weights so that they can map signal to meaning correctly. The ratio between the network error and each of those weights is a derivative, dE/dw, that measures the extent to which a slight change in a weight causes a slight change in the error.

Each weight is just one factor in a deep network that involves many transforms; the signal of the weight passes through activations and sums over several layers, so we use the chain rule of calculus to work back through the network activations and outputs. This leads us to the weight in question, and its relationship to the overall error.

The two variables, error and weight, are mediated by a third variable, activation, through which the weight is passed. We can calculate how a change in weight affects a change in error by first calculating how a change in activation affects a change in error, and how a change in weight affects a change in activation.

The basic idea in deep learning is nothing more than that: adjusting a model's weights in response to the error it produces, until you cannot reduce the error any more.

The deep net trains slowly if the gradient value is small and fast if the value is high. Any inaccuracies in training lead to inaccurate outputs.
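The chain-rule step described above, dE/dw = dE/da * da/dw, can be checked numerically for a single weight. The sketch below assumes a sigmoid activation a = sigmoid(w * x) and a squared error E = 0.5 * (a - target)^2; these choices, like the function names and example numbers, are illustrative assumptions rather than anything fixed by the text.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def error(w, x, target):
    """Squared error of one sigmoid neuron with a single weight."""
    activation = sigmoid(w * x)
    return 0.5 * (activation - target) ** 2


def error_gradient(w, x, target):
    """dE/dw via the chain rule: (dE/da) * (da/dw)."""
    activation = sigmoid(w * x)
    d_error_d_activation = activation - target
    d_activation_d_weight = activation * (1.0 - activation) * x
    return d_error_d_activation * d_activation_d_weight


def numeric_gradient(w, x, target, eps=1e-6):
    """Central finite difference, used only to check the analytic result."""
    return (error(w + eps, x, target) - error(w - eps, x, target)) / (2 * eps)


# Hypothetical values chosen for illustration.
print(error_gradient(0.5, 2.0, 1.0))
print(numeric_gradient(0.5, 2.0, 1.0))
```

The two printed values should agree to several decimal places, which is a common sanity check when implementing back propagation by hand.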
The process of training the nets from the output back to the input is called back propagation or back prop. We know that forward propagation starts with the input and works forward. Back prop does the reverse, calculating the gradient from right to left.

Each time we calculate a gradient, we use all the previous gradients up to that point.

Let us start at a node in the output layer. The edge uses the gradient at that node. As we go back into the hidden layers, it gets more complex. The product of two numbers between 0 and 1 gives you a smaller number. The gradient value keeps getting smaller, and as a result back prop takes a lot of time to train and accuracy suffers.

Challenges in Deep Learning Algorithms

There are certain challenges for both shallow neural networks and deep neural networks, like overfitting and computation time. DNNs are affected by overfitting because of the added layers of abstraction, which allow them to model rare dependencies in the training data.

Regularization methods such as dropout, early stopping, data augmentation and transfer learning are applied during training to combat overfitting. Dropout regularization randomly omits units from the hidden layers during training, which helps in avoiding rare dependencies. DNNs take into consideration several training parameters such as the size, i.e., the number of layers and the number of units per layer, the learning rate and the initial weights. Finding optimal parameters is not always practical due to the high cost in time and computational resources. Several hacks such as batching can speed up computation. The large processing power of GPUs has significantly helped the training process, as the matrix and vector computations required are well-executed on GPUs.

Dropout

Dropout is a popular regularization technique for neural networks. Deep neural networks are particularly prone to overfitting.

Let us now see what dropout is and how it works.
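As a rough illustration of units being randomly omitted during training, here is a minimal sketch in plain Python. It uses the "inverted dropout" variant, in which surviving activations are scaled up so their expected sum is unchanged; that scaling, the function name and the example values are assumptions made for this sketch, not details given in the text.

```python
import random


def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; scale survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_probability = 1.0 - rate
    return [
        a / keep_probability if rng.random() >= rate else 0.0
        for a in activations
    ]


rng = random.Random(42)          # seeded so the run is reproducible
hidden = [1.0, 1.0, 1.0, 1.0]    # hypothetical hidden-layer activations
print(dropout(hidden, rate=0.5, rng=rng))
```

Dropout is applied only while training; at prediction time all units are kept, which is why the survivors are rescaled during training in this variant.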
In the words of Geoffrey Hinton, one of the pioneers of Deep Learning, 'If you have a deep neural net and it's not overfitting, you should probably be using a bigger one and
Libraries and Frameworks
In this chapter, we will relate deep learning to the different libraries and frameworks.

Deep learning and Theano

If we want to start coding a deep neural network, it is better to have an idea of how different frameworks like Theano, TensorFlow, Keras, PyTorch etc. work.

Theano is a Python library which provides a set of functions for building deep nets that train quickly on our machine.

Theano was developed at the University of Montreal, Canada, under the leadership of Yoshua Bengio, a deep net pioneer.

Theano lets us define and evaluate mathematical expressions with vectors and matrices, which are rectangular arrays of numbers.

Technically speaking, both neural nets and input data can be represented as matrices, and all standard net operations can be redefined as matrix operations. This is important since computers can carry out matrix operations very quickly.

We can process multiple matrix values in parallel, and if we build a neural net with this underlying structure, we can use a single machine with a GPU to train enormous nets in a reasonable time window.

However, if we use Theano, we have to build the deep net from the ground up. The library does not provide complete functionality for creating a specific type of deep net.

Instead, we have to code every aspect of the deep net like the model, the layers, the activation, the training method and any special methods to stop overfitting.

The good news, however, is that Theano lets us build our implementation on top of vectorized functions, providing us with a highly optimized solution.

There are many other libraries that extend the functionality of Theano. Keras, for example, can be used with Theano as its backend.

Deep Learning with TensorFlow

Google's TensorFlow is a Python library. This library is a great choice for building commercial-grade deep learning applications.

TensorFlow grew out of another library, DistBelief V2, that was a part of the Google Brain project.
This library aims to extend the portability of machine learning so that research models can be applied to commercial-grade applications.

Much like the Theano library, TensorFlow is based on computational graphs, where a node represents persistent data or a math operation and edges represent the flow of data between nodes. That data is a multidimensional array, or tensor; hence the name TensorFlow.

The output from an operation or a set of operations is fed as input into the next.

Even though TensorFlow was designed for neural networks, it works well for other nets where the computation can be modelled as a data flow graph.

TensorFlow also uses several features from Theano such as common sub-expression elimination, auto differentiation, and shared and symbolic variables.

Different types of deep nets can be built using TensorFlow, like convolutional nets, autoencoders, RNTN, RNN, RBM, DBM/MLP and so on.

However, there is no support for hyper parameter configuration in TensorFlow. For this functionality, we can use Keras.

Deep Learning and Keras

Keras is a powerful, easy-to-use Python library for developing and evaluating deep learning models.

It has a minimalist design that allows us to build a net layer by layer, train it, and run it.

It wraps the efficient numerical computation libraries Theano and TensorFlow and allows us to define and train neural network models in a few short lines of code.

It is a high-level neural network API, helping to make wide use of deep learning and artificial intelligence. It runs on top of a number of lower-level libraries including TensorFlow, Theano, and so on. Keras code is portable; we can implement a neural network in Keras using Theano or TensorFlow as a backend without any changes in code.
Deep Neural Networks
A deep neural network (DNN) is an ANN with multiple hidden layers between the input and output layers. Similar to shallow ANNs, DNNs can model complex non-linear relationships.

The main purpose of a neural network is to receive a set of inputs, perform progressively complex calculations on them, and give an output to solve real-world problems like classification. We restrict ourselves to feed forward neural networks.

We have an input, an output, and a flow of sequential data in a deep network.

Neural networks are widely used in supervised learning and reinforcement learning problems. These networks are based on a set of layers connected to each other.

In deep learning, the number of hidden layers, mostly non-linear, can be large; say about 1000 layers.

DL models produce much better results than normal ML networks.

We mostly use the gradient descent method for optimizing the network and minimising the loss function.

We can use ImageNet, a repository of millions of digital images, to classify a dataset into categories like cats and dogs. DL nets are increasingly used for dynamic images apart from static ones, and for time series and text analysis.

Training on the data sets forms an important part of deep learning models. In addition, backpropagation is the main algorithm in training DL models.

DL deals with training large neural networks with complex input-output transformations.

One example of DL is the mapping of a photo to the name of the person(s) in the photo, as done on social networks; describing a picture with a phrase is another recent application of DL.

Neural networks are functions that have inputs like x1, x2, x3, ... that are transformed to outputs like z1, z2, z3 and so on in two (shallow networks) or several intermediate operations, also called layers (deep networks).

The weights and biases change from layer to layer. 'w' and 'v' are the weights or synapses of layers of the neural networks.
The best use case of deep learning is the supervised learning problem. Here, we have a large set of data inputs with a desired set of outputs.

Here we apply the back propagation algorithm to get the correct output prediction.

The most basic data set of deep learning is MNIST, a dataset of handwritten digits.

We can train a deep Convolutional Neural Network with Keras to classify images of handwritten digits from this dataset.

The firing or activation of a neural net classifier produces a score. For example, to classify patients as sick or healthy, we consider parameters such as height, weight, body temperature, blood pressure etc.

A high score means the patient is sick and a low score means the patient is healthy.

Each node in the output and hidden layers has its own classifier. The input layer takes the inputs and passes its scores on to the next hidden layer for further activation, and this goes on until the output is reached.

This progress from input to output, from left to right in the forward direction, is called forward propagation.

The credit assignment path (CAP) in a neural network is the series of transformations starting from the input to the output. CAPs elaborate probable causal connections between the input and the output.

The CAP depth for a given feed forward neural network is the number of hidden layers plus one, as the output layer is included. For recurrent neural networks, where a signal may propagate through a layer several times, the CAP depth can be potentially limitless.

Deep Nets and Shallow Nets

There is no clear threshold of depth that divides shallow learning from deep learning; but it is mostly agreed that for deep learning, which has multiple non-linear layers, the CAP depth must be greater than two.

The basic node in a neural net is a perceptron, mimicking a neuron in a biological neural network. Then we have the multi-layer perceptron, or MLP. Each set of inputs is modified by a set of weights and biases; each edge has a unique weight and each node has a unique bias.
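Forward propagation through a small multi-layer perceptron, with one weight per edge and one bias per node, can be sketched as follows. The layer representation, the sigmoid activation, the function names and the example numbers are assumptions made for illustration; the CAP depth helper simply applies the "hidden layers plus one" rule stated above.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """weights[j][i] is the weight on the edge from input i to node j; biases[j] belongs to node j."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def forward_propagation(inputs, layers):
    """Pass the inputs through every (weights, biases) layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


def cap_depth(layers):
    """Hidden layers plus the output layer, i.e. the number of weighted layers."""
    return len(layers)


# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node.
network = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.1]),  # hidden layer
    ([[1.2, -0.7]], [0.05]),                   # output layer
]
score = forward_propagation([1.0, 2.0], network)
print(score, cap_depth(network))
```

The single number produced by the output node plays the role of the score described earlier, and training consists of adjusting the weights and biases in `network` so that this score moves towards the known correct value.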
The prediction accuracy of a neural net depends on its weights and biases.

The process of improving the accuracy of a neural network is called training. The output from a forward prop net is compared to the value which is known to be correct.

The cost function or loss function measures the difference between the generated output and the actual output.

The point of training is to make the cost as small as possible across millions of training examples. To do this, the network tweaks the weights and biases until the prediction matches the correct output.

Once trained well, a neural net has the potential to make an accurate prediction every time.

When the patterns get complex and you want your computer to recognise them, you have to go for neural networks. In such complex pattern scenarios, neural networks typically outperform competing algorithms.

There are now GPUs that can train them faster than ever before. Deep neural networks are already revolutionizing the field of AI.

Computers have proved to be good at performing repetitive calculations and following detailed instructions, but have not been so good at recognising complex patterns.

If the problem is the recognition of simple patterns, a support vector machine (SVM) or a logistic regression classifier can do the job well, but as the complexity of the pattern increases, there is little choice but to go for deep neural networks.

Therefore, for complex patterns like a human face, shallow neural networks fail and we have no alternative but to go for deep neural networks with more layers. Deep nets are able to do their job by breaking down complex patterns into simpler ones. For example, for a human face, a deep net would use edges to detect parts like lips, nose, eyes, ears and so on, and then re-combine these together to form a human face.

Prediction has become so accurate that recently, at a Google Pattern Recognition Challenge, a deep net beat a human.

This idea of a