Theano – Quick Guide

Theano – Introduction

If you have developed Machine Learning models in Python, you know the intricacies involved. Development is typically a slow process that takes hours or even days of computation. Model development requires a lot of mathematical computation, mostly arithmetic on large matrices of multiple dimensions.

These days we often use Neural Networks rather than traditional statistical techniques for developing Machine Learning applications. Neural Networks need to be trained over a huge amount of data. The training is done in batches of reasonable size, so the learning process is iterative. If the computations are not done efficiently, training the network can take several hours or even days, which makes optimization of the executable code highly desirable. That is exactly what Theano provides.

Theano is a Python library that lets you define mathematical expressions used in Machine Learning, optimize these expressions, and evaluate them very efficiently by using GPUs in the critical areas. It can rival typical full C implementations in most cases.

Theano was written at the LISA lab with the intention of providing rapid development of efficient machine learning algorithms. It is released under a BSD license.

In this tutorial, you will learn to use the Theano library.

Theano – Installation

Theano can be installed on Windows, MacOS, and Linux. The installation is straightforward in all cases. Before you install Theano, you must install its dependencies.
The following is the list of dependencies −

Python
NumPy − Required
SciPy − Required only for sparse matrices and special functions
BLAS − Provides standard building blocks for performing basic vector and matrix operations

The optional packages that you may choose to install depending on your needs are −

nose − To run Theano's test suite
Sphinx − For building documentation
Graphviz and pydot − To handle graphics and images
NVIDIA CUDA drivers − Required for GPU code generation/execution
libgpuarray − Required for GPU/CPU code generation on CUDA and OpenCL devices

We shall discuss the steps to install Theano on MacOS.

MacOS Installation

To install Theano and its dependencies, use pip from the command line as follows. These are the minimal dependencies that we need in this tutorial.

    $ pip install Theano
    $ pip install numpy
    $ pip install scipy
    $ pip install pydot

You also need to install the macOS command line developer tools using the following command −

    $ xcode-select --install

A dialog appears. Click the Install button to install the tools. On successful installation, you will see a success message on the console.

Testing the Installation

After the installation completes successfully, open a new Jupyter notebook (for example, from Anaconda). In a code cell, enter the following Python script −

Example

    import theano
    from theano import tensor

    a = tensor.dscalar()
    b = tensor.dscalar()
    c = a + b
    f = theano.function([a, b], c)
    d = f(1.5, 2.5)
    print(d)

Output

Execute the script and you should see the following output −

    4.0

If you get the above output, your Theano installation is successful. If not, follow the debug instructions on the Theano download page to fix the issues.

What is Theano?

Now that you have successfully installed Theano, let us first understand what Theano is. Theano is a Python library.
It lets you define, optimize, and evaluate mathematical expressions, especially those used in Machine Learning model development. Theano itself does not contain any pre-defined ML models; it just facilitates their development. It is especially useful when dealing with multi-dimensional arrays, and it integrates seamlessly with NumPy, a fundamental and widely used package for scientific computing in Python.

Theano facilitates defining mathematical expressions used in ML development. Such expressions generally involve matrix arithmetic, differentiation, gradient computation, and so on.

Theano first builds the entire computational graph for your model. It then compiles the graph into highly efficient code by applying several optimization techniques to it. The compiled code is injected into the Theano runtime by a special operation called function. We execute this function repeatedly to train a neural network. Training time is substantially reduced compared to pure Python code, and can be competitive with a full C implementation.

We shall now look at the process of Theano development, beginning with how to define a mathematical expression in Theano.

Theano – A Trivial Theano Expression

Let us begin by defining and evaluating a trivial expression in Theano. Consider the following expression that adds two scalars −

    c = a + b

where a and b are variables and c is the expression output. In Theano, defining and evaluating even this trivial expression takes several steps. Let us go through them.

Importing Theano

First, we import the Theano library in our program using the following statement −

    from theano import *

Rather than importing the individual packages, we have used * in the above statement to include all packages from the Theano library. Note that a star import does not bind the name theano itself, so also run import theano if you call theano.function as in the later steps.
Declaring Variables

Next, we declare a variable called a using the following statement −

    a = tensor.dscalar()

The dscalar method declares a double-precision (float64) scalar variable. The statement creates a symbolic variable called a in your program; it holds no value yet. Likewise, we create variable b using the following statement −

    b = tensor.dscalar()

Defining Expression

Next, we define our expression that operates on these two variables a and b.

    c = a + b

In Theano, executing the above statement does not perform the scalar addition of a and b. It only adds the addition to the computational graph.

Defining Theano Function

To evaluate the above expression, we need to define a function in Theano as follows −

    f = theano.function([a, b], c)

The function function takes two arguments, the first argument is an input to the function and the second one
Machine Learning – Deep Learning

In the world of artificial intelligence, machine learning and deep learning are two terms that are often used interchangeably. While both technologies are used to create intelligent systems, they are not the same thing. In this article, we explore the differences between machine learning and deep learning and how they are related.

We covered machine learning in the last section, so let's see what deep learning is.

What is Deep Learning?

Deep learning is a type of machine learning that uses neural networks to process complex data. In other words, deep learning is a process by which computers automatically learn patterns and relationships in data using multiple layers of interconnected nodes, or artificial neurons. Deep learning algorithms are designed to detect and learn from patterns in data to make predictions or decisions.

Deep learning is particularly well suited to tasks that involve processing complex data, such as image and speech recognition, natural language processing, and self-driving cars. Deep learning algorithms can process vast amounts of data and learn to recognize complex patterns and relationships in that data.

Examples of deep learning include facial recognition, voice recognition, and self-driving cars.

Machine Learning vs. Deep Learning

Now that we have a basic understanding of what machine learning and deep learning are, let's dive deeper into the differences between the two.

Firstly, machine learning is a broad category that encompasses many different types of algorithms, including deep learning. Deep learning is a specific type of machine learning algorithm that uses neural networks to process complex data.

Secondly, while machine learning algorithms are designed to learn from data and improve their accuracy over time, deep learning algorithms are designed to process complex data and recognize patterns and relationships in that data.
Deep learning algorithms are able to recognize complex patterns and relationships that other machine learning algorithms may not be able to detect. Thirdly, deep learning algorithms require a lot of data and processing power to train. Deep learning algorithms typically require large datasets and powerful hardware, such as graphics processing units (GPUs), to train effectively. Machine learning algorithms, on the other hand, can be trained on smaller datasets and less powerful hardware. Finally, deep learning algorithms can provide highly accurate predictions and decisions, but they can be more difficult to understand and interpret than other machine learning algorithms. Deep learning algorithms can process vast amounts of data and recognize complex patterns and relationships in that data, but it can be difficult to understand how the algorithm arrived at its conclusion.
TensorFlow – Forming Graphs

A partial differential equation (PDE) is a differential equation that involves partial derivatives of an unknown function of several independent variables. Using a PDE as the example, we will focus on building and running a new computation graph. The code in this chapter uses the TensorFlow 1.x API (sessions and placeholders).

Let us assume there is a square pond of dimension 500 * 500 −

    N = 500

Now, we will simulate the partial differential equation and plot the result. Consider the steps given below.

Step 1 − Import libraries for simulation.

    import tensorflow as tf
    import numpy as np
    import matplotlib.pyplot as plt

Step 2 − Include functions for transforming a 2D array into a convolution kernel and for a simplified 2D convolution operation.

    def make_kernel(a):
        """Transform a 2D array into a convolution kernel"""
        a = np.asarray(a)
        a = a.reshape(list(a.shape) + [1, 1])
        return tf.constant(a, dtype=tf.float32)

    def simple_conv(x, k):
        """A simplified 2D convolution operation"""
        x = tf.expand_dims(tf.expand_dims(x, 0), -1)
        y = tf.nn.depthwise_conv2d(x, k, [1, 1, 1, 1], padding='SAME')
        return y[0, :, :, 0]

    def laplace(x):
        """Compute the 2D laplacian of an array"""
        laplace_k = make_kernel([[0.5, 1.0, 0.5],
                                 [1.0, -6., 1.0],
                                 [0.5, 1.0, 0.5]])
        return simple_conv(x, laplace_k)

    sess = tf.InteractiveSession()

Step 3 − Set the initial conditions, define the update rules, and run the simulation for the chosen number of iterations, plotting the state along the way.

    N = 500

    # Initial Conditions -- some rain drops hit a pond
    # Set everything to zero
    u_init = np.zeros([N, N], dtype=np.float32)
    ut_init = np.zeros([N, N], dtype=np.float32)

    # Some rain drops hit a pond at random points
    for n in range(100):
        a, b = np.random.randint(0, N, 2)
        u_init[a, b] = np.random.uniform()

    plt.imshow(u_init)
    plt.show()

    # Parameters:
    # eps -- time resolution
    # damping -- wave damping
    eps = tf.placeholder(tf.float32, shape=())
    damping = tf.placeholder(tf.float32, shape=())

    # Create variables for simulation state
    U = tf.Variable(u_init)
    Ut = tf.Variable(ut_init)

    # Discretized PDE update rules
    U_ = U + eps * Ut
    Ut_ = Ut + eps * (laplace(U) - damping * Ut)

    # Operation to update the state
    step = tf.group(U.assign(U_), Ut.assign(Ut_))

    # Initialize state to initial conditions
    tf.global_variables_initializer().run()

    # Run 1000 steps of PDE
    for i in range(1000):
        # Step simulation
        step.run({eps: 0.03, damping: 0.04})
        # Visualize every 500 steps
        if i % 500 == 0:
            plt.imshow(U.eval())
            plt.show()

[Figure: plots of the pond surface produced by the simulation]
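The same discretized update rules can also be sketched without TensorFlow, which makes it easier to see what each step computes. The following is a minimal NumPy sketch, not the tutorial's implementation: the helper names laplace and step, the smaller 50 * 50 grid, and the fixed random seed are assumptions made for illustration. Zero padding at the border is used to mirror the 'SAME' padding of the convolution.

```python
import numpy as np

def laplace(u):
    """Apply the 3x3 stencil [[0.5, 1, 0.5], [1, -6, 1], [0.5, 1, 0.5]] with zero padding."""
    p = np.pad(u, 1)  # one cell of zeros on every side
    diagonal = p[:-2, :-2] + p[:-2, 2:] + p[2:, :-2] + p[2:, 2:]
    adjacent = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
    return 0.5 * diagonal + 1.0 * adjacent - 6.0 * u

def step(u, ut, eps=0.03, damping=0.04):
    """One update of the discretized PDE; returns the new (u, ut)."""
    u_new = u + eps * ut
    ut_new = ut + eps * (laplace(u) - damping * ut)
    return u_new, ut_new

# Hypothetical small pond (50 x 50) with a few random rain drops
rng = np.random.default_rng(0)
n = 50
u = np.zeros((n, n), dtype=np.float32)
ut = np.zeros_like(u)
for _ in range(10):
    a, b = rng.integers(0, n, size=2)
    u[a, b] = rng.uniform()

# Advance the simulation by 100 steps
for _ in range(100):
    u, ut = step(u, ut)
```

Both new values are computed from the old state before either is stored, which is what grouping the two assign operations achieves in the TensorFlow version.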
TensorFlow – CNN And RNN Difference

In this chapter, we will focus on the difference between CNN and RNN −

CNN − It is suitable for spatial data such as images.
RNN − It is suitable for temporal data, also called sequential data.

CNN − CNN is considered to be more powerful than RNN.
RNN − RNN includes less feature compatibility when compared to CNN.

CNN − This network takes fixed size inputs and generates fixed size outputs.
RNN − RNN can handle arbitrary input/output lengths.

CNN − CNN is a type of feed-forward artificial neural network with variations of multilayer perceptrons designed to use minimal amounts of preprocessing.
RNN − RNNs, unlike feed-forward neural networks, can use their internal memory to process arbitrary sequences of inputs.

CNN − CNNs use a connectivity pattern between the neurons. This is inspired by the organization of the animal visual cortex, whose individual neurons are arranged in such a way that they respond to overlapping regions tiling the visual field.
RNN − Recurrent neural networks use time-series information: what a user spoke last will impact what he/she will speak next.

CNN − CNNs are ideal for image and video processing.
RNN − RNNs are ideal for text and speech analysis.

[Figure: schematic representation of CNN and RNN]
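To make the "arbitrary input/output lengths" point of the comparison concrete, here is a minimal NumPy sketch of a recurrent cell. It is an illustration only, with randomly chosen weights and hypothetical names (W_in, W_h, rnn_final_state); it is not a trained model and does not use TensorFlow.

```python
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(3, 4))  # input (3 features) -> hidden (4 units)
W_h = rng.normal(size=(4, 4))   # hidden -> hidden: the internal memory

def rnn_final_state(sequence):
    """Run a plain tanh recurrent cell over a sequence of any length."""
    h = np.zeros(4)
    for x_t in sequence:
        # The same weights are reused at every time step
        h = np.tanh(x_t @ W_in + h @ W_h)
    return h

short = rng.normal(size=(2, 3))  # 2 time steps
long = rng.normal(size=(7, 3))   # 7 time steps

state_short = rnn_final_state(short)
state_long = rnn_final_state(long)
```

Both calls return a state of the same size even though the sequences differ in length, because the loop simply runs for as many steps as the input has.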
TensorFlow – XOR Implementation

In this chapter, we will learn about the XOR implementation using TensorFlow. Before starting, let us see the XOR table values. The same operation underlies the XOR cipher's encryption and decryption process.

A   B   A XOR B
0   0   0
0   1   1
1   0   1
1   1   0

The XOR cipher is an encryption method in which an attacker without the key has to resort to brute force, i.e., generating candidate keys until one matches. The idea is to define an XOR encryption key and then XOR each character of the string to be encrypted with this key.

The code below does not implement the cipher. It trains a small neural network with one hidden layer to reproduce the XOR table above, using the TensorFlow 1.x API −

    # Declaring necessary modules
    import tensorflow as tf
    import numpy as np

    """
    A simple TensorFlow implementation of a XOR gate to understand the backpropagation algorithm
    """

    x = tf.placeholder(tf.float64, shape=[4, 2], name="x")  # placeholder for input x
    y = tf.placeholder(tf.float64, shape=[4, 1], name="y")  # placeholder for desired output y

    m = np.shape(x)[0]  # number of training examples
    n = np.shape(x)[1]  # number of features
    hidden_s = 2        # number of nodes in the hidden layer
    l_r = 1             # learning rate initialization

    theta1 = tf.cast(tf.Variable(tf.random_normal([3, hidden_s]), name="theta1"), tf.float64)
    theta2 = tf.cast(tf.Variable(tf.random_normal([hidden_s + 1, 1]), name="theta2"), tf.float64)

    # conducting forward propagation: a column of ones (bias) is prepended to the input
    a1 = tf.concat([np.c_[np.ones(x.shape[0])], x], 1)
    # the weights of the first layer are multiplied by the input of the first layer
    z1 = tf.matmul(a1, theta1)
    # the input of the second layer is the output of the first layer, passed through
    # the activation function, with a column of biases added
    a2 = tf.concat([np.c_[np.ones(x.shape[0])], tf.sigmoid(z1)], 1)
    # the input of the second layer is multiplied by the weights
    z3 = tf.matmul(a2, theta2)
    # the output is passed through the activation function to obtain the final probability
    h3 = tf.sigmoid(z3)
    cost_func = -tf.reduce_sum(y * tf.log(h3) + (1 - y) * tf.log(1 - h3), axis=1)

    # built-in TensorFlow optimizer that conducts gradient descent using the
    # specified learning rate to obtain theta values
    optimiser = tf.train.GradientDescentOptimizer(learning_rate=l_r).minimize(cost_func)

    # setting required X and Y values to perform XOR operation
    X = [[0, 0], [0, 1], [1, 0], [1, 1]]
    Y = [[0], [1], [1], [0]]

    # initializing all variables, creating a session and running a TensorFlow session
    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)

    # running gradient descent for each iteration and printing the hypothesis
    # obtained using the updated theta values
    for i in range(100000):
        sess.run(optimiser, feed_dict={x: X, y: Y})  # setting placeholder values using feed_dict
        if i % 100 == 0:
            print("Epoch:", i)
            print("Hyp:", sess.run(h3, feed_dict={x: X, y: Y}))

Every 100 iterations the code prints the epoch number and the current hypothesis for the four inputs.

[Screenshot: output of the XOR training run]
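To see what the trained network is expected to compute, the forward pass can be reproduced in NumPy with weights chosen by hand instead of learned. This is a sketch under stated assumptions: the specific values in theta1 and theta2 below are illustrative (one hidden unit acts roughly like OR, the other like AND) and are not the values gradient descent would find. The layout follows the code above, with a column of ones prepended for the bias.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Hand-picked weights (assumption, for illustration only).
# Rows: bias, x1, x2. Columns: hidden unit 1 (about OR), hidden unit 2 (about AND).
theta1 = np.array([[-10.0, -30.0],
                   [20.0, 20.0],
                   [20.0, 20.0]])
# Rows: bias, hidden unit 1, hidden unit 2. Output is about "OR and not AND".
theta2 = np.array([[-10.0],
                   [20.0],
                   [-20.0]])

def forward(X, theta1, theta2):
    """Forward propagation with a bias column prepended at each layer."""
    a1 = np.c_[np.ones(len(X)), X]
    a2 = np.c_[np.ones(len(X)), sigmoid(a1 @ theta1)]
    return sigmoid(a2 @ theta2)

h3 = forward(X, theta1, theta2)
predictions = np.round(h3).astype(int)
```

Rounding the hypothesis gives the XOR column of the truth table, which is the behaviour the training loop is trying to reach.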
TensorFlow – Optimizers

Optimizers are classes that extend a common base class with the additional information needed to train a specific model. The optimizer class is initialized with the given parameters, and it is important to remember that no Tensor is needed to construct it. Optimizers are used to improve the speed and performance of training a specific model.

The basic optimizer of TensorFlow is −

    tf.train.Optimizer

This class is defined in tensorflow/python/training/optimizer.py.

Following are some optimizers in TensorFlow −

Stochastic Gradient descent
Stochastic Gradient descent with gradient clipping
Momentum
Nesterov momentum
Adagrad
Adadelta
RMSProp
Adam
Adamax
SMORMS3

We will focus on Stochastic Gradient descent. An illustration of creating an optimizer for it is given below −

    import numpy as np
    import tensorflow as tf

    def sgd(cost, params, lr=np.float32(0.01)):
        # gradient of the cost with respect to each parameter
        g_params = tf.gradients(cost, params)
        updates = []
        for param, g_param in zip(params, g_params):
            # move each parameter against its gradient
            updates.append(param.assign(param - lr * g_param))
        return updates

The function takes the cost, the parameters to update, and the learning rate, and returns one assignment operation per parameter. In the subsequent chapter, we will focus on Gradient Descent Optimization with an implementation of optimizers.
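The update rule itself (each parameter minus the learning rate times its gradient) does not depend on TensorFlow. The following NumPy sketch applies it to a simple quadratic cost whose gradient is known in closed form. The function name sgd_step, the cost, and the target values are assumptions for illustration, and the gradient here is computed on the full cost rather than on random mini-batches.

```python
import numpy as np

def sgd_step(params, grads, lr=0.01):
    """Return updated parameters: param - lr * gradient, for each parameter."""
    return [p - lr * g for p, g in zip(params, grads)]

# Hypothetical cost: sum((w - target) ** 2), whose gradient is 2 * (w - target)
target = np.array([3.0, -1.0])
params = [np.zeros(2)]

for _ in range(500):
    grads = [2.0 * (params[0] - target)]
    params = sgd_step(params, grads, lr=0.05)
```

After the loop, params[0] is very close to target, the minimizer of the cost.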
TensorFlow – Gradient Descent Optimization

Gradient descent optimization is considered to be an important concept in data science. Consider the steps shown below to understand the implementation of gradient descent optimization −

Step 1

Include the necessary module and declare the variable x, the function log(x)^2 to be minimized, and the gradient descent optimizer.

    import tensorflow as tf

    x = tf.Variable(2, name='x', dtype=tf.float32)
    log_x = tf.log(x)
    log_x_squared = tf.square(log_x)

    optimizer = tf.train.GradientDescentOptimizer(0.5)
    train = optimizer.minimize(log_x_squared)

Step 2

Initialize the variables, then define and call a function that runs the optimizer.

    init = tf.global_variables_initializer()

    def optimize():
        with tf.Session() as session:
            session.run(init)
            print("starting at", "x:", session.run(x), "log(x)^2:", session.run(log_x_squared))
            for step in range(10):
                session.run(train)
                print("step", step, "x:", session.run(x), "log(x)^2:", session.run(log_x_squared))

    optimize()

[Screenshot: output of the optimization run]

The output shows the values of x and log(x)^2 at the start and after each of the 10 steps.
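The same minimization can be followed by hand, because the gradient of log(x)^2 is 2 * log(x) / x. The sketch below repeats the 10 steps in plain Python with the same starting point (x = 2) and learning rate (0.5). The function name minimize_log_squared is an assumption for illustration, and the gradient is written out analytically rather than derived by TensorFlow.

```python
import math

def minimize_log_squared(x=2.0, learning_rate=0.5, steps=10):
    """Gradient descent on f(x) = log(x)^2; returns every visited x."""
    history = [x]
    for _ in range(steps):
        grad = 2.0 * math.log(x) / x   # derivative of log(x)^2
        x = x - learning_rate * grad   # gradient descent update
        history.append(x)
    return history

history = minimize_log_squared()
for i, value in enumerate(history):
    print("step", i, "x:", value, "log(x)^2:", math.log(value) ** 2)
```

The values of x move from 2 toward 1, where log(x)^2 reaches its minimum of 0.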
TensorFlow – Linear Regression

In this chapter, we will focus on a basic example of linear regression implementation using TensorFlow. Linear regression is a supervised machine learning approach for predicting a continuous value; logistic regression, by contrast, is used for classification into discrete categories. Our goal in this chapter is to build a model by which a user can predict the relationship between a dependent variable and one or more independent (predictor) variables.

The relationship between these variables is considered linear. If y is the dependent variable and x is the independent variable, then the linear regression relationship of the two variables looks like the following equation −

    y = ax + b

We will design an algorithm for linear regression. This will allow us to understand the following two important concepts −

Cost Function
Gradient descent algorithms

[Figure: schematic representation of linear regression]

[Figure: graphical view of the linear regression equation]

Steps to design an algorithm for linear regression

We will now learn about the steps that help in designing an algorithm for linear regression.

Step 1

Import the modules needed for generating and plotting the data. We start by importing the Python libraries NumPy and Matplotlib.

    import numpy as np
    import matplotlib.pyplot as plt

Step 2

Define the number of points and the coefficients of the linear regression equation.

    number_of_points = 500
    x_point = []
    y_point = []
    a = 0.22
    b = 0.78

Step 3

Generate 500 random points around the regression equation −

    y = 0.22x + 0.78

    for i in range(number_of_points):
        x = np.random.normal(0.0, 0.5)
        y = a * x + b + np.random.normal(0.0, 0.1)
        x_point.append([x])
        y_point.append([y])

Step 4

View the generated points using Matplotlib.

    plt.plot(x_point, y_point, 'o', label='Input Data')
    plt.legend()
    plt.show()

The complete code for generating the linear regression data is as follows −

    import numpy as np
    import matplotlib.pyplot as plt

    number_of_points = 500
    x_point = []
    y_point = []
    a = 0.22
    b = 0.78

    for i in range(number_of_points):
        x = np.random.normal(0.0, 0.5)
        y = a * x + b + np.random.normal(0.0, 0.1)
        x_point.append([x])
        y_point.append([y])

    plt.plot(x_point, y_point, 'o', label='Input Data')
    plt.legend()
    plt.show()

The generated points serve as the input data for the regression.
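The two concepts named in this chapter, the cost function and gradient descent, can be sketched on this kind of data with NumPy alone. The example below is a minimal sketch under assumptions: it regenerates 500 points around y = 0.22x + 0.78 with a fixed seed, uses the mean squared error as the cost, and the names a_hat, b_hat and the learning rate of 0.5 are choices made for illustration rather than taken from the chapter.

```python
import numpy as np

# Regenerate data like the chapter's, with a fixed seed for repeatability
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.5, size=500)
y = 0.22 * x + 0.78 + rng.normal(0.0, 0.1, size=500)

a_hat, b_hat = 0.0, 0.0  # initial guesses for slope and intercept
lr = 0.5                 # learning rate (assumption)

for _ in range(200):
    error = a_hat * x + b_hat - y
    # Gradients of the mean squared error with respect to a_hat and b_hat
    a_hat -= lr * 2.0 * np.mean(error * x)
    b_hat -= lr * 2.0 * np.mean(error)

# Cost function: mean squared error of the fitted line
cost = np.mean((a_hat * x + b_hat - y) ** 2)
print("a:", a_hat, "b:", b_hat, "cost:", cost)
```

The fitted slope and intercept should land near 0.22 and 0.78, and the remaining cost is roughly the variance of the added noise.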
Machine Learning – Getting Datasets

Machine learning models are only as good as the data they are trained on. Therefore, obtaining good quality and relevant datasets is a critical step in the machine learning process. Let's see some different sources of datasets for machine learning and how to obtain them.

Public Datasets

There are many publicly available datasets that you can use for machine learning. Some of the popular sources of public datasets include Kaggle, UCI Machine Learning Repository, Google Dataset Search, and AWS Public Datasets. These datasets are often used for research and are open to the public.

Data Scraping

Data scraping involves automatically extracting data from websites or other sources. It can be a useful way to obtain data that is not available as a pre-packaged dataset. However, it is important to ensure that the data is being scraped ethically and legally, and that the source is reliable and accurate.

Data Purchase

In some cases, it may be necessary to purchase a dataset for machine learning. Many companies sell pre-packaged datasets that are tailored to specific industries or use cases. Before purchasing a dataset, it is important to evaluate its quality and relevance to your machine learning project.

Data Collection

Data collection involves manually collecting data from various sources. This can be time-consuming and requires careful planning to ensure that the data is accurate and relevant to your machine learning project. It may involve surveys, interviews, or other forms of data collection.

Strategies for Acquiring High Quality Datasets

Once you have identified the source of your dataset, it is important to ensure that the data is of good quality and relevant to your machine learning project. Below are some strategies for obtaining good quality datasets −

Identify the Problem You Want to Solve

Before obtaining a dataset, it is important to identify the problem you want to solve with machine learning.
This will help you determine the type of data you need and where to obtain it.

Determine the Size of the Dataset

The size of the dataset depends on the complexity of the problem you are trying to solve. Generally, the more data you have, the better your machine learning model will perform. However, it is important to ensure that the dataset is not larger than necessary and does not contain irrelevant or duplicate data.

Ensure the Data is Relevant and Accurate

It is important to ensure that the data is relevant to the problem you are trying to solve and accurate. Ensure that the data comes from a reliable source and that it has been verified.

Preprocess the Data

Preprocessing the data involves cleaning, normalizing, and transforming the data to prepare it for machine learning. This step is critical to ensure that the machine learning model can understand and use the data effectively.
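The preprocessing steps mentioned above (cleaning, removing duplicates, normalizing) can be illustrated on a tiny made-up array. This is a sketch only: the values in raw are hypothetical, and dropping incomplete rows and standardizing each column are just one reasonable choice among several.

```python
import numpy as np

# Hypothetical raw data: two features per row
raw = np.array([
    [1.0, 200.0],
    [2.0, 400.0],
    [2.0, 400.0],     # duplicate row
    [np.nan, 300.0],  # row with a missing value
    [3.0, 600.0],
])

# Cleaning: drop rows that contain missing values
clean = raw[~np.isnan(raw).any(axis=1)]

# Remove duplicate rows (np.unique also sorts the rows)
clean = np.unique(clean, axis=0)

# Normalizing: rescale each column to zero mean and unit standard deviation
normalized = (clean - clean.mean(axis=0)) / clean.std(axis=0)
```

After these steps both columns are on the same scale, so neither feature dominates simply because its raw values are larger.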