Theano Tutorial

Theano is a Python library that lets you define mathematical expressions used in Machine Learning, optimize those expressions, and evaluate them very efficiently, making transparent use of GPUs in critical areas. It can rival typical hand-written C implementations in most cases.

Audience

This tutorial is designed for learners who aim to develop Deep Learning projects.

Prerequisites

Before you proceed with this tutorial, prior exposure to Python, NumPy, Neural Networks, and Deep Learning is necessary.
Theano – Introduction

Have you developed Machine Learning models in Python? Then you know the intricacies of developing such models. Development is typically a slow process, taking hours or days of computational power.

Machine Learning model development requires a lot of mathematical computation, mostly arithmetic on large matrices of multiple dimensions. These days we use Neural Networks rather than traditional statistical techniques for developing Machine Learning applications. Neural Networks need to be trained over a huge amount of data, and the training is done in batches of data of reasonable size, so the learning process is iterative. If the computations are not done efficiently, training the network can take several hours or even days. Optimizing the executable code is therefore highly desirable, and that is exactly what Theano provides.

Theano is a Python library that lets you define mathematical expressions used in Machine Learning, optimize those expressions, and evaluate them very efficiently, making transparent use of GPUs in critical areas. It can rival typical hand-written C implementations in most cases.

Theano was written at the LISA lab with the intention of providing rapid development of efficient machine learning algorithms. It is released under a BSD license. In this tutorial, you will learn to use the Theano library.
Theano – A Trivial Theano Expression

Let us begin our journey of Theano by defining and evaluating a trivial expression in Theano. Consider the following trivial expression that adds two scalars −

c = a + b

where a and b are variables and c is the expression output. In Theano, defining and evaluating even this trivial expression takes a few explicit steps. Let us understand them.

Importing Theano

First, we need to import the Theano library in our program, which we do using the following statement −

from theano import *

Rather than importing the individual packages, we have used * in the above statement to include all packages from the Theano library.

Declaring Variables

Next, we will declare a variable called a using the following statement −

a = tensor.dscalar()

The dscalar constructor declares a double-precision (64-bit float) scalar variable. Executing the above statement creates a variable called a in your program code. Likewise, we will create the variable b using the following statement −

b = tensor.dscalar()

Defining Expression

Next, we will define our expression that operates on these two variables a and b −

c = a + b

In Theano, the execution of the above statement does not perform the scalar addition of the two variables a and b; it only builds a symbolic expression.

Defining Theano Function

To evaluate the above expression, we need to define a function in Theano as follows −

f = theano.function([a, b], c)

The function constructor takes two arguments: the first is the list of input variables, here a and b, and the second is the output variable, here the scalar c. This function will be referenced through the variable name f in our further code.

Invoking Theano Function

The call to the function f is made using the following statement −

d = f(3.5, 5.5)

The inputs to the function are the two scalar values 3.5 and 5.5. The output of execution is assigned to the variable d. To print the contents of d, we will use the print statement −

print(d)

The execution would cause the value of d to be printed on the console, which is 9.0 in this case.

Full Program Listing

The complete program listing is given here for your quick reference −

from theano import *
a = tensor.dscalar()
b = tensor.dscalar()
c = a + b
f = theano.function([a, b], c)
d = f(3.5, 5.5)
print(d)

Execute the above code and you will see the output 9.0.

Now, let us discuss a slightly more complex example that computes the multiplication of two matrices.
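The matrix multiplication walkthrough itself is not reproduced in this text, so the following is a minimal sketch of what it computes, consistent with the graph-generation code in the next chapter; the sample 2x2 input matrices are illustrative choices −

from theano import *

# Declare two matrix variables of 64-bit floats
a = tensor.dmatrix('a')
b = tensor.dmatrix('b')

# Build the symbolic matrix product (no computation happens yet)
c = tensor.dot(a, b)

# Compile the expression into a callable function and evaluate it
f = theano.function([a, b], c)
d = f([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(d)
# [[19. 22.]
#  [43. 50.]]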
Theano – Computational Graph

From the above two examples, you may have noticed that in Theano we create an expression which is eventually evaluated using a Theano function. Theano uses advanced optimization techniques to optimize the execution of an expression. To visualize the computation graph, Theano provides a printing package in its library.

Symbolic Graph for Scalar Addition

To see the computation graph for our scalar addition program, use the printing library as follows −

theano.printing.pydotprint(f, outfile="scalar_addition.png", var_with_name_simple=True)

When you execute this statement, a file called scalar_addition.png will be created on your machine, containing an image of the computation graph. The complete program listing to generate this image is given below −

from theano import *
a = tensor.dscalar()
b = tensor.dscalar()
c = a + b
f = theano.function([a, b], c)
theano.printing.pydotprint(f, outfile="scalar_addition.png", var_with_name_simple=True)

Symbolic Graph for Matrix Multiplier

Now, try creating the computation graph for our matrix multiplier. The complete listing for generating this graph is given below −

from theano import *
a = tensor.dmatrix()
b = tensor.dmatrix()
c = tensor.dot(a, b)
f = theano.function([a, b], c)
theano.printing.pydotprint(f, outfile="matrix_dot_product.png", var_with_name_simple=True)

Complex Graphs

In larger expressions, the computational graphs can become very complex; the Theano documentation shows several such examples. To understand the working of Theano, it is important to first know the significance of these computational graphs. With this understanding, we shall know the importance of Theano.

Why Theano?

By looking at the complexity of the computational graphs, you will now be able to understand the purpose behind developing Theano. A typical compiler provides only local optimizations in a program, as it never looks at the entire computation as a single unit. Theano implements very advanced optimization techniques to optimize the full computational graph. It combines aspects of computer algebra with aspects of an optimizing compiler. A part of the graph may be compiled into C language code. For repeated calculations, evaluation speed is critical, and Theano meets this purpose by generating very efficient code.
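Note that pydotprint depends on the pydot package and the Graphviz tool being installed. As a lighter alternative, theano.printing.debugprint writes a textual rendering of the compiled graph to the console; a minimal sketch −

from theano import *

a = tensor.dscalar()
b = tensor.dscalar()
c = a + b
f = theano.function([a, b], c)

# Print a text rendering of the optimized graph instead of an image
theano.printing.debugprint(f)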
Theano – Trivial Training Example

Theano is quite useful in training neural networks, where we have to repeatedly calculate cost and gradients to reach an optimum. On large datasets, this becomes computationally intensive. Theano does this efficiently due to its internal optimizations of the computational graph that we have seen earlier.

Problem Statement

We shall now learn how to use the Theano library to train a network. We will take a simple case where we start with a four-feature dataset. We compute the sum of these features after applying a certain weight (importance) to each feature. The goal of the training is to modify the weights assigned to each feature so that the sum reaches a target value of 100.

sum = f1 * w1 + f2 * w2 + f3 * w3 + f4 * w4

where f1, f2, … are the feature values and w1, w2, … are the weights.

Let me quantify the example for a better understanding of the problem statement. We will assume an initial value of 1.0 for each feature, and we will take w1 = 0.1, w2 = 0.25, w3 = 0.15, and w4 = 0.3. There is no definite logic in assigning these weight values; it is just our intuition. Thus, the initial sum is as follows −

sum = 1.0 * 0.1 + 1.0 * 0.25 + 1.0 * 0.15 + 1.0 * 0.3

which is 0.8. This is far from our desired target value of 100, so we will keep modifying the weights until the sum approaches 100. In Machine Learning terms, we define the cost as the square of the difference between the target value and the current output; squaring magnifies the error. We reduce this cost in each iteration by calculating the gradients and updating our weights vector.

Let us see how this entire logic is implemented in Theano.

Declaring Variables

We first declare our input vector x as follows −

x = tensor.fvector("x")

where x is a one-dimensional array of float values. We define a scalar target variable as given below −

target = tensor.fscalar("target")

Next, we create a weights tensor W with the initial values discussed above −

W = theano.shared(numpy.asarray([0.1, 0.25, 0.15, 0.3]), "W")

Defining Theano Expression

We now calculate the output using the following expression −

y = (x * W).sum()

Note that in the above statement x and W are vectors, not simple scalar variables. We now calculate the error (cost) with the following expression −

cost = tensor.sqr(target - y)

The cost is the square of the difference between the target value and the current output. To calculate the gradient, which tells us how the cost changes with respect to each weight, we use the built-in grad method as follows −

gradients = tensor.grad(cost, [W])

We now compute the updated weights vector, taking a learning rate of 0.1, as follows −

W_updated = W - (0.1 * gradients[0])

Next, we register this update so that Theano applies it to W on every function call −

updates = [(W, W_updated)]

Defining/Invoking Theano Function

Lastly, we define a function in Theano that computes the output y and applies the weight update on each call −

f = function([x, target], y, updates=updates)

To invoke the above function a certain number of times, we create a for loop as follows −

for i in range(10):
    output = f([1.0, 1.0, 1.0, 1.0], 100.0)

As said earlier, the input to the function is a vector containing the initial values for the four features; we assign the value 1.0 to each feature without any specific reason. You may assign different values of your choice and check whether the function ultimately converges. We will print the values of the weight vector and the corresponding output in each iteration.
It is shown in the code below −

    print("iteration: ", i)
    print("Modified Weights: ", W.get_value())
    print("Output: ", output)

Full Program Listing

The complete program listing is reproduced here for your quick reference −

from theano import *
import numpy

x = tensor.fvector("x")
target = tensor.fscalar("target")

W = theano.shared(numpy.asarray([0.1, 0.25, 0.15, 0.3]), "W")
print("Weights: ", W.get_value())

y = (x * W).sum()
cost = tensor.sqr(target - y)
gradients = tensor.grad(cost, [W])
W_updated = W - (0.1 * gradients[0])
updates = [(W, W_updated)]

f = function([x, target], y, updates=updates)

for i in range(10):
    output = f([1.0, 1.0, 1.0, 1.0], 100.0)
    print("iteration: ", i)
    print("Modified Weights: ", W.get_value())
    print("Output: ", output)

When you run the program, you will see the following output −

Weights: [0.1 0.25 0.15 0.3 ]
iteration: 0
Modified Weights: [19.94 20.09 19.99 20.14]
Output: 0.8
iteration: 1
Modified Weights: [23.908 24.058 23.958 24.108]
Output: 80.16000000000001
iteration: 2
Modified Weights: [24.7016 24.8516 24.7516 24.9016]
Output: 96.03200000000001
iteration: 3
Modified Weights: [24.86032 25.01032 24.91032 25.06032]
Output: 99.2064
iteration: 4
Modified Weights: [24.892064 25.042064 24.942064 25.092064]
Output: 99.84128
iteration: 5
Modified Weights: [24.8984128 25.0484128 24.9484128 25.0984128]
Output: 99.968256
iteration: 6
Modified Weights: [24.89968256 25.04968256 24.94968256 25.09968256]
Output: 99.9936512
iteration: 7
Modified Weights: [24.89993651 25.04993651 24.94993651 25.09993651]
Output: 99.99873024
iteration: 8
Modified Weights: [24.8999873 25.0499873 24.9499873 25.0999873]
Output: 99.99974604799999
iteration: 9
Modified Weights: [24.89999746 25.04999746 24.94999746 25.09999746]
Output: 99.99994920960002

Note that the output printed in each iteration is the value of y computed before that iteration's weight update, which is why iteration 0 reports the initial output of 0.8. Observe that after five iterations the output is 99.97, and after six it is 99.99, which is close to our desired target of 100.0. Depending on the desired accuracy, you may safely conclude that the network is trained in five to six iterations. After the training completes, look up the weights vector, which after iteration 5 takes the following values −

iteration: 5
Modified Weights: [24.8984128 25.0484128 24.9484128 25.0984128]

You may now use these values in your network for deploying the model.
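It is worth seeing why the output converges so quickly; the following short derivation is our own check, not part of the original walkthrough. With every feature value equal to 1.0, the output is y = w1 + w2 + w3 + w4, and the gradient of cost = (target - y)^2 with respect to each weight is -2 * (target - y). One update step with learning rate 0.1 therefore adds 0.2 * (target - y) to each of the four weights, so −

y_new = y + 4 * 0.2 * (target - y) = y + 0.8 * (target - y)

Each iteration thus closes 80% of the remaining gap to the target, which matches the progression 0.8 → 80.16 → 96.03 → 99.21 seen above. By the same arithmetic, a learning rate above 0.125 would overshoot the target on each step, and one above 0.25 would diverge, a useful check if you experiment with other rates.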
Theano – Shared Variables

Many times, you would need to create variables that are shared between different functions and also between multiple calls to the same function. To cite an example, while training a neural network you create a weights vector for assigning a weight to each feature under consideration. This vector is modified in every iteration during the network training. Thus, it has to be globally accessible across the multiple calls to the same function. So we create a shared variable for this purpose. Typically, Theano moves such shared variables to the GPU, provided one is available. This speeds up the computation.

Syntax

To create a shared variable, you use the following syntax −

import numpy
W = theano.shared(numpy.asarray([0.1, 0.25, 0.15, 0.3]), "W")

Example

Here, a NumPy array consisting of four floating-point numbers is created. To set/get the value of W, you would use the following code snippet −

import numpy
W = theano.shared(numpy.asarray([0.1, 0.25, 0.15, 0.3]), "W")
print("Original: ", W.get_value())
print("Setting new values (0.5, 0.2, 0.4, 0.2)")
W.set_value([0.5, 0.2, 0.4, 0.2])
print("After modifications:", W.get_value())

Output

Original: [0.1 0.25 0.15 0.3 ]
Setting new values (0.5, 0.2, 0.4, 0.2)
After modifications: [0.5 0.2 0.4 0.2]
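Shared variables become most powerful when combined with the updates argument of theano.function, which modifies them automatically on every call, as we saw in the training example. The following minimal sketch (the names state, inc, and accumulator are our own) shows a shared counter that persists across calls −

import theano
import theano.tensor as tensor
import numpy

# A shared scalar whose value persists across function calls
state = theano.shared(numpy.float64(0.0), "state")
inc = tensor.dscalar("inc")

# Each call returns the current state and then adds inc to it
accumulator = theano.function([inc], state, updates=[(state, state + inc)])

print(accumulator(1.0))    # 0.0 -- the value before the update
print(accumulator(2.5))    # 1.0
print(state.get_value())   # 3.5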
Theano – Variables

While discussing data types, we created and used Theano variables. To reiterate, we would use the following syntax to create a variable in Theano −

x = theano.tensor.fvector("x")

In this statement, we have created a variable x of type vector containing 32-bit floats, and we have also named it x. The names are generally useful for debugging.

To declare a vector of 32-bit integers, you would use the following syntax −

i32 = theano.tensor.ivector()

Here, we do not specify a name for the variable. To declare a three-dimensional array (tensor) of 64-bit floats, you would use the following declaration −

f64 = theano.tensor.dtensor3()

The various types of constructors along with their data types are listed in the table below −

Constructor   Data type   Dimensions
fvector       float32     1
ivector       int32       1
fscalar       float32     0
fmatrix       float32     2
ftensor3      float32     3
dtensor3      float64     3

You may also use a generic vector constructor and specify the data type explicitly, as a string, as follows −

x = theano.tensor.vector("x", dtype="int32")

How to create shared variables is covered in the chapter Theano – Shared Variables.
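To check that a declaration produced the type you intended, you can inspect the variable's type attribute. A minimal sketch (the printed representations shown in the comments are approximate and may vary across Theano versions) −

import theano.tensor as tensor

# Named 32-bit float vector
x = tensor.fvector("x")

# Generic constructor with an explicit dtype; note that dtype is a string
i = tensor.vector("i", dtype="int32")

print(x.type)   # TensorType(float32, vector)
print(i.type)   # TensorType(int32, vector)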
Theano – Functions

A Theano function acts as a hook for interacting with the symbolic graph. A symbolic graph is compiled into highly efficient execution code. Theano achieves this by restructuring mathematical equations to make them faster; it compiles some parts of the expression into C language code, moves some tensors to the GPU, and so on. This compiled code is what actually runs when you invoke the Theano function. When you execute the Theano function, it assigns the result of the computation to the variables specified by us.

The type of optimization may be specified as FAST_COMPILE or FAST_RUN through the mode setting in the THEANO_FLAGS environment variable. A Theano function is declared using the following syntax −

f = theano.function([x], y)

The first parameter [x] is the list of input variables, and the second parameter y is the output variable; to return several values, pass a list of output variables instead. Having now understood the basics of Theano functions, you can see all these pieces at work in the trivial example discussed earlier.
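The following minimal sketch illustrates a function with two outputs; it also shows that the optimization mode can alternatively be passed directly through the mode argument of theano.function rather than through THEANO_FLAGS −

import theano
import theano.tensor as tensor

a = tensor.dscalar("a")
b = tensor.dscalar("b")

# One compiled function returning two outputs, compiled with FAST_COMPILE
f = theano.function([a, b], [a + b, a * b], mode="FAST_COMPILE")

s, p = f(3.0, 4.0)
print(s, p)   # 7.0 12.0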
Theano – Useful Resources

The following resources contain additional information on Theano. Please use them to get more in-depth knowledge on this topic.

Useful Links on Theano

− Official Website of Theano
− Wikipedia Reference for Theano