SciPy is an open-source, BSD-licensed Python library for mathematics, science and engineering. It depends on NumPy, which provides convenient and fast N-dimensional array manipulation, and it is built to work with NumPy arrays. SciPy provides many user-friendly and efficient numerical routines, such as those for numerical integration and optimization. This is an introductory tutorial that covers the fundamentals of SciPy and describes how to work with its various modules.
SciPy – Special Package
The functions available in the special package are universal functions, which follow broadcasting and automatic array looping.

Let us look at some of the most frequently used special functions −

Cubic Root Function
Exponential Function
Relative Error Exponential Function
Log Sum Exponential Function
Lambert Function
Permutations and Combinations Function
Gamma Function

Let us now understand each of these functions in brief.

Cubic Root Function

The syntax of the cubic root function is scipy.special.cbrt(x). It returns the element-wise cube root of x. Let us consider the following example.

    from scipy.special import cbrt
    res = cbrt([10, 9, 0.1254, 234])
    print(res)

The above program will generate the following output.

    [ 2.15443469 2.08008382 0.50053277 6.16224015]

Exponential Function

The syntax of the exponential function is scipy.special.exp10(x). It computes 10**x element-wise. Let us consider the following example.

    from scipy.special import exp10
    res = exp10([2, 9])
    print(res)

The above program will generate the following output.

    [1.00000000e+02 1.00000000e+09]

Relative Error Exponential Function

The syntax for this function is scipy.special.exprel(x). It computes the relative error exponential, (exp(x) - 1)/x. When x is near zero, exp(x) is near 1, so computing exp(x) - 1 directly can suffer from catastrophic loss of precision. exprel(x) is implemented to avoid this loss of precision. Let us consider the following example.

    from scipy.special import exprel
    res = exprel([-0.25, -0.1, 0, 0.1, 0.25])
    print(res)

The above program will generate the following output.

    [0.88479687 0.95162582 1. 1.05170918 1.13610167]

Log Sum Exponential Function

The syntax for this function is scipy.special.logsumexp(x). It computes the log of the sum of exponentials of the input elements. Let us consider the following example.

    from scipy.special import logsumexp
    import numpy as np
    a = np.arange(10)
    res = logsumexp(a)
    print(res)

The above program will generate the following output.

    9.45862974443

Lambert Function

The syntax for this function is scipy.special.lambertw(x). It is also called the Lambert W function. The Lambert W function W(z) is defined as the inverse function of w * exp(w). In other words, the value of W(z) is such that z = W(z) * exp(W(z)) for any complex number z.

The Lambert W function is a multivalued function with infinitely many branches. Each branch gives a separate solution of the equation z = w exp(w). The branches are indexed by the integer k.

Let us consider the following example, which checks that w * exp(w) recovers the original argument.

    import numpy as np
    from scipy.special import lambertw
    w = lambertw(1)
    print(w)
    print(w * np.exp(w))

The above program will generate the following output.

    (0.56714329041+0j)
    (1+0j)

Permutations & Combinations

Let us discuss permutations and combinations separately.

Combinations − The syntax for the combinations function is scipy.special.comb(N, k). Let us consider the following example −

    from scipy.special import comb
    res = comb(10, 3, exact = False, repetition = True)
    print(res)

The above program will generate the following output.

    220.0

Note − Array arguments are accepted only for the exact = False case. If k > N, N < 0, or k < 0, then 0 is returned.

Permutations − The syntax for the permutations function is scipy.special.perm(N, k). It returns the number of permutations of N things taken k at a time, i.e., k-permutations of N. This is also known as "partial permutations". Let us consider the following example.

    from scipy.special import perm
    res = perm(10, 3, exact = True)
    print(res)

The above program will generate the following output.

    720

Gamma Function

The gamma function is often referred to as the generalized factorial, since z*gamma(z) = gamma(z+1) and gamma(n+1) = n! for a natural number n.

The syntax for the gamma function is scipy.special.gamma(x). Let us consider the following example.

    from scipy.special import gamma
    res = gamma([0, 0.5, 1, 5])
    print(res)

The above program will generate the following output.

    [inf 1.77245385 1. 24.]
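The precision argument made for exprel applies to logsumexp as well, and the factorial relation stated for the gamma function can be checked numerically. The following is a minimal sketch, not part of the original tutorial: the input values are chosen only for illustration, and scipy.special.factorial is used as an independent reference.

```python
import numpy as np
from scipy.special import logsumexp, gamma, factorial

# exp(1000) overflows in double precision, so the naive formula returns inf.
a = np.array([1000.0, 1000.0])
with np.errstate(over="ignore"):
    naive = np.log(np.sum(np.exp(a)))
# logsumexp avoids the overflow; the exact answer is 1000 + log(2).
stable = logsumexp(a)
print(naive, stable)

# gamma(n + 1) equals n! for natural numbers n.
n = np.arange(1, 6)
gamma_matches_factorial = np.allclose(gamma(n + 1), factorial(n))
print(gamma_matches_factorial)
```

The same pattern (a naive formula that overflows or cancels, next to the dedicated routine) is a quick way to see why a special function exists at all.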
SciPy – ODR
ODR stands for Orthogonal Distance Regression, which is used in regression studies. Basic linear regression is often used to estimate the relationship between two variables y and x by drawing the line of best fit on a graph.

The mathematical method used for this is known as Least Squares, and it aims to minimize the sum of the squared errors over all points. The key question is how to calculate the error (also known as the residual) for each point.

In a standard linear regression, the aim is to predict the Y value from the X value, so the sensible thing to do is to calculate the error in the Y values only, that is, the vertical distance from each point to the line. However, sometimes it is more sensible to take into account the error in both X and Y, for example when you know your measurements of X are uncertain, or when you do not want to favor the errors of one variable over the other.

Orthogonal Distance Regression (ODR) is a method that can do this. Orthogonal in this context means perpendicular, so it calculates errors perpendicular to the line rather than just vertically.

scipy.odr Implementation for Univariate Regression

The following example demonstrates the scipy.odr implementation for univariate regression.

    import random
    import numpy as np
    from scipy.odr import Model, RealData, ODR

    # Initiate some data, giving some randomness using random.random().
    x = np.array([0, 1, 2, 3, 4, 5])
    y = np.array([i**2 + random.random() for i in x])

    # Define a function (linear in our case) to fit the data with.
    def linear_func(p, x):
        m, c = p
        return m*x + c

    # Create a model for fitting.
    linear_model = Model(linear_func)

    # Create a RealData object using our initiated data from above.
    data = RealData(x, y)

    # Set up ODR with the model and data.
    odr = ODR(data, linear_model, beta0=[0., 1.])

    # Run the regression.
    out = odr.run()

    # Use the in-built pprint method to give us results.
    out.pprint()

The above program will generate output similar to the following. The exact values differ from run to run because the data contains random noise.

    Beta: [ 5.51846098 -4.25744878]
    Beta Std Error: [ 0.7786442 2.33126407]
    Beta Covariance: [ [ 1.93150969 -4.82877433]
                       [ -4.82877433 17.31417201 ]]
    Residual Variance: 0.313892697582
    Inverse Condition #: 0.146618499389
    Reason(s) for Halting:
      Sum of squares convergence
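The motivation given above for ODR is that the X measurements may themselves be uncertain. The sketch below illustrates that case under stated assumptions: the data is synthetic (a line with slope 2 and intercept 1), the noise levels 0.3 and 0.5 are invented for the illustration, and the uncertainties are passed through the sx and sy arguments of RealData. Recent SciPy releases may emit a deprecation warning when scipy.odr is imported.

```python
import numpy as np
from scipy.odr import ODR, Model, RealData

def line(beta, x):
    # beta holds the two parameters being fitted.
    slope, intercept = beta
    return slope * x + intercept

# Synthetic measurements: both x and y are observed with noise.
rng = np.random.default_rng(0)
x_true = np.linspace(0.0, 10.0, 20)
x_obs = x_true + rng.normal(scale=0.3, size=x_true.size)
y_obs = 2.0 * x_true + 1.0 + rng.normal(scale=0.5, size=x_true.size)

# sx and sy are the assumed standard deviations of each measurement.
data = RealData(x_obs, y_obs,
                sx=np.full_like(x_obs, 0.3),
                sy=np.full_like(y_obs, 0.5))
fit = ODR(data, Model(line), beta0=[1.0, 0.0]).run()

slope, intercept = fit.beta
print(slope, intercept)   # fitted parameters
print(fit.sd_beta)        # their estimated standard errors
```

Because the uncertainties enter as weights, changing the ratio of sx to sy changes how much of each residual is attributed to X rather than Y.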
SciPy – Quick Guide
SciPy – Introduction

SciPy, pronounced as Sigh Pi, is an open-source scientific Python library, distributed under the BSD license, for performing mathematical, scientific and engineering computations.

The SciPy library depends on NumPy, which provides convenient and fast N-dimensional array manipulation. The SciPy library is built to work with NumPy arrays and provides many user-friendly and efficient numerical routines, such as those for numerical integration and optimization. Together, they run on all popular operating systems, are quick to install and are free of charge. NumPy and SciPy are easy to use, yet powerful enough to be depended upon by some of the world's leading scientists and engineers.

SciPy Sub-packages

SciPy is organized into sub-packages covering different scientific computing domains. These are summarized in the following table −

scipy.cluster       Vector quantization / Kmeans
scipy.constants     Physical and mathematical constants
scipy.fftpack       Fourier transform
scipy.integrate     Integration routines
scipy.interpolate   Interpolation
scipy.io            Data input and output
scipy.linalg        Linear algebra routines
scipy.ndimage       n-dimensional image package
scipy.odr           Orthogonal distance regression
scipy.optimize      Optimization
scipy.signal        Signal processing
scipy.sparse        Sparse matrices
scipy.spatial       Spatial data structures and algorithms
scipy.special       Special mathematical functions
scipy.stats         Statistics

Data Structure

The basic data structure used by SciPy is a multidimensional array provided by the NumPy module. NumPy provides some functions for Linear Algebra, Fourier Transforms and Random Number Generation, but not with the generality of the equivalent functions in SciPy.

SciPy – Environment Setup

The standard Python distribution does not come bundled with any SciPy module.
A lightweight alternative is to install SciPy using the popular Python package installer, pip.

    pip install scipy

If we install the Anaconda Python package, SciPy will be installed by default. Following are the packages and links to install them in different operating systems.

Windows

Anaconda (from https://www.continuum.io) is a free Python distribution for the SciPy stack. It is also available for Linux and Mac.

Canopy (https://www.enthought.com/products/canopy/) is available free, as well as for commercial distribution, with a full SciPy stack for Windows, Linux and Mac.

Python (x,y) − It is a free Python distribution with the SciPy stack and the Spyder IDE for Windows OS. (Downloadable from https://python-xy.github.io/)

Linux

Package managers of the respective Linux distributions are used to install one or more packages in the SciPy stack.

Ubuntu

We can use the following command to install the SciPy stack in Ubuntu.

    sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose

Fedora

We can use the following command to install the SciPy stack in Fedora.

    sudo yum install numpy scipy python-matplotlib ipython python-pandas sympy python-nose atlas-devel

SciPy – Basic Functionality

Older SciPy releases re-exported the NumPy functions through the SciPy namespace, so they could be used without importing NumPy. This behaviour is deprecated, so the examples below import NumPy explicitly. The main object of NumPy is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy, dimensions are called axes. The number of axes is called the rank.

Now, let us revise the basic functionality of vectors and matrices in NumPy. As SciPy is built on top of NumPy arrays, and most of linear algebra deals with matrices, an understanding of NumPy basics is necessary.

NumPy Vector

A vector can be created in multiple ways. Some of them are described below.
Converting Python array-like objects to NumPy

Let us consider the following example.

    import numpy as np
    values = [1, 2, 3, 4]
    arr = np.array(values)
    print(arr)

The output of the above program will be as follows.

    [1 2 3 4]

Intrinsic NumPy Array Creation

NumPy has built-in functions for creating arrays from scratch. Some of these functions are explained below.

Using zeros()

The zeros(shape) function will create an array filled with 0 values with the specified shape. The default dtype is float64. Let us consider the following example.

    import numpy as np
    print(np.zeros((2, 3)))

The output of the above program will be as follows.

    [[0. 0. 0.]
     [0. 0. 0.]]

Using ones()

The ones(shape) function will create an array filled with 1 values. It is identical to zeros in all other respects. Let us consider the following example.

    import numpy as np
    print(np.ones((2, 3)))

The output of the above program will be as follows.

    [[1. 1. 1.]
     [1. 1. 1.]]

Using arange()

The arange() function will create arrays with regularly incrementing values. Let us consider the following example.

    import numpy as np
    print(np.arange(7))

The above program will generate the following output.

    [0 1 2 3 4 5 6]

Defining the data type of the values

Let us consider the following example.

    import numpy as np
    # np.float was removed from NumPy; the built-in float means float64 here.
    arr = np.arange(2, 10, dtype = float)
    print(arr)
    print("Array Data Type :", arr.dtype)

The above program will generate the following output.

    [2. 3. 4. 5. 6. 7. 8. 9.]
    Array Data Type : float64

Using linspace()

The linspace() function will create arrays with a specified number of elements, which will be spaced equally between the specified beginning and end values. Let us consider the following example.

    import numpy as np
    print(np.linspace(1., 4., 6))

The above program will generate the following output.

    [1.  1.6 2.2 2.8 3.4 4. ]

Matrix

A matrix is a specialized 2-D array that retains its 2-D nature through operations. It has certain special operators, such as * (matrix multiplication) and ** (matrix power). Let us consider the following example.

    import numpy as np
    print(np.matrix('1 2; 3 4'))

The above program will generate the following output.

    [[1 2]
     [3 4]]

Conjugate Transpose of Matrix

The H attribute returns the (complex) conjugate transpose of the matrix. Let us consider the following example.

    import numpy as np
    mat = np.matrix('1 2; 3 4')
    print(mat.H)

The above program will generate the following output.

    [[1 3]
     [2 4]]

Transpose of Matrix

The T attribute returns the transpose of the matrix. Let us consider the following example.

    import numpy as np
    mat = np.matrix('1 2; 3 4')
    print(mat.T)

The above program will generate the following output.

    [[1 3]
     [2 4]]

When we
SciPy – Interpolate
In this chapter, we will discuss how interpolation is done in SciPy.

What is Interpolation?

Interpolation is the process of finding a value between two points on a line or a curve. To help us remember what it means, we should think of the first part of the word, "inter," as meaning "enter," which reminds us to look "inside" the data we originally had. Interpolation is useful not only in statistics, but also in science, business, or whenever there is a need to predict values that fall between two existing data points.

Let us create some data and see how this interpolation can be done using the scipy.interpolate package.

    import numpy as np
    from scipy import interpolate
    import matplotlib.pyplot as plt
    x = np.linspace(0, 4, 12)
    y = np.cos(x**2/3 + 4)
    print(x, y)

The above program will print the following two arrays.

    [0., 0.36363636, 0.72727273, 1.09090909, 1.45454545, 1.81818182,
     2.18181818, 2.54545455, 2.90909091, 3.27272727, 3.63636364, 4.]
    [-0.65364362, -0.61966189, -0.51077021, -0.31047698, -0.00715476,
     0.37976236, 0.76715099, 0.99239518, 0.85886263, 0.27994201,
     -0.52586509, -0.99582185]

Now, we have two arrays. Treating them as the two coordinates of points in a plane, let us plot them using the following program and see how they look.

    plt.plot(x, y, 'o')
    plt.show()

The above program displays a scatter plot of the twelve sample points.

1-D Interpolation

The interp1d class in scipy.interpolate is a convenient way to create a function based on fixed data points, which can be evaluated anywhere within the domain defined by the given data. Linear interpolation is used by default.

Using the above data, let us create interpolating functions and draw a new interpolated graph.

    f1 = interpolate.interp1d(x, y, kind = 'linear')
    f2 = interpolate.interp1d(x, y, kind = 'cubic')

Using the interp1d class, we created two functions, f1 and f2. Each of these functions returns an interpolated y for a given input x.
The third argument, kind, specifies the interpolation technique. 'linear', 'nearest', 'zero', 'slinear', 'quadratic' and 'cubic' are a few of the available techniques.

Now, let us create a new, denser input to see the difference between the interpolation techniques clearly. We will evaluate the functions built from the old data on the new data.

    xnew = np.linspace(0, 4, 30)
    plt.plot(x, y, 'o', xnew, f1(xnew), '-', xnew, f2(xnew), '--')
    plt.legend(['data', 'linear', 'cubic'], loc = 'best')
    plt.show()

The above program plots the original points together with the linear and cubic interpolants.

Splines

To draw smooth curves through data points, drafters once used thin flexible strips of wood, hard rubber, metal or plastic called mechanical splines. To use a mechanical spline, pins were placed at a judicious selection of points along a curve in a design, and then the spline was bent so that it touched each of these pins.

Clearly, with this construction, the spline interpolates the curve at these pins. It can be used to reproduce the curve in other drawings. The points where the pins are located are called knots. We can change the shape of the curve defined by the spline by adjusting the location of the knots.

Univariate Spline

A one-dimensional smoothing spline fits a given set of data points. The UnivariateSpline class in scipy.interpolate is a convenient way to create a function based on fixed data points. Its signature is scipy.interpolate.UnivariateSpline(x, y, w = None, bbox = [None, None], k = 3, s = None, ext = 0, check_finite = False).

Parameters − Following are the parameters of a Univariate Spline. It fits a spline y = spl(x) of degree k to the provided x, y data.

'w' − Specifies the weights for spline fitting. Must be positive. If None (default), the weights are all equal.

's' − Specifies the number of knots by specifying a smoothing condition.

'k' − Degree of the smoothing spline. Must be <= 5. Default is k = 3, a cubic spline.

'ext' − Controls the extrapolation mode for elements not in the interval defined by the knot sequence.

    if ext = 0 or 'extrapolate', returns the extrapolated value.
    if ext = 1 or 'zero', returns 0.
    if ext = 2 or 'raise', raises a ValueError.
    if ext = 3 or 'const', returns the boundary value.

'check_finite' − Whether to check that the input arrays contain only finite numbers.

Let us consider the following example.

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.interpolate import UnivariateSpline
    x = np.linspace(-3, 3, 50)
    y = np.exp(-x**2) + 0.1 * np.random.randn(50)
    plt.plot(x, y, 'ro', ms = 5)
    plt.show()

Use the default value for the smoothing parameter.

    spl = UnivariateSpline(x, y)
    xs = np.linspace(-3, 3, 1000)
    plt.plot(xs, spl(xs), 'g', lw = 3)
    plt.show()

Manually change the amount of smoothing.

    spl.set_smoothing_factor(0.5)
    plt.plot(xs, spl(xs), 'b', lw = 3)
    plt.show()
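The description of the 's' parameter says that the smoothing condition determines the number of knots. That relationship can be inspected without a plot. The sketch below is an illustration rather than part of the original tutorial: it assumes a seeded random generator so that runs are repeatable, and it compares s = 0 (an interpolating spline) with s = 0.5.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Same noisy Gaussian bump as above, but with a seeded generator.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = np.exp(-x**2) + 0.1 * rng.standard_normal(50)

# s = 0 forces the spline through every data point.
spl_interp = UnivariateSpline(x, y, s=0)
# A larger s tolerates a larger squared residual, so fewer knots are needed.
spl_smooth = UnivariateSpline(x, y, s=0.5)

knots_interp = spl_interp.get_knots()
knots_smooth = spl_smooth.get_knots()
print(len(knots_interp), len(knots_smooth))

# get_residual() reports the weighted sum of squared residuals of each fit.
print(spl_interp.get_residual(), spl_smooth.get_residual())
```

Fewer knots give a smoother curve that ignores more of the noise, which is the trade-off that set_smoothing_factor adjusts.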
SciPy – Optimize
The scipy.optimize package provides several commonly used optimization algorithms. This module contains the following aspects −

Unconstrained and constrained minimization of multivariate scalar functions (minimize()) using a variety of algorithms (e.g. BFGS, Nelder-Mead simplex, Newton Conjugate Gradient, COBYLA or SLSQP)

Global optimization routines (e.g. basinhopping(); the anneal() routine found in old releases has been removed from SciPy)

Least-squares minimization (leastsq()) and curve fitting (curve_fit()) algorithms

Scalar univariate function minimizers (minimize_scalar()) and root finders (newton())

Multivariate equation system solvers (root()) using a variety of algorithms (e.g. hybrid Powell, Levenberg-Marquardt or large-scale methods such as Newton-Krylov)

Unconstrained & Constrained minimization of multivariate scalar functions

The minimize() function provides a common interface to unconstrained and constrained minimization algorithms for multivariate scalar functions in scipy.optimize. To demonstrate the minimization function, consider the problem of minimizing the Rosenbrock function of N variables −

$$f(x) = \sum_{i = 1}^{N-1} 100(x_{i+1} - x_{i}^{2})^{2} + (1 - x_{i})^{2}$$

The minimum value of this function is 0, which is achieved when every x_i = 1.

Nelder–Mead Simplex Algorithm

In the following example, the minimize() routine is used with the Nelder-Mead simplex algorithm, selected through the method parameter (method = 'nelder-mead'). Let us consider the following example.

    import numpy as np
    from scipy.optimize import minimize

    def rosen(x):
        """The Rosenbrock function."""
        return sum(100.0*(x[1:] - x[:-1]**2.0)**2.0 + (1 - x[:-1])**2.0)

    x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
    res = minimize(rosen, x0, method='nelder-mead')
    print(res.x)

The above program prints a vector whose five components are all close to 1, the exact minimum; the trailing digits depend on the tolerances used.

The simplex algorithm is probably the simplest way to minimize a fairly well-behaved function. It requires only function evaluations and is a good choice for simple minimization problems.
However, because it does not use any gradient evaluations, it may take longer to find the minimum. Another optimization algorithm that needs only function calls to find the minimum is Powell's method, which is available by setting method = 'powell' in the minimize() function.

Least Squares

The least_squares() function solves a nonlinear least-squares problem with bounds on the variables. Given the residuals f(x) (an m-dimensional real function of n real variables) and the loss function rho(s) (a scalar function), least_squares finds a local minimum of the cost function F(x). Let us consider the following example.

In this example, we find a minimum of the Rosenbrock function without bounds on the independent variables.

    import numpy as np
    from scipy.optimize import least_squares

    # Rosenbrock function, written as a vector of residuals
    def fun_rosenbrock(x):
        return np.array([10 * (x[1] - x[0]**2), (1 - x[0])])

    x0 = np.array([2, 2])
    res = least_squares(fun_rosenbrock, x0)
    print(res)

Notice that we only provide the vector of the residuals. The algorithm constructs the cost function as a sum of squares of the residuals, which gives the Rosenbrock function. The exact minimum is at x = [1.0, 1.0].

The above program will generate the following output.

    active_mask: array([ 0., 0.])
    cost: 9.8669242910846867e-30
    fun: array([ 4.44089210e-15, 1.11022302e-16])
    grad: array([ -8.89288649e-14, 4.44089210e-14])
    jac: array([[-20.00000015,10.],[ -1.,0.]])
    message: '`gtol` termination condition is satisfied.'
    nfev: 3
    njev: 3
    optimality: 8.8928864934219529e-14
    status: 1
    success: True
    x: array([ 1., 1.])

Root finding

Let us understand how root finding is done in SciPy.

Scalar functions

If one has a single-variable equation, there are four different root-finding algorithms that can be tried (brentq, brenth, ridder and bisect). Each of these algorithms requires the endpoints of an interval in which a root is expected (because the function changes sign). In general, brentq is the best choice, but the other methods may be useful in certain circumstances or for academic purposes.
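As a small illustration of the bracketing requirement, the sketch below applies brentq to a cubic. The cubic is a hypothetical example chosen for this sketch and does not appear in the tutorial; it changes sign between 2 and 3, so that interval is a valid bracket.

```python
from scipy.optimize import brentq

def f(x):
    # Illustrative cubic: f(2) = -1 and f(3) = 16, so a root lies in [2, 3].
    return x**3 - 2.0 * x - 5.0

# brentq needs the two endpoints of an interval where f changes sign.
root = brentq(f, 2.0, 3.0)
print(root, f(root))
```

If the function has the same sign at both endpoints, brentq raises a ValueError instead of searching, which is why a sign check on the bracket is a useful first step.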
Fixed-point solving

A problem closely related to finding the zeros of a function is the problem of finding a fixed point of a function. A fixed point of a function is a point at which evaluation of the function returns the point itself: g(x) = x. Clearly, the fixed point of g is the root of f(x) = g(x) − x. Equivalently, the root of f is the fixed point of g(x) = f(x) + x. The routine fixed_point provides a simple iterative method that uses Aitken's sequence acceleration to estimate the fixed point of g, if a starting point is given.

Sets of equations

Finding a root of a set of non-linear equations can be achieved using the root() function. Several methods are available, amongst which hybr (the default) and lm use the hybrid method of Powell and the Levenberg-Marquardt method from MINPACK, respectively.

The following example considers the single-variable transcendental equation

    2x + 2cos(x) = 0

A root of which can be found as follows −

    import numpy as np
    from scipy.optimize import root

    def func(x):
        return x*2 + 2 * np.cos(x)

    sol = root(func, 0.3)
    print(sol)

The above program will generate the following output.

    fjac: array([[-1.]])
    fun: array([ 2.22044605e-16])
    message: 'The solution converged.'
    nfev: 10
    qtf: array([ -2.77644574e-12])
    r: array([-3.34722409])
    status: 1
    success: True
    x: array([-0.73908513])
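The relationship between fixed points and roots described under fixed-point solving can be checked directly. In this sketch g(x) = cos(x) is chosen only for illustration; its fixed point is computed with fixed_point and compared with the root of g(x) − x found by brentq on the bracket [0, 1].

```python
import numpy as np
from scipy.optimize import brentq, fixed_point

def g(x):
    # Illustrative choice: the fixed point satisfies cos(x) = x.
    return np.cos(x)

# Iterative estimate of the fixed point, starting from 0.5.
x_fixed = fixed_point(g, 0.5)

# The same point is the root of f(x) = g(x) - x.
x_root = brentq(lambda x: g(x) - x, 0.0, 1.0)

print(x_fixed, x_root)
```

Both calls should agree to within their tolerances, since they solve the same equation from two different formulations.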
SciPy – Spatial
The scipy.spatial package can compute Triangulations, Voronoi Diagrams and Convex Hulls of a set of points, by leveraging the Qhull library. Moreover, it contains KDTree implementations for nearest-neighbor point queries and utilities for distance computations in various metrics.

Delaunay Triangulations

Let us understand what Delaunay Triangulations are and how they are used in SciPy.

What are Delaunay Triangulations?

In mathematics and computational geometry, a Delaunay triangulation for a given set P of discrete points in a plane is a triangulation DT(P) such that no point in P is inside the circumcircle of any triangle in DT(P).

We can compute the same through SciPy. Let us consider the following example.

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.spatial import Delaunay
    points = np.array([[0, 4], [2, 1.1], [1, 3], [1, 2]])
    tri = Delaunay(points)
    plt.triplot(points[:,0], points[:,1], tri.simplices.copy())
    plt.plot(points[:,0], points[:,1], 'o')
    plt.show()

The above program plots the four points and the edges of their Delaunay triangulation.

Coplanar Points

Let us understand what Coplanar Points are and how they are used in SciPy.

What are Coplanar Points?

Coplanar points are three or more points that lie in the same plane. Recall that a plane is a flat surface, which extends without end in all directions. It is usually shown in math textbooks as a four-sided figure.

Let us see how we can find this using SciPy. Let us consider the following example.

    import numpy as np
    from scipy.spatial import Delaunay
    points = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [1, 1]])
    tri = Delaunay(points)
    print(tri.coplanar)

The above program will generate the following output.

    array([[4, 0, 3]], dtype = int32)

This means that point 4 resides near triangle 0 and vertex 3, but is not included in the triangulation.

Convex hulls

Let us understand what convex hulls are and how they are used in SciPy.

What are Convex Hulls?

In mathematics, the convex hull or convex envelope of a set of points X in the Euclidean plane or in a Euclidean space (or, more generally, in an affine space over the reals) is the smallest convex set that contains X.

Let us consider the following example to understand it in detail.

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.spatial import ConvexHull
    points = np.random.rand(10, 2) # 10 random points in 2-D
    hull = ConvexHull(points)
    plt.plot(points[:,0], points[:,1], 'o')
    for simplex in hull.simplices:
        plt.plot(points[simplex,0], points[simplex,1], 'k-')
    plt.show()

The above program plots the random points and draws the edges of their convex hull.
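The introduction to this chapter mentions KDTree for nearest-neighbor queries without showing it. The following is a minimal sketch on five hand-picked points; the coordinates and the query locations are hypothetical values chosen so the result is easy to verify by hand.

```python
import numpy as np
from scipy.spatial import KDTree

# Four corners of the unit square plus its centre (index 4).
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
tree = KDTree(points)

# Nearest neighbour of (0.6, 0.6): returns the distance and the point index.
dist, idx = tree.query([0.6, 0.6])
print(dist, idx)

# Indices of all points within distance 1.1 of the origin.
neighbours = sorted(tree.query_ball_point([0.0, 0.0], r=1.1))
print(neighbours)
```

Building the tree once and querying it many times is the intended usage; for a handful of points a direct distance computation would be just as fast.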
SciPy – Ndimage
The SciPy ndimage submodule is dedicated to image processing. Here, ndimage means an n-dimensional image.

Some of the most common tasks in image processing are as follows −

Input/Output, displaying images
Basic manipulations − Cropping, flipping, rotating, etc.
Image filtering − De-noising, sharpening, etc.
Image segmentation − Labeling pixels corresponding to different objects
Classification
Feature extraction
Registration

Let us discuss how some of these can be achieved using SciPy.

Opening and Writing to Image Files

The misc package in older SciPy releases comes with some sample images. It was deprecated in SciPy 1.10, where the sample images moved to scipy.datasets, so the examples below use scipy.datasets. We use those images to learn the image manipulations. Let us consider the following example.

    from scipy import datasets  # misc.face() in SciPy releases before 1.10
    import matplotlib.pyplot as plt
    f = datasets.face()
    plt.imsave('face.png', f)  # misc.imsave() has been removed from SciPy
    plt.imshow(f)
    plt.show()

The above program saves the sample image to face.png and displays it.

Any image in its raw format is a combination of colors represented by numbers in a matrix. A machine understands and manipulates images based on those numbers only. RGB is a popular way of representation.

Let us see the statistical information of the above image.

    from scipy import datasets
    face = datasets.face(gray = False)
    print(face.mean(), face.max(), face.min())

The above program will generate the following output.

    110.16274388631184 255 0

Now we know that the image is made out of numbers, so any change in the value of a number alters the original image. Let us perform some geometric transformations on the image. The basic geometric operation is cropping.

    from scipy import datasets
    import matplotlib.pyplot as plt
    face = datasets.face(gray = True)
    lx, ly = face.shape
    # Cropping: integer division keeps the slice indices as integers
    crop_face = face[lx // 4: -(lx // 4), ly // 4: -(ly // 4)]
    plt.imshow(crop_face)
    plt.show()

The above program displays the central part of the image.

We can also perform some basic operations such as turning the image upside down as described below.

    # up <-> down flip
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import datasets
    face = datasets.face()
    flip_ud_face = np.flipud(face)
    plt.imshow(flip_ud_face)
    plt.show()

The above program displays the image flipped vertically.

Besides this, we have the rotate() function, which rotates the image by a specified angle.

    # rotation
    import matplotlib.pyplot as plt
    from scipy import datasets, ndimage
    face = datasets.face()
    rotate_face = ndimage.rotate(face, 45)
    plt.imshow(rotate_face)
    plt.show()

The above program displays the image rotated by 45 degrees.

Filters

Let us discuss how filters help in image processing.

What is filtering in image processing?

Filtering is a technique for modifying or enhancing an image. For example, you can filter an image to emphasize certain features or remove other features. Image processing operations implemented with filtering include Smoothing, Sharpening, and Edge Enhancement.

Filtering is a neighborhood operation, in which the value of any given pixel in the output image is determined by applying some algorithm to the values of the pixels in the neighborhood of the corresponding input pixel. Let us now perform a few operations using SciPy ndimage.

Blurring

Blurring is widely used to reduce the noise in the image. We can perform a filter operation and see the change in the image. Let us consider the following example.

    import matplotlib.pyplot as plt
    from scipy import datasets, ndimage
    face = datasets.face()
    blurred_face = ndimage.gaussian_filter(face, sigma=3)
    plt.imshow(blurred_face)
    plt.show()

The above program displays a blurred version of the image.

The sigma value is the standard deviation of the Gaussian kernel, measured in pixels; a larger sigma gives a stronger blur. We can see the change in image quality by tuning the sigma value.

Edge Detection

Let us discuss how edge detection helps in image processing.

What is Edge Detection?

Edge detection is an image processing technique for finding the boundaries of objects within images.
It works by detecting discontinuities in brightness. Edge detection is used for image segmentation and data extraction in areas such as Image Processing, Computer Vision and Machine Vision.

The most commonly used edge detection algorithms include

Sobel
Canny
Prewitt
Roberts
Fuzzy Logic methods

Let us consider the following example.

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import ndimage
    im = np.zeros((256, 256))
    im[64:-64, 64:-64] = 1
    im[90:-90, 90:-90] = 2
    im = ndimage.gaussian_filter(im, 8)
    plt.imshow(im)
    plt.show()

The above program displays a blurred image of two nested square blocks of color.

Now, we will detect the edges of those colored blocks. Here, ndimage provides a function called sobel() to carry out this operation, whereas NumPy provides the hypot() function to combine the two resultant matrices into one.

Let us consider the following example.

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import ndimage
    im = np.zeros((256, 256))
    im[64:-64, 64:-64] = 1
    im[90:-90, 90:-90] = 2
    im = ndimage.gaussian_filter(im, 8)
    sx = ndimage.sobel(im, axis = 0, mode = 'constant')
    sy = ndimage.sobel(im, axis = 1, mode = 'constant')
    sob = np.hypot(sx, sy)
    plt.imshow(sob)
    plt.show()

The above program displays the outlines of the two squares.
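Image segmentation, listed among the common tasks at the start of this chapter as "labeling pixels corresponding to different objects", can be sketched with ndimage.label. The synthetic image below is an assumption made for the illustration: two bright rectangles on a dark background, separated by a simple threshold.

```python
import numpy as np
from scipy import ndimage

# Synthetic image with two separate bright rectangles.
im = np.zeros((64, 64))
im[8:24, 8:24] = 1.0      # 16 x 16 block
im[40:56, 36:60] = 1.0    # 16 x 24 block

# Threshold to a boolean mask, then give each connected region its own label.
mask = im > 0.5
labels, num_objects = ndimage.label(mask)

# Count the pixels that belong to each labelled object.
sizes = ndimage.sum(mask, labels, index=np.arange(1, num_objects + 1))
print(num_objects, sizes)
```

On real images the threshold step usually follows a smoothing filter such as gaussian_filter, so that noise does not split one object into many small labels.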
SciPy – Stats
All of the statistics functions are located in the sub-package scipy.stats, and a fairly complete listing of these functions can be obtained using the info(stats) function. A list of the available random variables can also be obtained from the docstring of the stats sub-package. This module contains a large number of probability distributions as well as a growing library of statistical functions.

Each univariate distribution has its own subclass, as described in the following table.

Sr. No.   Class & Description
1         rv_continuous: A generic continuous random variable class meant for subclassing
2         rv_discrete: A generic discrete random variable class meant for subclassing
3         rv_histogram: Generates a distribution given by a histogram

Normal Continuous Random Variable

A probability distribution in which the random variable X can take any value in a continuous range is a continuous probability distribution. For the normal distribution, the location (loc) keyword specifies the mean and the scale (scale) keyword specifies the standard deviation.

As an instance of the rv_continuous class, the norm object inherits from it a collection of generic methods and completes them with details specific to this particular distribution.

To compute the CDF at a number of points, we can pass a list or a NumPy array. Let us consider the following example.

    from scipy.stats import norm
    import numpy as np

    print(norm.cdf(np.array([1, -1., 0, 1, 3, 4, -2, 6])))

The above program will generate the following output.

    [0.84134475 0.15865525 0.5        0.84134475 0.9986501  0.99996833 0.02275013 1.        ]

To find the median of a distribution, we can use the Percent Point Function (PPF), which is the inverse of the CDF. Let us understand by using the following example.

    from scipy.stats import norm

    print(norm.ppf(0.5))

The above program will generate the following output.

    0.0

To generate a sequence of random variates, we should use the size keyword argument, which is shown in the following example.
    from scipy.stats import norm

    print(norm.rvs(size=5))

The above program will generate output similar to the following.

    [ 0.20929928 -1.91049255  0.41264672 -0.7135557  -0.03833048]

The above output is not reproducible, because the values are drawn at random on every run. To generate the same random numbers each time, fix the seed, for example by passing the random_state argument to rvs() or by calling np.random.seed() beforehand.

Uniform Distribution

A uniform distribution is available through the uniform object. Here, loc = 1 and scale = 4 describe a uniform distribution on the interval from loc to loc + scale, that is, from 1 to 5. Let us consider the following example.

    from scipy.stats import uniform

    print(uniform.cdf([0, 1, 2, 3, 4, 5], loc=1, scale=4))

The above program will generate the following output.

    [0.   0.   0.25 0.5  0.75 1.  ]

Build Discrete Distribution

Let us generate a random sample and compare the observed frequencies with the probabilities.

Binomial Distribution

As an instance of the rv_discrete class, the binom object inherits from it a collection of generic methods and completes them with details specific to this particular distribution. Here, n is the number of trials and p is the probability of success in each trial. Let us consider the following example.

    from scipy.stats import binom

    print(binom.cdf([0, 1, 2, 3, 4, 5], n=5, p=0.5))

The above program will generate the following output.

    [0.03125 0.1875  0.5     0.8125  0.96875 1.     ]

Descriptive Statistics

Basic statistics such as the minimum, maximum, mean and variance take a NumPy array as input and return the respective results. A few basic statistical functions available in the scipy.stats package are described in the following table.

Sr. No.   Function & Description
1         describe(): Computes several descriptive statistics of the passed array
2         gmean(): Computes the geometric mean along the specified axis
3         hmean(): Calculates the harmonic mean along the specified axis
4         kurtosis(): Computes the kurtosis
5         mode(): Returns the modal value
6         skew(): Computes the skewness of the data
7         f_oneway(): Performs a 1-way ANOVA
8         iqr(): Computes the interquartile range of the data along the specified axis
9         zscore(): Calculates the z score of each value in the sample, relative to the sample mean and standard deviation
10        sem(): Calculates the standard error of the mean (or standard error of measurement) of the values in the input array

Several of these functions have a similar version in scipy.stats.mstats, which works on masked arrays. Let us understand this with the example given below.

    from scipy import stats
    import numpy as np

    x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
    print(x.max(), x.min(), x.mean(), x.var())

The above program will generate the following output.

    9 1 5.0 6.666666666666667

T-test

Let us understand how the T-test is used in SciPy.

ttest_1samp

Calculates the T-test for the mean of ONE group of scores. This is a two-sided test for the null hypothesis that the expected value (mean) of a sample of independent observations 'a' is equal to the given population mean, popmean. Let us consider the following example.

    from scipy import stats

    rvs = stats.norm.rvs(loc=5, scale=10, size=(50, 2))
    print(stats.ttest_1samp(rvs, 5.0))

The above program will generate output similar to the following. The exact values vary because the sample is random.

    Ttest_1sampResult(statistic = array([-1.40184894, 2.70158009]), pvalue = array([ 0.16726344, 0.00945234]))

Comparing two samples

In the following examples, there are two samples, which can come either from the same or from different distributions, and we want to test whether these samples have the same statistical properties.

ttest_ind

Calculates the T-test for the means of two independent samples of scores.
This is a two-sided test for the null hypothesis that two independent samples have identical average (expected) values. By default, this test assumes that the populations have identical variances.

We can use this test if we observe two independent samples from the same or different populations. Let us consider the following example.

    from scipy import stats

    rvs1 = stats.norm.rvs(loc=5, scale=10, size=500)
    rvs2 = stats.norm.rvs(loc=5, scale=10, size=500)
    print(stats.ttest_ind(rvs1, rvs2))

The above program will generate output similar to the following. The exact values vary because the samples are random.

    Ttest_indResult(statistic = -0.67406312233650278, pvalue = 0.50042727502272966)

You can repeat the test with a new array of the same length but a different mean: use a different value for loc and run the test again.
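The suggestion to repeat the test with a different mean can be sketched as follows. The seed value 12345, the shifted mean of 10 and the variable names are assumptions chosen for illustration; a seeded generator is passed through random_state only so that repeated runs draw the same samples.

```python
import numpy as np
from scipy import stats

# Seeded generator so the samples are the same on every run (seed is arbitrary).
rng = np.random.default_rng(12345)

# Two samples with the same mean, and one with a shifted mean.
same_a = stats.norm.rvs(loc=5, scale=10, size=500, random_state=rng)
same_b = stats.norm.rvs(loc=5, scale=10, size=500, random_state=rng)
shifted = stats.norm.rvs(loc=10, scale=10, size=500, random_state=rng)

# Two-sided tests of the null hypothesis that the means are equal.
res_same = stats.ttest_ind(same_a, same_b)
res_shifted = stats.ttest_ind(same_a, shifted)

print("same mean:    p-value =", res_same.pvalue)
print("shifted mean: p-value =", res_shifted.pvalue)
```

A small p-value for the shifted sample is evidence against equal means, while the two samples with the same mean usually do not lead to rejection. If the equal-variance assumption is doubtful, passing equal_var=False to ttest_ind performs Welch's t-test instead.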