OpenCV Python – Face Detection

OpenCV uses Haar feature-based cascade classifiers for object detection. This is a machine learning based approach in which a cascade function is trained on a large number of positive and negative images and is then used to detect objects in other images. The algorithm relies on the concept of a cascade of classifiers.

Pretrained classifiers for faces, eyes etc. can be downloaded from https://github.com

For the following example, download haarcascade_frontalface_default.xml and haarcascade_eye.xml from this URL and copy them to the working directory. Then load the input image and convert it to grayscale for detection.

The detectMultiScale() method of the CascadeClassifier class detects objects in the input image. It returns the position of each detected face as a rectangle (x, y, w, h). Once we have these locations, we can search for eyes inside each face region, since eyes are always on the face!

Example

The complete code for face detection is as follows:

    import cv2

    face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
    eye_cascade = cv2.CascadeClassifier("haarcascade_eye.xml")

    img = cv2.imread("Dhoni-and-virat.jpg")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # scaleFactor=1.3, minNeighbors=5
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    for (x, y, w, h) in faces:
        # blue rectangle around the face (colors are in BGR order)
        img = cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
        # search for eyes only inside the face region
        roi_gray = gray[y:y + h, x:x + w]
        roi_color = img[y:y + h, x:x + w]
        eyes = eye_cascade.detectMultiScale(roi_gray)
        for (ex, ey, ew, eh) in eyes:
            # green rectangle around each eye, drawn in ROI coordinates
            cv2.rectangle(roi_color, (ex, ey), (ex + ew, ey + eh), (0, 255, 0), 2)

    cv2.imshow("img", img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

Output

Rectangles are drawn around the faces (blue) and eyes (green) detected in the input image.
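Note that the eye rectangles found inside a face region are relative to that region, not to the full image. The sketch below shows how such a rectangle could be mapped back to image coordinates. The helper name roi_to_image and the sample rectangles are made up for illustration; only the (x, y, w, h) layout comes from the text above.

```python
def roi_to_image(face, eye):
    """Translate an eye rectangle from face-ROI coordinates to image coordinates.

    Both rectangles use the (x, y, w, h) layout; roi_to_image is a
    hypothetical helper, not part of OpenCV.
    """
    fx, fy, _fw, _fh = face
    ex, ey, ew, eh = eye
    # Only the top-left corner shifts; width and height are unchanged
    return (fx + ex, fy + ey, ew, eh)


# Made-up sample values: a face at (120, 80) and an eye found inside it
face = (120, 80, 200, 200)
eye = (40, 55, 50, 30)
eye_in_image = roi_to_image(face, eye)
```

This is only needed when the eye positions are used outside the drawing loop. Drawing on roi_color works directly because it is a view into the original image.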
OpenCV Python – Image Pyramids

Occasionally, we may need to convert an image to a size different from its original. For this, you either upsize the image (zoom in) or downsize it (zoom out).

An image pyramid is a collection of images, all derived from a single original image, that are successively downsampled a specified number of times.

The Gaussian pyramid is used to downsample images, while the Laplacian pyramid is used to reconstruct an upsampled image from a lower-resolution image in the pyramid.

Consider the pyramid as a set of layers: the higher the layer, the smaller the image.

To produce the next layer in the Gaussian pyramid, we convolve the image at the current level with a Gaussian kernel.

$$\frac{1}{256}\begin{bmatrix}1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1\end{bmatrix}$$

Now remove every even-numbered row and column. The resulting image has 1/4th the area of its predecessor. Iterating this process on the original image produces the entire pyramid.

To make an image bigger, first upsize it to double the original size in each dimension, with the new even rows and columns filled with zeros. Then perform a convolution with the same kernel (multiplied by 4) to approximate the values of the missing pixels.

The cv.pyrUp() function doubles the original size and the cv.pyrDown() function halves it.

Example

The following program calls pyrUp() or pyrDown() depending on whether the user presses "i" or "o".

Note that information is lost when an image is reduced. If we scale an image down and then scale it back up to its original size, the result is noticeably blurrier than the original.
    import cv2 as cv

    filename = "chicky_512.png"
    src = cv.imread(filename)
    if src is None:
        raise SystemExit("could not read " + filename)

    while True:
        print('press "i" to zoom in, "o" to zoom out, Esc to stop')
        rows, cols = src.shape[:2]
        cv.imshow("Pyramids", src)
        k = cv.waitKey(0)
        if k == 27:  # Esc key
            break
        elif k == ord("i"):
            src = cv.pyrUp(src, dstsize=(2 * cols, 2 * rows))
        elif k == ord("o"):
            src = cv.pyrDown(src, dstsize=(cols // 2, rows // 2))

    cv.destroyAllWindows()
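The downsampling step can also be sketched with NumPy alone. The weights of the 5×5 kernel sum to 256, so dividing by 256 makes the kernel sum to 1 and keeps the overall brightness unchanged. The function names below are illustrative, the border handling (reflection) is an assumption, and the sketch handles single-channel images only; it is not OpenCV's implementation.

```python
import numpy as np


def gaussian_kernel_5x5():
    """Build the 5x5 kernel as the outer product of [1, 4, 6, 4, 1]."""
    w = np.array([1, 4, 6, 4, 1], dtype=np.float64)
    return np.outer(w, w) / 256.0


def pyr_down_sketch(img):
    """Blur a 2-D image with the 5x5 kernel, then keep every second row and column."""
    k = gaussian_kernel_5x5()
    h, w = img.shape
    # Assumption: borders are handled by reflecting the image
    padded = np.pad(img.astype(np.float64), 2, mode="reflect")
    blurred = np.zeros((h, w), dtype=np.float64)
    for dy in range(5):
        for dx in range(5):
            blurred += k[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return blurred[::2, ::2]


kernel = gaussian_kernel_5x5()
small = pyr_down_sketch(np.full((8, 8), 10.0))
```

An 8×8 input yields a 4×4 output, which is one quarter of the original area, and a constant image stays constant because the kernel is normalized.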
OpenCV Python – Image Blending with Pyramids

The visible seam where two images are joined can be minimised by blending them with image pyramids. This results in a seamless blended image.

The following steps are taken to achieve the final result. The two images must have the same size, and the width and height should be divisible by 32 so that the pyramid levels line up.

First, load the images and build the Gaussian pyramid for each:

    import cv2
    import numpy as np

    kalam = cv2.imread("kalam.jpg")
    einst = cv2.imread("einstein.jpg")

    # generate Gaussian pyramid for the first image
    G = kalam.copy()
    gpk = [G]
    for i in range(6):
        G = cv2.pyrDown(G)
        gpk.append(G)

    # generate Gaussian pyramid for the second image
    G = einst.copy()
    gpe = [G]
    for i in range(6):
        G = cv2.pyrDown(G)
        gpe.append(G)

From the Gaussian pyramids, obtain the respective Laplacian pyramids:

    # generate Laplacian pyramid for the first image
    lpk = [gpk[5]]
    for i in range(5, 0, -1):
        GE = cv2.pyrUp(gpk[i])
        L = cv2.subtract(gpk[i - 1], GE)
        lpk.append(L)

    # generate Laplacian pyramid for the second image
    lpe = [gpe[5]]
    for i in range(5, 0, -1):
        GE = cv2.pyrUp(gpe[i])
        L = cv2.subtract(gpe[i - 1], GE)
        lpe.append(L)

Then, join the left half of the first image with the right half of the second at each level of the pyramids:

    # join the left and right halves at each level
    LS = []
    for la, lb in zip(lpk, lpe):
        rows, cols, dpt = la.shape
        ls = np.hstack((la[:, 0:cols // 2], lb[:, cols // 2:]))
        LS.append(ls)

Finally, reconstruct the image from this joint pyramid:

    ls_ = LS[0]
    for i in range(1, 6):
        ls_ = cv2.pyrUp(ls_)
        ls_ = cv2.add(ls_, LS[i])

    cv2.imshow("RESULT", ls_)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

Output

The result is a single image in which the left half of the first image blends smoothly into the right half of the second.
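To see why the reconstruction step works, the idea can be sketched with NumPy using deliberately simple stand-ins for pyrDown() and pyrUp(). The functions down and up below are assumptions made for illustration (plain subsampling and pixel repetition, without Gaussian filtering), and the arithmetic is done in floating point so that negative differences are preserved.

```python
import numpy as np


def down(img):
    """Stand-in for pyrDown: keep every second row and column (no blur)."""
    return img[::2, ::2]


def up(img):
    """Stand-in for pyrUp: repeat every pixel twice in each direction."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)


def laplacian_pyramid(img, levels):
    """Return [coarsest image, detail level, ..., finest detail level]."""
    gauss = [img.astype(np.float64)]
    for _ in range(levels):
        gauss.append(down(gauss[-1]))
    pyramid = [gauss[-1]]
    for i in range(levels, 0, -1):
        # Each detail level stores what upsampling alone cannot recover
        pyramid.append(gauss[i - 1] - up(gauss[i]))
    return pyramid


def reconstruct(pyramid):
    """Upsample the coarsest level and add the detail levels back in order."""
    out = pyramid[0]
    for detail in pyramid[1:]:
        out = up(out) + detail
    return out


image = np.arange(64, dtype=np.float64).reshape(8, 8)
pyr = laplacian_pyramid(image, levels=2)
restored = reconstruct(pyr)
```

With these stand-ins the original image is recovered exactly. With uint8 images and cv2.subtract(), negative differences are clipped to 0, so the reconstruction in the program above is only approximate.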
OpenCV Python – Using Matplotlib

Python's Matplotlib is a powerful plotting library with a huge collection of plotting functions for a variety of plot types. It also has an imshow() function to render an image, and its window offers additional facilities such as zooming and saving.

Example

Ensure that Matplotlib is installed in the current working environment before running the following program.

    import cv2
    import matplotlib.pyplot as plt

    # Load a color image in grayscale mode
    img = cv2.imread("OpenCV_Logo.png", 0)

    # Without cmap="gray", Matplotlib applies its default colormap
    # to single-channel data
    plt.imshow(img, cmap="gray")
    plt.show()
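When a color image is displayed instead, keep in mind that OpenCV stores channels in BGR order while Matplotlib's imshow() expects RGB. A minimal sketch of the conversion, using a small synthetic array in place of a loaded image:

```python
import numpy as np

# Synthetic 1x2 "image" in OpenCV's BGR order: one blue pixel, one red pixel
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis turns BGR into RGB
rgb = bgr[:, :, ::-1]
```

For a real image, cv2.cvtColor(img, cv2.COLOR_BGR2RGB) gives the same result and can be passed straight to plt.imshow().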
OpenCV Python – Morphological Transformations

Simple operations on an image based on its shape are termed morphological transformations. The two most common transformations are erosion and dilation.

Erosion

Erosion wears away the boundaries of the foreground object. Similar to 2D convolution, a kernel is slid across the image. A pixel in the original image is retained only if all the pixels under the kernel are 1. Otherwise it is set to 0, which is what causes the erosion: pixels near the boundary of the foreground object are discarded. This process is useful for removing small white noise.

The command for the erode() function in OpenCV is as follows:

    cv.erode(src, kernel, dst, anchor, iterations)

Parameters

The erode() function in OpenCV uses the following parameters:

src and dst are the input and output image arrays, which have the same size.

kernel is the structuring element used for erosion, for example a 3×3 or 5×5 matrix.

anchor is (-1, -1) by default, which means the anchor is at the center of the kernel.

iterations is the number of times erosion is applied.

Dilation

Dilation is just the opposite of erosion. Here, a pixel is set to 1 if at least one pixel under the kernel is 1. As a result, the white region in the image grows.

The command for the dilate() function is as follows:

    cv.dilate(src, kernel, dst, anchor, iterations)

Parameters

The dilate() function has the same parameters as the erode() function. Both functions accept the additional optional parameters borderType and borderValue.

borderType selects how pixels beyond the image boundary are extrapolated (BORDER_CONSTANT, BORDER_REPLICATE, BORDER_TRANSPARENT etc.).

borderValue is the value used in case of a constant border. If it is omitted, OpenCV uses a default chosen so that the border does not influence the result.
Example

Given below is an example program showing the erode() and dilate() functions in use:

    import cv2 as cv
    import numpy as np

    img = cv.imread("LinuxLogo.jpg", 0)
    kernel = np.ones((5, 5), np.uint8)

    erosion = cv.erode(img, kernel, iterations=1)
    dilation = cv.dilate(img, kernel, iterations=1)

    cv.imshow("Original", img)
    cv.imshow("Erosion", erosion)
    cv.imshow("Dilation", dilation)
    cv.waitKey(0)
    cv.destroyAllWindows()

Output

Three windows show the original image, the eroded image and the dilated image.
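The rules for erosion and dilation described in this chapter can be checked on a tiny binary image with NumPy alone. The sketch below assumes a square kernel of ones and pads the image so that the border never changes the result; the function names are illustrative and this is not how OpenCV implements the operations.

```python
import numpy as np


def _slide(binary, ksize, pad_value, reducer):
    """Apply reducer over every ksize x ksize neighbourhood of a 2-D array."""
    pad = ksize // 2
    padded = np.pad(binary, pad, mode="constant", constant_values=pad_value)
    h, w = binary.shape
    windows = [padded[dy:dy + h, dx:dx + w]
               for dy in range(ksize) for dx in range(ksize)]
    return reducer(np.stack(windows), axis=0)


def erode(binary, ksize=3):
    """A pixel stays 1 only if all pixels under the kernel are 1."""
    return _slide(binary, ksize, 1, np.min)


def dilate(binary, ksize=3):
    """A pixel becomes 1 if at least one pixel under the kernel is 1."""
    return _slide(binary, ksize, 0, np.max)


# A 3x3 white square in the middle of a 7x7 black image
img = np.zeros((7, 7), dtype=np.uint8)
img[2:5, 2:5] = 1

eroded = erode(img)    # the square shrinks to its centre pixel
dilated = dilate(img)  # the square grows to 5x5
```

Erosion is a minimum over the neighbourhood and dilation is a maximum, which is why the two behave as opposites.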
OpenCV Python – Add Trackbar
A trackbar in OpenCV is a slider control that helps in picking a value for a variable from a continuous range by manually sliding the tab over the bar. The position of the tab is synchronised with the value.

The createTrackbar() function creates a trackbar with the following command:

    cv2.createTrackbar(trackbarname, winname, value, count, TrackbarCallback)

In the following example, three trackbars are provided for the user to set the values of R, G and B in the range 0 to 255. Using the trackbar positions, a rectangle is drawn with the fill colour corresponding to the chosen RGB value.

Example

The following program adds the trackbars:

    import numpy as np
    import cv2 as cv

    img = np.zeros((300, 400, 3), np.uint8)
    cv.namedWindow("image")

    def nothing(x):
        # callback required by createTrackbar(); positions are polled below
        pass

    # create trackbars for color change
    cv.createTrackbar("R", "image", 0, 255, nothing)
    cv.createTrackbar("G", "image", 0, 255, nothing)
    cv.createTrackbar("B", "image", 0, 255, nothing)

    while True:
        cv.imshow("image", img)
        k = cv.waitKey(1) & 0xFF
        if k == 27:  # Esc key
            break
        # get current positions of the three trackbars
        r = cv.getTrackbarPos("R", "image")
        g = cv.getTrackbarPos("G", "image")
        b = cv.getTrackbarPos("B", "image")
        # OpenCV expects colors in BGR order; -1 fills the rectangle
        cv.rectangle(img, (100, 100), (200, 200), (b, g, r), -1)

    cv.destroyAllWindows()
OpenCV Python – Resize and Rotate an Image

In this chapter, we will learn how to resize and rotate an image with the help of OpenCV-Python.

Resize an Image

An image can be scaled up or down with the cv2.resize() function, which is used as follows:

    resize(src, dsize, dst, fx, fy, interpolation)

In general, interpolation is the process of estimating values between known data points. When graphical data contains a gap, but data is available on either side of the gap or at a few specific points within it, interpolation allows us to estimate the values within the gap.

In the above resize() function, the interpolation flag determines the type of interpolation used for calculating the pixel values of the destination image.

Types of Interpolation

The types of interpolation are as follows:

INTER_NEAREST: nearest-neighbor interpolation.

INTER_LINEAR: bilinear interpolation (used by default).

INTER_AREA: resampling using pixel area relation. It is the preferred method for image decimation, but when the image is zoomed it is similar to the INTER_NEAREST method.

INTER_CUBIC: bicubic interpolation over a 4×4 pixel neighborhood.

INTER_LANCZOS4: Lanczos interpolation over an 8×8 pixel neighborhood.

Preferable interpolation methods are cv2.INTER_AREA for shrinking and cv2.INTER_CUBIC (slow) or cv2.INTER_LINEAR for zooming.

Example

The following code resizes the 'messi.JPG' image to half its original height and width.

    import cv2

    img = cv2.imread("messi.JPG", 1)
    height, width = img.shape[:2]
    # dsize is given as (width, height)
    res = cv2.resize(img, (width // 2, height // 2), interpolation=cv2.INTER_AREA)

    cv2.imshow("image", res)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

Rotate an Image

OpenCV uses affine transformation functions for operations on images such as translation and rotation. An affine transformation is a transformation that can be expressed in the form of a matrix multiplication (linear transformation) followed by a vector addition (translation).
The cv2 module provides two functions, cv2.warpAffine and cv2.warpPerspective, with which you can perform all kinds of transformations. cv2.warpAffine takes a 2×3 transformation matrix, while cv2.warpPerspective takes a 3×3 transformation matrix as input.

To find the transformation matrix for a rotation, OpenCV provides the function cv2.getRotationMatrix2D, which is used as follows:

    getRotationMatrix2D(center, angle, scale)

We then pass the image and the matrix returned by getRotationMatrix2D() to the warpAffine function to obtain the rotated image.

The following program rotates the original image by 90 degrees without changing its dimensions.

Example

    import cv2

    img = cv2.imread("OpenCV_Logo.png", 1)
    h, w = img.shape[:2]

    center = (w / 2, h / 2)
    mat = cv2.getRotationMatrix2D(center, 90, 1)
    # dsize is (width, height); keeping (w, h) preserves the dimensions
    rotimg = cv2.warpAffine(img, mat, (w, h))

    cv2.imshow("original", img)
    cv2.imshow("rotated", rotimg)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

Output

Two windows show the original image and the rotated image.
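For reference, the 2×3 matrix that the OpenCV documentation gives for getRotationMatrix2D() can be computed by hand. The sketch below uses plain Python lists and the illustrative names rotation_matrix_2d and apply_affine, which are not OpenCV functions. It applies the matrix to two points to show that the center stays fixed and that a positive angle rotates counter-clockwise in image coordinates, where y grows downwards.

```python
import math


def rotation_matrix_2d(center, angle_deg, scale):
    """Return the 2x3 affine matrix for a rotation about center."""
    cx, cy = center
    a = scale * math.cos(math.radians(angle_deg))
    b = scale * math.sin(math.radians(angle_deg))
    return [
        [a, b, (1 - a) * cx - b * cy],
        [-b, a, b * cx + (1 - a) * cy],
    ]


def apply_affine(mat, point):
    """Matrix multiplication followed by a vector addition."""
    x, y = point
    return (
        mat[0][0] * x + mat[0][1] * y + mat[0][2],
        mat[1][0] * x + mat[1][1] * y + mat[1][2],
    )


mat = rotation_matrix_2d((2.0, 1.0), 90, 1.0)
center_after = apply_affine(mat, (2.0, 1.0))  # the center does not move
right_after = apply_affine(mat, (3.0, 1.0))   # the point right of center moves above it
```

The third column of the matrix is the translation that keeps the chosen center in place, which is why the rotation does not have to be about the image origin.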
OpenCV Python – Capture Video from Camera

With the VideoCapture() function in the OpenCV library, it is easy to capture a live stream from a camera and show it in an OpenCV window.

This function needs a device index as its parameter. Your computer may have multiple cameras attached. They are enumerated by an index starting from 0, which usually refers to the built-in webcam. The function returns a VideoCapture object:

    cam = cv.VideoCapture(0)

After the camera is opened, we can read successive frames from it with the help of the read() function:

    ret, frame = cam.read()

The read() function returns a status flag (True/False) along with the next available frame.

The frame arrives in BGR channel order, which is what imshow() expects, so it can be displayed as it is. If another color space is required, convert the frame with the cvtColor() function first.

    # Display the resulting frame
    cv.imshow("frame", frame)

To capture the current frame to an image file, you can use the imwrite() function.

    cv.imwrite("capture.png", frame)

To save the live stream from the camera to a video file, OpenCV provides the VideoWriter() function.

    cv.VideoWriter(filename, fourcc, fps, frameSize)

The fourcc parameter is a four-character code that identifies the video codec. OpenCV supports various codecs such as DIVX, XVID, MJPG, X264 etc. The fps and frameSize parameters depend on the video capture device; frameSize must match the size of the frames being written.

The VideoWriter() function returns a VideoWriter object, to which the grabbed frames are successively written in a loop. Finally, release the VideoCapture and VideoWriter objects to finalize the creation of the video.

Example

The following example reads the live feed from the built-in webcam and saves it to the output.avi file.

    import cv2 as cv

    cam = cv.VideoCapture(0)
    if not cam.isOpened():
        raise SystemExit("error opening camera")

    # The writer's frame size must match the size of the captured frames
    width = int(cam.get(cv.CAP_PROP_FRAME_WIDTH))
    height = int(cam.get(cv.CAP_PROP_FRAME_HEIGHT))
    cc = cv.VideoWriter_fourcc(*"XVID")
    file = cv.VideoWriter("output.avi", cc, 15.0, (width, height))

    while True:
        # Capture frame-by-frame; ret is True if the frame was read correctly
        ret, frame = cam.read()
        if not ret:
            print("error in retrieving frame")
            break
        # imshow() and VideoWriter both expect BGR frames
        cv.imshow("frame", frame)
        file.write(frame)
        if cv.waitKey(1) == ord("q"):
            break

    cam.release()
    file.release()
    cv.destroyAllWindows()
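The value produced by VideoWriter_fourcc() is simply the four characters packed into a 32-bit integer, with the first character in the lowest byte. The pure-Python sketch below mirrors that packing and its inverse; fourcc and fourcc_to_str are illustrative names, not OpenCV functions.

```python
def fourcc(code):
    """Pack a four-character codec code into an integer, first character lowest."""
    if len(code) != 4:
        raise ValueError("a FourCC code has exactly four characters")
    c1, c2, c3, c4 = (ord(ch) & 0xFF for ch in code)
    return c1 | (c2 << 8) | (c3 << 16) | (c4 << 24)


def fourcc_to_str(value):
    """Unpack an integer FourCC back into its four characters."""
    return "".join(chr((value >> shift) & 0xFF) for shift in (0, 8, 16, 24))


xvid = fourcc("XVID")
roundtrip = fourcc_to_str(xvid)
```

The decoding direction can be handy for inspecting the codec of an existing file, for instance after reading the cv.CAP_PROP_FOURCC property of a VideoCapture object and converting it to int.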
OpenCV Python – Edge Detection

An edge here means the boundary of an object in the image. OpenCV has a cv2.Canny() function that identifies the edges of various objects in an image by implementing Canny's algorithm.

The Canny edge detection algorithm was developed by John Canny. According to it, an object's edges are determined by performing the following steps:

The first step is to reduce the noisy pixels in the image. This is done by applying a 5×5 Gaussian filter.

The second step involves finding the intensity gradient of the image. The smoothed image from the first stage is filtered with the Sobel operator to obtain the first-order derivatives in the horizontal and vertical directions (Gx and Gy).

The square root of the sum of their squares gives the edge gradient, and the inverse tangent of their ratio gives the direction of the edge.

$$\text{Edge gradient: } G = \sqrt{G_x^2 + G_y^2}$$

$$\text{Angle: } \theta = \tan^{-1}\left(\frac{G_y}{G_x}\right)$$

After getting the gradient magnitude and direction, a full scan of the image is done to remove any unwanted pixels which may not constitute the edge (non-maximum suppression).

The next step is to perform hysteresis thresholding using the minval and maxval thresholds. Pixels with an intensity gradient above maxval are sure edges, while those below minval are non-edges and are discarded. Those in between are treated as edge points or non-edges based on their connectivity to sure-edge pixels.

All these steps are performed by OpenCV's cv2.Canny() function, which needs the input image array and the minval and maxval parameters.

Example

Here is an example of Canny edge detection:

    import cv2 as cv
    from matplotlib import pyplot as plt

    img = cv.imread("lena.jpg", 0)
    edges = cv.Canny(img, 100, 200)

    plt.subplot(121)
    plt.imshow(img, cmap="gray")
    plt.title("Original Image")
    plt.xticks([])
    plt.yticks([])

    plt.subplot(122)
    plt.imshow(edges, cmap="gray")
    plt.title("Edges of original Image")
    plt.xticks([])
    plt.yticks([])

    plt.show()
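The gradient formulas and the three-way decision made by hysteresis thresholding can be sketched in a few lines of plain Python. The function names are illustrative, and the exact handling of values equal to a threshold is an assumption of this sketch rather than a statement about OpenCV's implementation.

```python
import math


def gradient(gx, gy):
    """Return (magnitude, angle in degrees) from the two Sobel derivatives."""
    magnitude = math.hypot(gx, gy)             # sqrt(gx**2 + gy**2)
    angle = math.degrees(math.atan2(gy, gx))   # direction of the gradient
    return magnitude, angle


def classify(magnitude, minval, maxval):
    """First pass of hysteresis thresholding for a single pixel."""
    if magnitude >= maxval:   # assumption: values equal to maxval count as edges
        return "edge"
    if magnitude < minval:
        return "non-edge"
    # Kept only if connected to a sure-edge pixel (not modelled here)
    return "candidate"


mag, ang = gradient(3.0, 4.0)
labels = [classify(m, 100, 200) for m in (50, 150, 250)]
```

Raising maxval yields fewer sure edges, while lowering minval lets more weak pixels survive as candidates, so the two thresholds are usually tuned together.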
OpenCV Python – Video from Images

In an earlier chapter, we used the VideoWriter() function to save the live stream from a camera as a video file. To stitch multiple images into a video, we shall use the same function.

First, ensure that all the required images are in one folder and have the same dimensions. The glob() function in Python's built-in glob module returns a list of matching file names so that we can iterate through it.

Read each image in the folder and append it to an image list. The following program shows how to stitch multiple images into a video:

    import cv2
    import glob

    img_array = []
    # glob() returns names in arbitrary order, so sort them
    for filename in sorted(glob.glob("*.png")):
        img = cv2.imread(filename)
        height, width, layers = img.shape
        size = (width, height)
        img_array.append(img)

Then create a video stream with the VideoWriter() function and write the contents of the image list to it:

    out = cv2.VideoWriter("video.avi", cv2.VideoWriter_fourcc(*"DIVX"), 15, size)
    for img in img_array:
        out.write(img)
    out.release()

You should find the file named 'video.avi' in the current folder.
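Because the order of the file names determines the order of the frames, numbered files deserve some care: plain alphabetical sorting places frame10.png before frame2.png. A possible remedy is a natural sort key, sketched below with the hypothetical helper natural_key and made-up file names.

```python
import re


def natural_key(name):
    """Split a file name into text and number parts so that 2 sorts before 10."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


# Made-up file names standing in for the result of glob.glob("*.png")
names = ["frame10.png", "frame2.png", "frame1.png"]
ordered = sorted(names, key=natural_key)
```

The key can be passed to sorted() when iterating over the glob() results. Zero-padded names such as frame001.png do not need it.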