OpenCV Python – Useful Resources

The following resources contain additional information on OpenCV Python. Please use them to get more in-depth knowledge on this topic.

Useful Video Courses

OpenCV Complete Dummies Guide to Computer Vision With Python − 71 lectures, 9 hours − Abhilash Nelson
Computer vision: OpenCV Fundamentals using Python − 42 lectures, 4 hours − Abhilash Nelson
Learn Computer Vision with OpenCV and Python − 49 lectures, 7.5 hours − Ibrahim Delibasoglu
Python – OpenCV & PyQT5 together − 51 lectures, 8 hours − Nico @softcademy
Computer Vision Powered By Deep Learning, OpenCV And Python By Spotle.ai − 20 lectures, 2 hours − Spotle Learn
Practical OpenCV with Python from Zero to Hero − 91 lectures, 6 hours − Srikanth Guskra
OpenCV Python – Extract Images from Video

A video is a sequence of frames, and each frame is an image. With OpenCV, every frame of a video file can be extracted by calling the imwrite() function on each frame until the end of the video.

The read() method of the VideoCapture object returns the next available frame together with a Boolean return value, which remains True until the end of the stream. Here, a counter is incremented inside the loop and used as the file name.

The following program demonstrates how to extract images from a video −

    import cv2

    cam = cv2.VideoCapture("video.avi")
    frameno = 0
    while True:
        ret, frame = cam.read()
        if not ret:
            # no frame was returned: the end of the video has been reached
            break
        # frames are still available, so keep writing images
        name = str(frameno) + ".jpg"
        print("new frame captured: " + name)
        cv2.imwrite(name, frame)
        frameno += 1
    cam.release()
    cv2.destroyAllWindows()
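The counter-based file names above (0.jpg, 1.jpg, 10.jpg, ...) do not sort in frame order when listed alphabetically. A minimal sketch of one way around this, assuming zero-padded names are acceptable; the helper frame_filename() and its digits parameter are hypothetical and not part of OpenCV −

```python
def frame_filename(frameno, digits=6, extension=".jpg"):
    """Build a zero-padded file name from the frame counter (hypothetical helper)."""
    return str(frameno).zfill(digits) + extension


# File names for a few sample counter values
names = [frame_filename(n) for n in (0, 9, 10, 123)]
print(names)
```

In the extraction loop, the result of frame_filename(frameno) could be passed to cv2.imwrite() in place of str(frameno) + ".jpg".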
OpenCV Python – Play Video from File

The VideoCapture() function can also retrieve frames from a video file instead of a camera. To play a file in the OpenCV window, we only replace the camera index with the name of the video file.

    video = cv2.VideoCapture(file)

This is enough to start rendering a video file, but if the file has an audio track, the sound will not play along. For this purpose, you will need to install the ffpyplayer module.

FFPyPlayer

FFPyPlayer is a Python binding for the FFmpeg library for playing and writing media files. To install it, use the pip installer utility with the following command.

    pip3 install ffpyplayer

The get_frame() method of the MediaPlayer object in this module is called once for each frame read from the video file, so that the audio plays along with the video. The following is the complete code for playing a video file along with its audio −

    import cv2
    from ffpyplayer.player import MediaPlayer

    file = "video.mp4"
    video = cv2.VideoCapture(file)
    player = MediaPlayer(file)
    while True:
        ret, frame = video.read()
        audio_frame, val = player.get_frame()
        if not ret:
            print("End of video")
            break
        cv2.imshow("Video", frame)
        # press q to stop playback
        if cv2.waitKey(1) == ord("q"):
            break
        if val != "eof" and audio_frame is not None:
            # frame data and timestamp returned by the player
            img, t = audio_frame
    video.release()
    cv2.destroyAllWindows()
OpenCV Python – Image Addition

When an image is read by the imread() function, the resulting image object is really a two- or three-dimensional matrix, depending on whether the image is grayscale or color. Hence, the cv2.add() function adds two image matrices element by element and returns another image matrix. For 8-bit images, sums above 255 are clipped to 255.

Example

The following code reads two images and adds them −

    import cv2

    kalam = cv2.imread("kalam.jpg")
    einst = cv2.imread("einstein.jpg")
    img = cv2.add(kalam, einst)
    cv2.imshow("addition", img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

Instead of a plain addition, OpenCV also has the addWeighted() function, which performs a weighted sum of two arrays. The command for the same is as follows −

    cv2.addWeighted(src1, alpha, src2, beta, gamma)

Parameters

The parameters of the addWeighted() function are as follows −

src1 − First input array.
alpha − Weight of the first array elements.
src2 − Second input array of the same size and channel number as the first.
beta − Weight of the second array elements.
gamma − Scalar added to each sum.

Each output pixel is src1*alpha + src2*beta + gamma. When the first weight is (1 − α) and the second weight is α, this is the linear blend given by the following equation −

$$\mathrm{g(x)=(1-\alpha)f_{0}(x)+\alpha f_{1}(x)}$$

The image matrices obtained in the above example are used to perform the weighted sum. By varying α from 0 to 1, a smooth transition takes place from one image to the other, so that they blend together.

The first image is given a weight of 0.3 and the second image is given 0.7. The gamma factor is taken as 0.

The command for the addWeighted() function is as follows −

    img = cv2.addWeighted(kalam, 0.3, einst, 0.7, 0)

It can be seen that the weighted addition is smoother compared to the plain addition.
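The difference between plain and weighted addition can be checked on small arrays. The following is a minimal NumPy sketch of the arithmetic only, not OpenCV's own implementation; the helper names saturating_add() and weighted_sum() are hypothetical, and the rounding shown is an assumption made for the illustration −

```python
import numpy as np


def saturating_add(a, b):
    """Add two uint8 images, clipping every sum to the 0..255 range."""
    total = a.astype(np.int32) + b.astype(np.int32)
    return np.clip(total, 0, 255).astype(np.uint8)


def weighted_sum(a, alpha, b, beta, gamma=0.0):
    """Compute a*alpha + b*beta + gamma, rounded and clipped to uint8."""
    total = alpha * a.astype(np.float64) + beta * b.astype(np.float64) + gamma
    return np.clip(np.rint(total), 0, 255).astype(np.uint8)


# Two tiny synthetic "images" with constant pixel values
first = np.full((2, 2), 200, dtype=np.uint8)
second = np.full((2, 2), 100, dtype=np.uint8)

added = saturating_add(first, second)            # 200 + 100 is clipped to 255
blended = weighted_sum(first, 0.3, second, 0.7)  # 0.3*200 + 0.7*100 = 130
print(added)
print(blended)
```

The plain sum saturates to white wherever both images are bright, while the weighted sum stays inside the valid range as long as the weights add up to 1.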
OpenCV Python – Fourier Transform

The Fourier Transform is used to transform an image from its spatial domain to its frequency domain by decomposing it into its sine and cosine components.

A digital image is discrete: a basic grayscale image is a grid of pixels whose values are usually between 0 and 255. Therefore, the transform used is the Discrete Fourier Transform (DFT). It is used to find the frequency domain.

Mathematically, the Fourier Transform of a two-dimensional image of size N×N is represented as follows, where ι is the imaginary unit −

$$\mathrm{F(k,l)=\displaystyle\sum\limits_{i=0}^{N-1}\: \displaystyle\sum\limits_{j=0}^{N-1} f(i,j)\:e^{-\iota 2\pi \left(\frac{ki}{N}+\frac{lj}{N}\right)}}$$

If the amplitude of a signal varies quickly in a short time, it is a high frequency signal. If it varies slowly, it is a low frequency signal.

In the case of images, the amplitude varies drastically at edge points and in noise. So edges and noise are the high frequency contents of an image. Where there is not much change in amplitude, the content is a low frequency component.

OpenCV provides the functions cv.dft() and cv.idft() for this purpose.

cv.dft() performs a Discrete Fourier Transform of a 1D or 2D floating-point array. The command for the same is as follows −

    cv.dft(src, dst, flags)

Here,

src − Input array that could be real or complex.
dst − Output array whose size and type depend on the flags.
flags − Transformation flags, representing a combination of the DftFlags.

cv.idft() calculates the inverse Discrete Fourier Transform of a 1D or 2D array. The command for the same is as follows −

    cv.idft(src, dst, flags)

In order to obtain a Discrete Fourier Transform, the input image is converted to the np.float32 datatype. The transform obtained is then shifted so that the zero-frequency component is at the center of the spectrum, and the magnitude spectrum is calculated from the shifted result.
Example

The program given below uses Matplotlib to plot the original image and its magnitude spectrum −

    import numpy as np
    import cv2 as cv
    from matplotlib import pyplot as plt

    img = cv.imread("lena.jpg", 0)
    dft = cv.dft(np.float32(img), flags=cv.DFT_COMPLEX_OUTPUT)
    dft_shift = np.fft.fftshift(dft)
    magnitude_spectrum = 20 * np.log(cv.magnitude(dft_shift[:, :, 0], dft_shift[:, :, 1]))

    plt.subplot(121), plt.imshow(img, cmap="gray")
    plt.title("Input Image"), plt.xticks([]), plt.yticks([])
    plt.subplot(122), plt.imshow(magnitude_spectrum, cmap="gray")
    plt.title("Magnitude Spectrum"), plt.xticks([]), plt.yticks([])
    plt.show()
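The shift-and-magnitude steps can also be tried without an image file. The following is a minimal sketch that uses NumPy's np.fft.fft2() in place of cv.dft() and a synthetic 8×8 striped image; the image, the variable names and the +1 inside the logarithm (added to avoid log(0)) are assumptions made for the illustration −

```python
import numpy as np

# Synthetic 8x8 image with alternating bright and dark columns
image = np.zeros((8, 8), dtype=np.float64)
image[:, ::2] = 255.0

spectrum = np.fft.fft2(image)         # 2D Discrete Fourier Transform
shifted = np.fft.fftshift(spectrum)   # move the zero-frequency term to the center
magnitude = np.abs(shifted)
log_magnitude = 20 * np.log(magnitude + 1.0)  # +1 avoids taking log(0)

# Only two frequency bins carry energy for this pattern
strong = np.argwhere(magnitude > 1e-6)
print(strong)
```

For this pattern the energy sits in the center bin [4, 4], which holds the sum of all pixel values, and in bin [4, 0], which corresponds to the fastest horizontal variation the grid can represent.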
OpenCV Python – Feature Detection

In the context of image processing, features are mathematical representations of key areas in an image. They are the vector representations of the visual content of an image.

Features make it possible to perform mathematical operations on that content. Computer vision applications that rely on features include object detection, motion estimation, segmentation, image alignment, etc.

Prominent features in any image include edges, corners or parts of an image. OpenCV supports the Harris corner detection and Shi-Tomasi corner detection algorithms. The OpenCV library also provides functionality to implement SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features) and the FAST algorithm for corner detection.

The Harris and Shi-Tomasi algorithms are rotation-invariant. Even if the image is rotated, we can find the same corners. But when an image is scaled up, a corner may no longer appear as a corner in the scaled image. The figure given below depicts the same.

D. Lowe's algorithm, Scale Invariant Feature Transform (SIFT), extracts the key points and computes their descriptors.

This is achieved by the following steps −

Scale-space Extrema Detection.
Keypoint Localization.
Orientation Assignment.
Keypoint Descriptor.
Keypoint Matching.

As far as the implementation of SIFT in OpenCV is concerned, it starts from loading an image and converting it into grayscale. The cv.SIFT_create() function creates a SIFT object.

Example

Calling its detect() method obtains the key points, which are drawn on top of the original image. The following code implements this procedure −

    import numpy as np
    import cv2 as cv

    img = cv.imread("home.jpg")
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    sift = cv.SIFT_create()
    kp = sift.detect(gray, None)
    img = cv.drawKeypoints(gray, kp, img)
    cv.imwrite("keypoints.jpg", img)

Output

The program writes keypoints.jpg, which shows the image with the detected keypoints drawn on it.
Discuss OpenCV Python

OpenCV stands for Open Source Computer Vision and is a library of functions which is useful in real time computer vision application programming. OpenCV-Python is a Python wrapper around C++ implementation of OpenCV library. It is a rapid prototyping tool for computer vision problems. This tutorial is designed to give fluency in OpenCV-Python and to explain how you can use it in your applications.
OpenCV Python – Meanshift and Camshift

In this chapter, let us learn about meanshift and camshift in OpenCV-Python. First, let us understand what meanshift is.

Meanshift

The mean shift algorithm identifies places in the data set with a high concentration of data points, or clusters. The algorithm places a kernel at each data point and sums them together to make a Kernel Density Estimation (KDE).

The KDE will have places with high and low data point density. Meanshift is a very useful method to keep track of a particular object inside a video.

Every frame of the video is examined in the form of the pixel distribution in that frame. The initial window, which serves as the region of interest (ROI), is generally a square or a circle. Its position is specified by hardcoding, and the area of maximum pixel distribution is identified.

The ROI window moves towards the region of maximum pixel distribution as the video runs. The direction of movement depends upon the difference between the center of our tracking window and the centroid of all the pixels inside that window.

In order to use Meanshift in OpenCV, we first find the histogram of our target (of which only Hue is considered) and back project it on each frame for the calculation of Meanshift. We also need to provide an initial location of the ROI window.

We repeatedly calculate the back projection of the histogram and calculate the Meanshift to get the new position of the tracking window. Later on, we draw a rectangle on the frame using its dimensions.

Functions

The OpenCV functions used in the program are −

cv.calcBackProject() − Calculates the back projection of a histogram.
cv.meanShift() − Finds the object on the back projection of the object histogram, using an initial search window and stop criteria for the iterative search algorithm.
Example

Here is the example program of Meanshift −

    import numpy as np
    import cv2 as cv

    cap = cv.VideoCapture("traffic.mp4")
    ret, frame = cap.read()

    # dimensions of initial location of window
    x, y, w, h = 300, 200, 100, 50
    tracker = (x, y, w, h)

    # hue histogram of the region to track
    region = frame[y:y+h, x:x+w]
    hsv_reg = cv.cvtColor(region, cv.COLOR_BGR2HSV)
    mask = cv.inRange(hsv_reg, np.array((0., 60., 32.)), np.array((180., 255., 255.)))
    reg_hist = cv.calcHist([hsv_reg], [0], mask, [180], [0, 180])
    cv.normalize(reg_hist, reg_hist, 0, 255, cv.NORM_MINMAX)

    # Setup the termination criteria
    criteria = (cv.TERM_CRITERIA_EPS | cv.TERM_CRITERIA_COUNT, 10, 1)

    while True:
        ret, frame = cap.read()
        if not ret:
            # stop at the end of the video
            break
        hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
        dst = cv.calcBackProject([hsv], [0], reg_hist, [0, 180], 1)

        # apply meanshift
        ret, tracker = cv.meanShift(dst, tracker, criteria)

        # Draw it on image
        x, y, w, h = tracker
        img = cv.rectangle(frame, (x, y), (x+w, y+h), 255, 2)
        cv.imshow("img", img)

        k = cv.waitKey(30) & 0xff
        if k == 115:   # s key saves a snapshot
            cv.imwrite("capture.png", img)
        if k == 27:    # Esc key quits
            break

    cap.release()
    cv.destroyAllWindows()

As the program is run, the Meanshift algorithm moves our window to the new location with maximum density.

Camshift

One of the disadvantages of the Meanshift algorithm is that the size of the tracking window remains the same irrespective of the object's distance from the camera. Also, the window will track the object only if it starts in the region of that object. So, the window must be hardcoded manually, and this should be done carefully.

The solution to these problems is given by CAMshift (which stands for Continuously Adaptive Meanshift). Once meanshift converges, the Camshift algorithm updates the size of the window, so that the tracking window may change in size or even rotate to better follow the movements of the tracked object.

In the following code, instead of the meanShift() function, the CamShift() function is used.
First, it finds an object center using meanShift and then adjusts the window size and finds the optimal rotation. The function returns the object position, size, and orientation. The position is drawn on the frame by using the polylines() draw function.

Example

Instead of the meanShift() function in the earlier program, use the CamShift() function as below −

    # apply camshift
    ret, tracker = cv.CamShift(dst, tracker, criteria)
    # corner points of the rotated rectangle, converted to integers
    pts = cv.boxPoints(ret)
    pts = np.int32(pts)
    img = cv.polylines(frame, [pts], True, 255, 2)
    cv.imshow("img", img)

Output

The modified program shows the tracking window as a rotated rectangle.
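The window update described in the Meanshift section can be sketched without a video. The following is a simplified NumPy illustration of the idea, not the implementation behind cv.meanShift(); the function mean_shift(), its parameters and the synthetic back projection are assumptions made for the example −

```python
import numpy as np


def mean_shift(weights, window, max_iter=10, eps=1):
    """Move an (x, y, w, h) window toward the centroid of the weights inside it."""
    x, y, w, h = window
    rows, cols = weights.shape
    for _ in range(max_iter):
        patch = weights[y:y + h, x:x + w]
        total = patch.sum()
        if total == 0:
            break  # nothing to track inside the window
        ys, xs = np.mgrid[0:patch.shape[0], 0:patch.shape[1]]
        cx = (xs * patch).sum() / total   # centroid column inside the patch
        cy = (ys * patch).sum() / total   # centroid row inside the patch
        # shift that puts the window center on the centroid (round half up)
        dx = int(np.floor(cx - (w - 1) / 2.0 + 0.5))
        dy = int(np.floor(cy - (h - 1) / 2.0 + 0.5))
        new_x = min(max(x + dx, 0), cols - w)
        new_y = min(max(y + dy, 0), rows - h)
        moved = abs(new_x - x) + abs(new_y - y)
        x, y = new_x, new_y
        if moved < eps:
            break  # converged: the window no longer moves
    return x, y, w, h


# Synthetic back projection: a 10x10 block of high values on a zero background
back_projection = np.zeros((40, 40))
back_projection[20:30, 24:34] = 1.0

# The start window overlaps only a corner of the block
tracked = mean_shift(back_projection, (15, 12, 10, 10))
print(tracked)
```

Starting from a window that touches only a corner of the block, the window climbs toward the dense region in a few iterations and stops once the centroid coincides with its center.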
OpenCV-Python – Quick Guide

OpenCV Python – Overview

OpenCV stands for Open Source Computer Vision and is a library of functions useful in real-time computer vision application programming. The term computer vision refers to the analysis of digital images and videos using a computer program. Computer vision is an important constituent of modern disciplines such as artificial intelligence and machine learning.

Originally developed by Intel, OpenCV is a cross-platform library written in C++ that also has a C interface. Wrappers for OpenCV have been developed for many other programming languages such as Java and Python. In this tutorial, the functionality of OpenCV's Python library will be described.

OpenCV-Python

OpenCV-Python is a Python wrapper around the C++ implementation of the OpenCV library. It makes use of the NumPy library for numerical operations and is a rapid prototyping tool for computer vision problems.

OpenCV-Python is a cross-platform library, available for use on all Operating System (OS) platforms including Windows, Linux, macOS and Android. OpenCV also supports Graphics Processing Unit (GPU) acceleration.

This tutorial is designed for computer science students and professionals who wish to gain expertise in the field of computer vision applications. Prior knowledge of Python and the NumPy library is essential to understand the functionality of OpenCV-Python.

OpenCV Python – Environment Setup

In most cases, using pip should be sufficient to install OpenCV-Python on your computer.

The command to install it with pip is as follows −

    pip install opencv-python

Performing this installation in a new virtual environment is recommended. At the time of writing, the current version of OpenCV-Python is 4.5.1.48, and it can be verified by the following command −

    >>> import cv2
    >>> cv2.__version__
    '4.5.1'

Since OpenCV-Python relies on NumPy, it is also installed automatically.
Optionally, you may install Matplotlib for rendering certain graphical output.

On Fedora, you may install OpenCV-Python with the command mentioned below −

    $ yum install numpy opencv*

OpenCV-Python can also be installed by building from its source, available at http://sourceforge.net. Follow the installation instructions given there.

OpenCV Python – Reading an image

The cv2 package (the name of the OpenCV-Python library) provides the imread() function to read an image.

The command to read an image is as follows −

    img = cv2.imread(filename, flags)

The flags parameter takes one of the following constants −

cv2.IMREAD_COLOR (1) − Loads a color image.
cv2.IMREAD_GRAYSCALE (0) − Loads the image in grayscale mode.
cv2.IMREAD_UNCHANGED (-1) − Loads the image as such, including the alpha channel.

The function returns an image object, which can be rendered using the imshow() function. The command for using the imshow() function is given below −

    cv2.imshow(window-name, image)

The image is displayed in a named window. A new window is created with the AUTOSIZE flag set. The waitKey() function is a keyboard binding function. Its argument is the time in milliseconds.

The function waits for the specified number of milliseconds and keeps the window on display until a key is pressed. Finally, we can destroy all the windows thus created.

The program to display the OpenCV logo is as follows −

    import numpy as np
    import cv2

    # Load a color image
    img = cv2.imread("OpenCV_Logo.png", 1)
    cv2.imshow("image", img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

The above program displays the OpenCV logo.

OpenCV Python – Write an image

The cv2 package has the imwrite() function that saves an image object to a specified file.
The command to save an image with the help of the imwrite() function is as follows −

    cv2.imwrite(filename, img)

The image format is automatically decided by OpenCV from the file extension. OpenCV supports the *.bmp, *.dib, *.jpeg, *.jpg, *.png, *.webp, *.sr, *.tiff, *.tif etc. image file types.

Example

The following program loads the OpenCV logo image and saves its grayscale version when the 's' key is pressed −

    import numpy as np
    import cv2

    # Load a color image in grayscale
    img = cv2.imread("OpenCV_Logo.png", 0)
    cv2.imshow("image", img)
    key = cv2.waitKey(0)
    if key == ord("s"):
        cv2.imwrite("opencv_logo_GS.png", img)
    cv2.destroyAllWindows()

OpenCV Python – Using Matplotlib

Python's Matplotlib is a powerful plotting library with a huge collection of plotting functions for a variety of plot types. It also has an imshow() function to render an image. It gives additional facilities such as zooming, saving etc.

Example

Ensure that Matplotlib is installed in the current working environment before running the following program.

    import numpy as np
    import cv2
    import matplotlib.pyplot as plt

    # Load a color image in grayscale
    img = cv2.imread("OpenCV_Logo.png", 0)
    plt.imshow(img, cmap="gray")
    plt.show()

OpenCV Python – Image Properties

OpenCV reads the image data into a NumPy array. The shape attribute of this ndarray object reveals image properties such as dimensions and channels.

The command to use the shape attribute is as follows −

    >>> import cv2 as cv
    >>> img = cv.imread("OpenCV_Logo.png", 1)
    >>> img.shape
    (222, 180, 3)

In the above command −

The first two items, shape[0] and shape[1], represent the height and width of the image.
shape[2] stands for the number of channels.
3 indicates that the image has three color channels, which OpenCV stores in Blue Green Red (BGR) order.

Similarly, the size property returns the total number of elements in the image array, which is height × width × channels. The command for the size of an image is as follows −

    >>> img.size
    119880

Each element of the first two dimensions of the ndarray represents one image pixel.

We can access and manipulate any pixel's value, with the help of the command mentioned below.
    >>> p = img[50, 50]
    >>> p
    array([  1,   1, 255], dtype=uint8)

Example

The following code changes the color value of the top-left 100×100 pixels to black. The imshow() function can be used to verify the result.

    >>> for i in range(100):
    ...     for j in range(100):
    ...         img[i, j] = [0, 0, 0]

The image channels can be split into individual planes by using the split() function. The channels can be merged by using the merge() function.

The split() function returns the individual single-channel planes, and the merge() function combines them back into a multi-channel array.

We can use the following command to split the image channels −

    >>> img = cv.imread("OpenCV_Logo.png", 1)
    >>> b, g, r = cv.split(img)

You can now perform manipulation on each plane.

Suppose we set all pixels in blue channel to 0,
OpenCV Python – Digit Recognition with KNN

KNN, which stands for K-Nearest Neighbour, is a Machine Learning algorithm based on Supervised Learning. It tries to put a new data point into the category that is most similar to the available categories. All the available data is classified into distinct categories, and a new data point is put in one of them based on similarity.

The KNN algorithm works on the following principle −

Choose, preferably, an odd number as K for the number of neighbours to be checked.
Calculate the Euclidean distance from the new data point to the existing data points.
Take the K nearest neighbours as per the calculated Euclidean distance.
Count the number of data points in each category among these neighbours.
The category with the maximum number of data points is the category in which the new data point is classified.

As an example of the implementation of the KNN algorithm using OpenCV, we shall use the image digits.png, consisting of 5000 images of handwritten digits, each of 20×20 pixels.

The first task is to split this image into 5000 digits. This is our feature set. Convert it to a NumPy array. The program is given below −

    import numpy as np
    import cv2

    image = cv2.imread("digits.png")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # 50 rows of 100 digits each
    fset = []
    for i in np.vsplit(gray, 50):
        x = np.hsplit(i, 100)
        fset.append(x)

    NP_array = np.array(fset)

Now we divide this data into a training set and a testing set, each of size (2500, 20×20), as follows −

    trainset = NP_array[:, :50].reshape(-1, 400).astype(np.float32)
    testset = NP_array[:, 50:100].reshape(-1, 400).astype(np.float32)

Next, we have to create the labels for the 10 different digits, 250 per digit in each set, as shown below −

    k = np.arange(10)
    train_labels = np.repeat(k, 250)[:, np.newaxis]
    test_labels = np.repeat(k, 250)[:, np.newaxis]

We are now in a position to start the KNN classification. Create the classifier object and train it on the data.

    knn = cv2.ml.KNearest_create()
    knn.train(trainset, cv2.ml.ROW_SAMPLE, train_labels)

Choosing the value of k as 3, obtain the output of the classifier.
    ret, output, neighbours, distance = knn.findNearest(testset, k=3)

Compare the output with the test labels to check the performance and accuracy of the classifier.

The program shows an accuracy of 91.64% in recognizing the handwritten digits.

    result = output == test_labels
    correct = np.count_nonzero(result)
    accuracy = (correct * 100.0) / (output.size)
    print(accuracy)
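The principle listed at the start of this chapter can also be written out by hand. The following is a minimal NumPy sketch of K-nearest-neighbour voting on a tiny made-up data set; it is not how cv2.ml.KNearest is implemented, and the function knn_predict() and the sample points are hypothetical −

```python
import numpy as np


def knn_predict(train, labels, samples, k=3):
    """Classify each sample by a majority vote among its k nearest training points."""
    predictions = []
    for sample in samples:
        # Euclidean distance from the sample to every training point
        distances = np.linalg.norm(train - sample, axis=1)
        nearest = np.argsort(distances)[:k]     # indices of the k closest points
        votes = np.bincount(labels[nearest])    # count the labels among them
        predictions.append(int(np.argmax(votes)))
    return np.array(predictions)


# Two small clusters: label 0 near the origin, label 1 near (5, 5)
train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=np.float32)
labels = np.array([0, 0, 0, 1, 1, 1])
samples = np.array([[0.5, 0.5], [5.5, 5.5]], dtype=np.float32)
expected = np.array([0, 1])

predicted = knn_predict(train, labels, samples, k=3)
accuracy = 100.0 * np.count_nonzero(predicted == expected) / predicted.size
print(predicted, accuracy)
```

The accuracy is computed in the same way as in the program above, by counting the predictions that match the known labels.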