DIP – Optical Character Recognition

Optical character recognition is usually abbreviated as OCR. It is the mechanical or electronic conversion of scanned images of handwritten or typewritten text into machine-encoded text. It is a common method of digitizing printed texts so that they can be electronically searched, stored more compactly, displayed online, and used in machine processes such as machine translation, text-to-speech and text mining.

In recent years, OCR technology has been applied throughout the entire spectrum of industries, revolutionizing the document management process. OCR has enabled scanned documents to become more than just image files, turning them into fully searchable documents with text content that is recognized by computers. With the help of OCR, people no longer need to manually retype important documents when entering them into electronic databases. Instead, OCR extracts the relevant information and enters it automatically. The result is accurate, efficient information processing in less time.

Optical character recognition has multiple application areas, of which the most common are the following.

Banking

The uses of OCR vary across different fields. One widely known application is in banking, where OCR is used to process checks without human involvement. A check can be inserted into a machine, the writing on it is scanned instantly, and the correct amount of money is transferred. This technology has nearly been perfected for printed checks, and is fairly accurate for handwritten checks as well, though it occasionally requires manual confirmation. Overall, this reduces wait times in many banks.

Blind and visually impaired persons

One of the major motivations behind early OCR research was to build a computer or device that could read books out loud to the blind. As part of this research, scientists developed the flatbed scanner, which is most commonly known to us today as the document scanner.

Legal department

In the legal industry, there has also been a significant movement to digitize paper documents. In order to save space and eliminate the need to sift through boxes of paper files, documents are being scanned and entered into computer databases. OCR further simplifies the process by making documents text-searchable, so that they are easier to locate and work with once in the database. Legal professionals now have fast, easy access to a huge library of documents in electronic format, which they can find simply by typing in a few keywords.

Retail industry

Barcode recognition technology is also related to OCR. We see this technology in everyday use, for example at supermarket checkouts.

Other uses

OCR is widely used in many other fields, including education, finance, and government agencies. OCR has made countless texts available online, saving money for students and allowing knowledge to be shared. Invoice imaging applications are used in many businesses to keep track of financial records and prevent a backlog of payments from piling up. In government agencies and independent organizations, OCR simplifies data collection and analysis, among other processes. As the technology continues to develop, more and more applications are found for OCR, including increased use of handwriting recognition.


DIP – Concept of Convolution

This tutorial is about one of the most important concepts in signals and systems. We will discuss convolution in full: what it is, why it is needed, and what we can achieve with it. We will start discussing convolution from the basics of image processing.

What is image processing?

As we have discussed in the introduction to image processing tutorials and in signals and systems, image processing is more or less the study of signals and systems, because an image is nothing but a two-dimensional signal. We have also discussed that in image processing we are developing a system whose input is an image and whose output is an image. The box labeled "Digital Image Processing system" in such a diagram can be thought of as a black box.

Where have we reached until now?

Till now we have discussed two important methods to manipulate images. In other words, our black box has worked in two different ways so far.

Graphs (Histograms)

This method is known as histogram processing. We have discussed it in detail in previous tutorials for increasing contrast, image enhancement, brightness etc.

Transformation functions

This method is known as transformations, in which we discussed different types of transformations and some gray level transformations.

Another way of dealing with images

Here we are going to discuss another method of dealing with images, known as convolution. Usually the black box (system) used for image processing is an LTI, or linear time invariant, system. By linear we mean a system whose output is always linear, neither logarithmic nor exponential nor anything else. By time invariant we mean a system that remains the same over time.

Convolution can be represented mathematically in two ways:

g(x,y) = h(x,y) * f(x,y)

which can be read as "mask convolved with an image", or

g(x,y) = f(x,y) * h(x,y)

which can be read as "image convolved with mask". There are two ways to represent this because the convolution operator (*) is commutative. Here h(x,y) is the mask or filter.

What is a mask?

A mask is also a signal. It can be represented by a two-dimensional matrix, usually of order 1x1, 3x3, 5x5 or 7x7. A mask should always have odd dimensions, because otherwise you cannot find its center. Why do we need to find the center of the mask? The answer lies below, in the topic of how to perform convolution.

How to perform convolution?

In order to perform convolution on an image, the following steps should be taken.

Flip the mask (horizontally and vertically) only once
Slide the mask over the image
Multiply the corresponding elements and then add them
Repeat this procedure until all values of the image have been calculated

Example of convolution

Let's perform some convolution. Step 1 is to flip the mask.

Mask

Let's take our mask to be this:

1 2 3
4 5 6
7 8 9

Flipping the mask horizontally:

3 2 1
6 5 4
9 8 7

Flipping the mask vertically:

9 8 7
6 5 4
3 2 1

Image

Let's consider an image to be like this:

2  4  6
8  10 12
14 16 18

Convolution

Convolving the mask over the image is done in this way. Place the center of the mask at each element of the image, multiply the corresponding elements, add them, and paste the result onto the element of the image on which you placed the center of the mask. (In the figure, the box in red is the mask, the values in orange are the values of the mask, and the black box and values belong to the image.) Pixels lying outside the image border are treated as zero. Now, for the first pixel of the image, the value will be calculated as

First pixel = (5*2) + (4*4) + (2*8) + (1*10) = 10 + 16 + 16 + 10 = 52

Place 52 in the output image at the first index and repeat this procedure for each pixel of the image.

Why convolution?

Convolution can achieve things that the previous two methods of manipulating images cannot. These include blurring, sharpening, edge detection, noise reduction etc.
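The procedure above translates directly into code. The following is a minimal sketch in Python with NumPy, assuming (as in the worked example) that pixels outside the image border are treated as zero; the function name convolve2d and the variable names are illustrative, not part of the original tutorial.

import numpy as np

def convolve2d(image, mask):
    """2D convolution with zero padding, as in the worked example above."""
    flipped = np.flipud(np.fliplr(mask))       # flip horizontally and vertically, once
    m, n = flipped.shape
    pad_y, pad_x = m // 2, n // 2              # a center exists because the mask is odd-sized
    padded = np.pad(image, ((pad_y, pad_y), (pad_x, pad_x)), mode='constant')
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):            # slide the mask over every pixel
        for j in range(image.shape[1]):
            region = padded[i:i + m, j:j + n]
            out[i, j] = np.sum(region * flipped)   # multiply corresponding elements, add
    return out

mask = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
image = np.array([[2, 4, 6], [8, 10, 12], [14, 16, 18]])
print(convolve2d(image, mask)[0, 0])   # 52.0, matching the first pixel computed above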

DIP – Prewitt Operator

The Prewitt operator is used for edge detection in an image. It detects two types of edges:

Horizontal edges
Vertical edges

Edges are calculated by using the difference between corresponding pixel intensities of an image. All the masks that are used for edge detection are also known as derivative masks, because, as we have stated many times before in this series of tutorials, an image is also a signal, and changes in a signal can only be calculated using differentiation. That is why these operators are also called derivative operators or derivative masks.

All derivative masks should have the following properties:

Opposite signs should be present in the mask.
The sum of the mask should be equal to zero.
More weight means more edge detection.

The Prewitt operator provides us two masks, one for detecting edges in the horizontal direction and another for detecting edges in the vertical direction.

Vertical direction

-1 0 1
-1 0 1
-1 0 1

The above mask will find edges in the vertical direction, because of the column of zeros running in the vertical direction. When you convolve this mask over an image, it will give you the vertical edges in the image.

How it works

When we apply this mask to an image, it highlights vertical edges. It works like a first order derivative and calculates the difference of pixel intensities in an edge region. As the center column consists of zeros, it does not include the original values of the image but rather calculates the difference of the right and left pixel values around an edge. This increases the edge intensity, making it enhanced compared to the original image.

Horizontal direction

-1 -1 -1
 0  0  0
 1  1  1

The above mask will find edges in the horizontal direction, because the row of zeros runs in the horizontal direction. When you convolve this mask over an image, it highlights the horizontal edges in the image.

How it works

This mask highlights horizontal edges. It works on the same principle as the mask above and calculates the difference among pixel intensities of a particular edge. As the center row of the mask consists of zeros, it does not include the original edge values of the image but rather calculates the difference of the pixel intensities above and below the particular edge. This increases the sudden change of intensities and makes the edge more visible.

Both of the above masks follow the principle of derivative masks: both have opposite signs in them and both sum to zero. The third condition does not apply here, as both masks are standardized and we cannot change their values.

Now it's time to see these masks in action.

Sample image

Following is a sample picture on which we will apply the above two masks, one at a time.

After applying the vertical mask

After applying the vertical mask to the sample image above, the following image is obtained. It contains vertical edges. You can judge it more easily by comparing it with the horizontal edges picture.

After applying the horizontal mask

After applying the horizontal mask to the sample image above, the following image is obtained.

Comparison

As you can see, in the first picture, on which we applied the vertical mask, all the vertical edges are more visible than in the original image. Similarly, in the second picture we applied the horizontal mask, and as a result all the horizontal edges are visible. In this way we can detect both horizontal and vertical edges in an image.
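Applying the two Prewitt masks is a direct use of the convolution introduced in the previous tutorial. The following is a minimal sketch in Python with NumPy and SciPy; the helper function and the small test image are illustrative additions, not part of the original tutorial.

import numpy as np
from scipy.ndimage import convolve

prewitt_vertical = np.array([[-1, 0, 1],
                             [-1, 0, 1],
                             [-1, 0, 1]])

prewitt_horizontal = np.array([[-1, -1, -1],
                               [ 0,  0,  0],
                               [ 1,  1,  1]])

def prewitt_edges(image):
    """Return the vertical-edge and horizontal-edge responses of a grayscale image."""
    img = image.astype(float)
    vertical = convolve(img, prewitt_vertical)      # responds to left/right intensity changes
    horizontal = convolve(img, prewitt_horizontal)  # responds to up/down intensity changes
    return vertical, horizontal

# A tiny test image with a vertical step edge down the middle:
step = np.array([[0, 0, 255, 255]] * 4)
v, h = prewitt_edges(step)
print(v)   # strong response along the middle columns, where the vertical edge lies
print(h)   # all zeros: this image contains no horizontal edges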

DIP – Convolution Theorem

In the last tutorial, we discussed images in the frequency domain. In this tutorial, we are going to define a relationship between the frequency domain and the images (spatial domain).

For example

Consider an image and the representation of the same image in the frequency domain. What is the relationship between the image, or spatial domain, and the frequency domain? This relationship can be explained by a theorem called the convolution theorem.

Convolution theorem

The relationship between the spatial domain and the frequency domain is established by the convolution theorem. It states that convolution in the spatial domain is equal to multiplication by a filter (filtering) in the frequency domain, and vice versa.

Filtering in the frequency domain consists of the following steps.

In the first step we do some preprocessing of the image in the spatial domain, for example increasing its contrast or brightness.
Then we take the discrete Fourier transform of the image.
Then we center the discrete Fourier transform, bringing it to the center from the corners.
Then we apply filtering, which means we multiply the Fourier transform by a filter function.
Then we shift the DFT from the center back to the corners.
The last step is to take the inverse discrete Fourier transform, to bring the result back from the frequency domain to the spatial domain.
A final postprocessing step is optional, just like preprocessing, in which we simply improve the appearance of the image.

Filters

The concept of a filter in the frequency domain is the same as the concept of a mask in convolution. After converting an image to the frequency domain, filters are applied in the filtering process to perform different kinds of processing on the image, such as blurring or sharpening. Common types of filters for these purposes are:

Ideal high pass filter
Ideal low pass filter
Gaussian high pass filter
Gaussian low pass filter

In the next tutorial, we will discuss filters in detail.
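The steps above can be sketched in a few lines of Python using NumPy's FFT routines. This is a minimal illustration assuming an ideal low pass filter with an arbitrary cutoff radius; the filter construction here is a demonstration of the pipeline, not a prescription from the original tutorial.

import numpy as np

def ideal_lowpass(image, cutoff=30):
    """Frequency-domain filtering following the steps above, with an ideal low pass filter."""
    F = np.fft.fft2(image.astype(float))    # take the discrete Fourier transform
    F = np.fft.fftshift(F)                  # center the DFT (bring it in from the corners)

    # Build the filter: 1 inside the cutoff radius from the center, 0 outside.
    rows, cols = image.shape
    y, x = np.ogrid[:rows, :cols]
    distance = np.sqrt((y - rows / 2) ** 2 + (x - cols / 2) ** 2)
    H = (distance <= cutoff).astype(float)

    G = F * H                               # multiply the transform by the filter function
    G = np.fft.ifftshift(G)                 # shift the DFT back from center to corners
    return np.real(np.fft.ifft2(G))         # inverse DFT: back to the spatial domain

# Usage: removing high frequencies blurs the image.
img = np.random.rand(128, 128)
blurred = ideal_lowpass(img, cutoff=20)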

DIP – Introduction to Color Spaces

In this tutorial, we are going to talk about color spaces.

What are color spaces?

Color spaces are different types of color modes, used in image processing and in signals and systems for various purposes. Some of the common color spaces are:

RGB
CMY/CMYK
Y'UV
YIQ
Y'CbCr
HSV

RGB

RGB is the most widely used color space, and we have already discussed it in past tutorials. RGB stands for red, green and blue. The RGB model states that each color image is actually formed of three different images: a red image, a green image, and a blue image. A normal grayscale image can be defined by only one matrix, but a color image is actually composed of three different matrices.

One color image matrix = red matrix + green matrix + blue matrix

This can be best seen in the example below.

Applications of RGB

The common applications of the RGB model are:

Cathode ray tube (CRT)
Liquid crystal display (LCD)
Plasma or LED displays, such as a television
A computer monitor or a large scale screen

CMYK

RGB to CMY conversion

The conversion from RGB to CMY is done as follows. Consider a color image, meaning you have three different arrays of RED, GREEN and BLUE. To convert it into CMY, you subtract each value from the maximum number of levels minus 1 (255 for an 8-bit image). Each matrix is subtracted in turn and its respective CMY matrix is filled with the result.

Y'UV

Y'UV defines a color space in terms of one luma (Y') and two chrominance (UV) components. The Y'UV color model is used in the following composite color video standards:

NTSC (National Television System Committee)
PAL (Phase Alternating Line)
SECAM (Séquentiel couleur à mémoire, French for "sequential color with memory")

Y'CbCr

The Y'CbCr color model contains Y', the luma component, while Cb and Cr are the blue-difference and red-difference chroma components. It is not an absolute color space. It is mainly used in digital systems; its common applications include JPEG and MPEG compression. Y'UV is often used as the term for Y'CbCr, however they are different formats: the main difference between the two is that the former is analog while the latter is digital.
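The RGB to CMY conversion described above amounts to one subtraction per channel. Here is a minimal sketch in Python with NumPy, assuming an 8-bit image (256 levels); the function name is illustrative.

import numpy as np

def rgb_to_cmy(rgb, levels=256):
    """Convert RGB values to CMY by subtracting each channel from (levels - 1)."""
    return (levels - 1) - rgb.astype(int)

# Pure red in 8-bit RGB:
red = np.array([255, 0, 0])
print(rgb_to_cmy(red))   # [  0 255 255]: C=0, M=255, Y=255 (red = magenta + yellow)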

DIP – Histogram Equalization

We have already seen that contrast can be increased using histogram stretching. In this tutorial we will see how histogram equalization can be used to enhance contrast.

Before performing histogram equalization, you must know two important concepts used in equalizing histograms: PMF and CDF. They are discussed in our tutorials on PMF and CDF. Please visit them in order to successfully grasp the concept of histogram equalization.

Histogram equalization

Histogram equalization is used to enhance contrast. It is not necessary that contrast will always be increased by it; there may be cases where histogram equalization makes things worse, and in those cases the contrast is decreased. Let's start histogram equalization by taking the simple image below, whose histogram is also shown. Now we will perform histogram equalization on it.

PMF

First we have to calculate the PMF (probability mass function) of all the pixels in this image. If you do not know how to calculate the PMF, please visit our tutorial on PMF calculation.

CDF

Our next step involves calculation of the CDF (cumulative distribution function). Again, if you do not know how to calculate the CDF, please visit our tutorial on CDF calculation.

Calculate CDF according to gray levels

Let's suppose, for instance, that the CDF calculated in the second step looks like this:

Gray Level Value    CDF
0                   0.11
1                   0.22
2                   0.55
3                   0.66
4                   0.77
5                   0.88
6                   0.99
7                   1

Then in this step you multiply the CDF value by (number of gray levels - 1). Considering we have a 3 bpp image, the number of levels we have is 8, and 8 minus 1 is 7. So we multiply the CDF by 7 and round down to the nearest integer. Here is what we get after multiplying:

Gray Level Value    CDF     CDF * (Levels-1)
0                   0.11    0
1                   0.22    1
2                   0.55    3
3                   0.66    4
4                   0.77    5
5                   0.88    6
6                   0.99    6
7                   1       7

Now comes the last step, in which we map the new gray level values onto the numbers of pixels. Let's assume our old gray level values have these numbers of pixels:

Gray Level Value    Frequency
0                   2
1                   4
2                   6
3                   8
4                   10
5                   12
6                   14
7                   16

Now if we map the old values to the new ones, this is what we get:

Gray Level Value    New Gray Level Value    Frequency
0                   0                       2
1                   1                       4
2                   3                       6
3                   4                       8
4                   5                       10
5                   6                       12
6                   6                       14
7                   7                       16

Now map these new values onto the histogram, and you are done. Let's apply this technique to our original image. After applying it we get the following equalized image, its cumulative distribution function, and its equalized histogram, which can be compared against the original image and histogram.

Conclusion

As you can clearly see from the images, the contrast of the new image has been enhanced and its histogram has also been equalized. There is also one important thing to note here: during histogram equalization the overall shape of the histogram changes, whereas in histogram stretching the overall shape of the histogram remains the same.
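The whole procedure (PMF, CDF, scaling, remapping) is easy to express in code. Below is a minimal sketch in Python with NumPy; it rounds the scaled CDF down, as the worked example does, and the function name is illustrative.

import numpy as np

def histogram_equalize(image, levels=256):
    """Equalize a grayscale image: histogram -> PMF -> CDF -> scale -> remap."""
    hist = np.bincount(image.ravel(), minlength=levels)
    pmf = hist / image.size                     # probability mass function
    cdf = np.cumsum(pmf)                        # cumulative distribution function
    mapping = np.floor(cdf * (levels - 1)).astype(int)   # new gray level for each old one
    return mapping[image]                       # remap every pixel through the table

# Usage on a small 3 bpp-style image (8 gray levels):
img = np.array([[0, 1, 1, 2],
                [2, 2, 3, 3],
                [3, 3, 3, 4]])
print(histogram_equalize(img, levels=8))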

DIP – Brightness and Contrast

Brightness

Brightness is a relative term. It depends on your visual perception. Since brightness is a relative term, brightness can be defined as the amount of energy output by a source of light relative to the source we are comparing it to. In some cases we can easily say that an image is bright, and in other cases it is not easy to perceive.

For example

Just have a look at both of these images and compare which one is brighter. We can easily see that the image on the right side is brighter as compared to the image on the left. But if the image on the right is made darker than the first one, then we can say that the image on the left is brighter than the one on the right.

How to make an image brighter

Brightness can simply be increased or decreased by simple addition or subtraction on the image matrix. Consider this black image of 5 rows and 5 columns. Since we already know that every image has a matrix behind it that contains the pixel values, the matrix of this image is given below.

0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0

Since the whole matrix is filled with zeros, the image is completely dark. Now we will compare it with another identical black image (image 2) to see whether this image gets brighter or not. Still, both images look the same. To make image 1 brighter, we will simply add a value of 5 to each value of its matrix. After adding, image 1 would look much the same. Now we will again compare it with image 2, and see if there is any difference. We see that we still cannot tell which image is brighter, as both images look the same.

Now what we will do is add 50 to each value of the matrix of image 1 and see what the image has become. Comparing it again with image 2, you can now see that image 1 is slightly brighter than image 2. We go on and add another 45 to the matrix of image 1, and this time we compare both images again. Now when you compare them, you can see that image 1 is clearly brighter than image 2. It is even brighter than the old image 1. At this point the matrix of image 1 contains 100 at each index, as we first added 5, then 50, then 45, and 5 + 50 + 45 = 100.

Contrast

Contrast can simply be explained as the difference between the maximum and minimum pixel intensity in an image. For example, consider the final image 1 from the brightness example. The matrix of this image is:

100 100 100 100 100
100 100 100 100 100
100 100 100 100 100
100 100 100 100 100
100 100 100 100 100

The maximum value in this matrix is 100. The minimum value in this matrix is 100.

Contrast = maximum pixel intensity - minimum pixel intensity = 100 - 100 = 0

0 means that this image has 0 contrast.
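One practical caveat: adding to a plain 8-bit array can overflow (250 + 50 wraps around to 44), so a real implementation should clip the result to the valid range. Here is a minimal Python/NumPy sketch of the brightness adjustment described above; the function name is illustrative.

import numpy as np

def adjust_brightness(image, value):
    """Add `value` to every pixel, clipping to the 8-bit range [0, 255]."""
    shifted = image.astype(int) + value     # widen first to avoid uint8 wrap-around
    return np.clip(shifted, 0, 255).astype(np.uint8)

img = np.zeros((5, 5), dtype=np.uint8)      # the all-zero black image above
img = adjust_brightness(img, 5)             # barely any visible change
img = adjust_brightness(img, 50)            # slightly brighter
img = adjust_brightness(img, 45)            # now every pixel is 100
print(int(img.max()) - int(img.min()))      # contrast = 100 - 100 = 0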

DIP – Computer Vision and Graphics

Computer Vision

Computer vision is concerned with modeling and replicating human vision using computer software and hardware. Formally, computer vision is a discipline that studies how to reconstruct, interpret and understand a 3D scene from its 2D images, in terms of the properties of the structures present in the scene. It draws on knowledge from the following fields in order to understand and simulate the operation of the human vision system:

Computer Science
Electrical Engineering
Mathematics
Physiology
Biology
Cognitive Science

Computer vision hierarchy

Computer vision is divided into three basic categories:

Low-level vision: includes processing images for feature extraction.
Intermediate-level vision: includes object recognition and 3D scene interpretation.
High-level vision: includes conceptual description of a scene, such as activity, intention and behavior.

Related fields

Computer vision overlaps significantly with the following fields:

Image Processing: focuses on image manipulation.
Pattern Recognition: studies various techniques to classify patterns.
Photogrammetry: is concerned with obtaining accurate measurements from images.

Computer vision vs image processing

Image processing studies image-to-image transformations: the input and output of image processing are both images. Computer vision is the construction of explicit, meaningful descriptions of physical objects from their images. The output of computer vision is a description or an interpretation of the structures in a 3D scene.

Example applications

Robotics
Medicine
Security
Transportation
Industrial automation

Robotics applications

Localization: determine the robot's location automatically
Navigation
Obstacle avoidance
Assembly (peg-in-hole, welding, painting)
Manipulation (e.g. the PUMA robot manipulator)
Human Robot Interaction (HRI): intelligent robots that interact with and serve people

Medicine applications

Classification and detection (e.g. lesion or cell classification and tumor detection)
2D/3D segmentation
3D human organ reconstruction (MRI or ultrasound)
Vision-guided robotic surgery

Industrial automation applications

Industrial inspection (defect detection)
Assembly
Barcode and package label reading
Object sorting
Document understanding (e.g. OCR)

Security applications

Biometrics (iris, fingerprint, face recognition)
Surveillance: detecting certain suspicious activities or behaviors

Transportation applications

Autonomous vehicles
Safety, e.g. driver vigilance monitoring

Computer Graphics

Computer graphics are graphics created using computers: the representation of image data by a computer, specifically with help from specialized graphics hardware and software. Formally, we can say that computer graphics is the creation, manipulation and storage of geometric objects (modeling) and of their images (rendering).

The field of computer graphics developed with the emergence of computer graphics hardware. Today computer graphics is used in almost every field, and many powerful tools have been developed to visualize data. The field became more popular when companies started using it in video games. Today it is a multibillion dollar industry and the main driving force behind computer graphics development.

Some common application areas are the following:

Computer Aided Design (CAD)
Presentation graphics
3D animation
Education and training
Graphical user interfaces

Computer Aided Design

Used in the design of buildings, automobiles, aircraft and many other products
Used to make virtual reality systems

Presentation graphics

Commonly used to summarize financial and statistical data
Used to generate slides

3D animation

Used heavily in the movie industry by companies such as Pixar and DreamWorks
Used to add special effects in games and movies

Education and training

Computer-generated models of physical systems
Medical visualization: 3D MRI, dental and bone scans
Simulators for the training of pilots, etc.

Graphical user interfaces

Used to make graphical user interface objects like buttons, icons and other components

DIP – Histogram Sliding

The basic concept of histograms has been discussed in the tutorial Introduction to Histograms, but we will briefly introduce histograms here.

Histogram

A histogram is nothing but a graph that shows the frequency of occurrence of data. Histograms have many uses in image processing, out of which we are going to discuss one use here, called histogram sliding.

Histogram sliding

In histogram sliding, we simply shift a complete histogram rightwards or leftwards. Due to the shifting or sliding of the histogram towards the right or left, a clear change can be seen in the image. In this tutorial we are going to use histogram sliding to manipulate brightness. Brightness has been discussed in our tutorial on introduction to brightness and contrast, but we will briefly define it here.

Brightness

Brightness is a relative term. Brightness can be defined as the intensity of light emitted by a particular light source.

Contrast

Contrast can be defined as the difference between the maximum and minimum pixel intensity in an image.

Sliding histograms

Increasing brightness using histogram sliding

The histogram of our sample image is shown below. On the y axis of this histogram is the frequency, or count, and on the x axis we have the gray level values. As you can see from the histogram, the gray level intensities whose count is more than 700 lie in the first half, towards the blacker portion. That's why we got an image that is a bit darker.

In order to brighten it, we will slide its histogram towards the right, or towards the whiter portion. To do so, we need to add at least a value of 50 to this image. We can see from the histogram above that this image also has pixel intensities of 0, which are pure black; so if we add 50, we will shift all the values lying at intensity 0 to intensity 50, and all the rest of the values will be shifted accordingly. Let's do it.

Here is what we get after adding 50 to each pixel intensity. The image is shown below, and its histogram is shown below it. Let's compare these two images and their histograms to see what change we have got.

Conclusion

As we can clearly see from the new histogram, all the pixel values have been shifted towards the right, and its effect can be seen in the new image.

Decreasing brightness using histogram sliding

Now if we were to decrease the brightness of this new image to such an extent that the old image looks brighter, we have to subtract some value from the whole matrix of the new image. The value we are going to subtract is 80, because we already added 50 to the original image to get a new, brighter image; if we now want to make it darker than the original, we have to subtract more than 50 from it. And this is what we get after subtracting 80 from the new image.

Conclusion

It is clear from the histogram of the new image that all the pixel values have been shifted towards the left, and thus it can be validated from the image that the new image is darker, and now the original image looks brighter as compared to this new image.
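Histogram sliding is the same pixel-wise addition or subtraction seen in the brightness tutorial, viewed through its effect on the histogram. Here is a minimal sketch in Python with NumPy, assuming an 8-bit grayscale image; note that clipping at 0 and 255 means an extreme slide piles pixels up at the ends of the range.

import numpy as np

def slide_histogram(image, shift):
    """Slide the histogram right (shift > 0) or left (shift < 0)."""
    shifted = image.astype(int) + shift
    return np.clip(shifted, 0, 255).astype(np.uint8)

img = np.random.randint(0, 100, size=(64, 64), dtype=np.uint8)   # a darkish image
brighter = slide_histogram(img, 50)      # histogram slides 50 levels to the right
darker = slide_histogram(brighter, -80)  # histogram slides 80 levels to the left

# The most frequent gray level moves with the slide:
print(np.bincount(img.ravel(), minlength=256).argmax())
print(np.bincount(brighter.ravel(), minlength=256).argmax())   # previous value + 50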