DIP – Optical Character Recognition

Optical Character Recognition

Optical character recognition is usually abbreviated as OCR. It involves the mechanical and electrical conversion of scanned images of handwritten or typewritten text into machine-encoded text. It is a common method of digitizing printed texts so that they can be electronically searched, stored more compactly, displayed online, and used in machine processes such as machine translation, text to speech, and text mining.

In recent years, OCR technology has been applied throughout the entire spectrum of industries, revolutionizing the document management process. OCR has enabled scanned documents to become more than just image files, turning them into fully searchable documents with text content that is recognized by computers. With the help of OCR, people no longer need to manually retype important documents when entering them into electronic databases. Instead, OCR extracts the relevant information and enters it automatically. The result is accurate, efficient information processing in less time.

Optical character recognition has many application areas, the most common of which are the following.

Banking

The uses of OCR vary across different fields. One widely known application is in banking, where OCR is used to process checks without human involvement. A check can be inserted into a machine, the writing on it is scanned instantly, and the correct amount of money is transferred. This technology has nearly been perfected for printed checks and is fairly accurate for handwritten checks as well, though it occasionally requires manual confirmation. Overall, this reduces wait times in many banks.

Blind and visually impaired persons

One of the major motivations at the start of OCR research was the desire to build a computer or device that could read books aloud to blind people. In the course of this research, scientists developed the flatbed scanner, most commonly known to us today as the document scanner.

Legal department

In the legal industry, there has also been a significant movement to digitize paper documents. In order to save space and eliminate the need to sift through boxes of paper files, documents are being scanned and entered into computer databases. OCR further simplifies the process by making documents text-searchable, so that they are easier to locate and work with once in the database. Legal professionals now have fast, easy access to a huge library of documents in electronic format, which they can find simply by typing in a few keywords.

Retail industry

Barcode recognition technology is also related to OCR. We see this technology in everyday use.

Other uses

OCR is widely used in many other fields, including education, finance, and government agencies. OCR has made countless texts available online, saving money for students and allowing knowledge to be shared. Invoice imaging applications are used in many businesses to keep track of financial records and prevent a backlog of payments from piling up. In government agencies and independent organizations, OCR simplifies data collection and analysis, among other processes. As the technology continues to develop, more and more applications are found for OCR, including increased use of handwriting recognition.

DIP – Concept of Convolution

Concept of Convolution

This tutorial is about one of the most important concepts in signals and systems: convolution. We will discuss it in full. What is it? Why do we need it? What can we achieve with it? We will start discussing convolution from the basics of image processing.

What is image processing?

As we discussed in the introduction to image processing and in the signals and systems tutorial, image processing is more or less the study of signals and systems, because an image is nothing but a two dimensional signal. We have also discussed that in image processing we develop a system whose input is an image and whose output is also an image. The box labeled "Digital Image Processing system" in that diagram can be thought of as a black box.

Where have we reached until now?

So far we have discussed two important methods of manipulating images; in other words, our black box has worked in two different ways so far.

Graphs (histograms): this method is known as histogram processing. We discussed it in detail in previous tutorials for increasing contrast, image enhancement, brightness, etc.

Transformation functions: this method is known as transformations, in which we discussed different types of transformations, including some gray level transformations.

Another way of dealing with images

Here we are going to discuss another method of dealing with images, known as convolution. Usually the black box (system) used for image processing is an LTI system, or linear time invariant system. By linear we mean a system whose output is always linear, neither logarithmic nor exponential nor anything else. By time invariant we mean a system that remains the same over time. We are now going to use this third method. It can be mathematically represented in two ways:

g(x,y) = h(x,y) * f(x,y)

which can be read as "mask convolved with an image", or

g(x,y) = f(x,y) * h(x,y)

which can be read as "image convolved with mask". There are two ways to write it because the convolution operator (*) is commutative. Here h(x,y) is the mask or filter.

What is a mask?

A mask is also a signal. It can be represented by a two dimensional matrix, usually of order 1x1, 3x3, 5x5, or 7x7. A mask should always have odd dimensions; otherwise you cannot locate its center. Why do we need to find the center of the mask? The answer lies below, in the topic of how to perform convolution.

How to perform convolution?

In order to perform convolution on an image, the following steps should be taken (a code sketch appears at the end of this section):

Flip the mask (horizontally and vertically) only once.
Slide the mask onto the image.
Multiply the corresponding elements and then add them.
Repeat this procedure until all values of the image have been calculated.

Example of convolution

Step 1 is to flip the mask.

Mask: let's take our mask to be this:

1 2 3
4 5 6
7 8 9

Flipping the mask horizontally:

3 2 1
6 5 4
9 8 7

Flipping the mask vertically:

9 8 7
6 5 4
3 2 1

Image: let's consider an image like this:

2 4 6
8 10 12
14 16 18

Convolution: to convolve the mask over the image, place the center of the mask at each element of the image, multiply the corresponding elements, then add them, and paste the result onto the element of the image on which the center of the mask was placed. In the accompanying figure, the box in red is the mask, the values in orange are the values of the mask, and the black boxes and values belong to the image. For the first pixel of the image, with neighbors lying outside the image treated as zero, the value is calculated as:

First pixel = (5*2) + (4*4) + (2*8) + (1*10) = 10 + 16 + 16 + 10 = 52

Place 52 in the resulting image at the first index and repeat this procedure for each pixel of the image.

Why convolution?

Convolution can achieve something that the previous two methods of manipulating images cannot: blurring, sharpening, edge detection, noise reduction, etc.
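The procedure above maps directly onto code. Below is a minimal NumPy sketch (assuming NumPy is available; convolve2d here is our own helper written for illustration) that flips the mask once, slides it over a zero-padded image, and multiplies and adds at every position. In practice, a library routine such as scipy.signal.convolve2d performs the same operation.

```python
import numpy as np

def convolve2d(image, mask):
    """Convolve a 2D image with an odd-sized 2D mask, zero-padding the borders."""
    mask = np.flipud(np.fliplr(mask))          # flip vertically and horizontally, once
    m, n = mask.shape
    pad_y, pad_x = m // 2, n // 2              # half-widths from the mask's center
    padded = np.pad(image, ((pad_y, pad_y), (pad_x, pad_x)), mode="constant")
    out = np.zeros(image.shape, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            region = padded[y:y + m, x:x + n]  # neighborhood under the mask
            out[y, x] = np.sum(region * mask)  # multiply corresponding elements, add
    return out

mask  = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
image = np.array([[2, 4, 6], [8, 10, 12], [14, 16, 18]])
print(convolve2d(image, mask)[0, 0])           # 52.0, matching the worked example
```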

DIP – Prewitt Operator

Prewitt Operator

The Prewitt operator is used for edge detection in an image. It detects two types of edges:

Horizontal edges
Vertical edges

Edges are calculated by using the difference between corresponding pixel intensities of an image. All the masks that are used for edge detection are also known as derivative masks. Because, as we have stated many times before in this series of tutorials, an image is also a signal, changes in a signal can only be calculated using differentiation. That is why these operators are also called derivative operators or derivative masks.

All derivative masks should have the following properties:

Opposite signs should be present in the mask.
The sum of the mask should be equal to zero.
More weight means more edge detection.

The Prewitt operator provides us two masks, one for detecting edges in the horizontal direction and another for detecting edges in the vertical direction.

Vertical direction:

-1 0 1
-1 0 1
-1 0 1

The above mask finds edges in the vertical direction, because the column of zeros lies in the vertical direction. When you convolve this mask with an image, it gives you the vertical edges in the image.

How it works: when we apply this mask to the image, it makes vertical edges prominent. It works like a first order derivative, calculating the difference of pixel intensities in an edge region. Since the center column is zero, it does not include the original values of the image but rather calculates the difference of the pixel values to the right and left around the edge. This increases the edge intensity, so the edge appears enhanced compared to the original image.

Horizontal direction:

-1 -1 -1
 0  0  0
 1  1  1

The above mask finds edges in the horizontal direction, because the row of zeros lies in the horizontal direction. When you convolve this mask with an image, it makes horizontal edges prominent.

How it works: this mask makes the horizontal edges in an image prominent. It works on the same principle as the mask above, calculating the difference among the pixel intensities of a particular edge. Since the center row of the mask consists of zeros, it does not include the original edge values of the image but rather calculates the difference of the pixel intensities above and below the edge. This amplifies the sudden change of intensity and makes the edge more visible.

Both of the above masks follow the principle of a derivative mask: both contain opposite signs and both sum to zero. The third condition does not apply to this operator, as both masks are standardized and we cannot change the values in them.

Now it is time to see these masks in action (a code sketch appears at the end of this section).

Sample image: the following is a sample picture to which we apply the two masks, one at a time.

After applying the vertical mask: after applying the vertical mask to the sample image, the resulting image contains the vertical edges. You can judge it more accurately by comparing it with the horizontal edges picture.

After applying the horizontal mask: after applying the horizontal mask to the sample image, the resulting image contains the horizontal edges.

Comparison: as you can see, in the first picture, to which the vertical mask was applied, all the vertical edges are more visible than in the original image. Similarly, in the second picture the horizontal mask was applied and, as a result, all the horizontal edges are visible. In this way, both horizontal and vertical edges can be detected in an image.
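As a rough sketch of the operator in action (assuming NumPy and SciPy are installed; the tiny test image is made up for illustration), the two masks can be convolved with an image containing a single vertical edge. The vertical mask responds strongly, while the horizontal mask stays at zero:

```python
import numpy as np
from scipy.signal import convolve2d

# The two Prewitt masks given above
prewitt_vertical   = np.array([[-1, 0, 1],
                               [-1, 0, 1],
                               [-1, 0, 1]])
prewitt_horizontal = np.array([[-1, -1, -1],
                               [ 0,  0,  0],
                               [ 1,  1,  1]])

# A test image with one vertical edge: dark left half, bright right half
image = np.array([[10, 10, 200, 200],
                  [10, 10, 200, 200],
                  [10, 10, 200, 200],
                  [10, 10, 200, 200]], dtype=float)

vertical_edges   = convolve2d(image, prewitt_vertical,   mode="same", boundary="symm")
horizontal_edges = convolve2d(image, prewitt_horizontal, mode="same", boundary="symm")
print(np.abs(vertical_edges).max())    # large: the vertical edge is detected
print(np.abs(horizontal_edges).max())  # zero: no horizontal edge is present
```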

DIP – Convolution Theorem

Convolution Theorem

In the last tutorial, we discussed images in the frequency domain. In this tutorial, we are going to define the relationship between the frequency domain and the spatial domain (the images themselves).

For example, consider an image and the same image represented in the frequency domain. What is the relationship between the image, or spatial domain, and the frequency domain? This relationship is explained by a theorem called the convolution theorem.

Convolution theorem

The relationship between the spatial domain and the frequency domain is established by the convolution theorem. It states that convolution in the spatial domain is equal to filtering (multiplication) in the frequency domain, and vice versa:

g(x,y) = h(x,y) * f(x,y)   <=>   G(u,v) = H(u,v) F(u,v)

The steps in frequency domain filtering are given below; a code sketch of the full pipeline appears at the end of this section.

First, we optionally pre-process the image in the spatial domain, for example by increasing its contrast or brightness.
Then we take the discrete Fourier transform of the image.
Then we center the discrete Fourier transform, i.e., we bring it to the center from the corners.
Then we apply filtering, i.e., we multiply the Fourier transform by a filter function.
Then we shift the DFT back from the center to the corners.
The last step is to take the inverse discrete Fourier transform, bringing the result back from the frequency domain to the spatial domain.
A post-processing step, like pre-processing, is optional and just improves the appearance of the image.

Filters

The concept of a filter in the frequency domain is the same as the concept of a mask in convolution. After converting an image to the frequency domain, filters are applied in the filtering process to perform different kinds of processing on the image, such as blurring or sharpening. The common types of filters for these purposes are:

Ideal high pass filter
Ideal low pass filter
Gaussian high pass filter
Gaussian low pass filter

In the next tutorial, we will discuss filters in detail.
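To make the pipeline concrete, here is a minimal NumPy sketch (our own illustration, assuming a 2D grayscale array; the ideal low pass filter is built inline) that follows exactly the steps listed above: transform, center, multiply by a filter, shift back, and inverse transform.

```python
import numpy as np

def ideal_lowpass_filter(image, cutoff):
    """Blur an image by multiplying its centered DFT with an ideal low pass filter."""
    F = np.fft.fft2(image)              # discrete Fourier transform of the image
    F = np.fft.fftshift(F)              # bring the DFT to the center from the corners
    rows, cols = image.shape
    y, x = np.ogrid[:rows, :cols]
    dist = np.sqrt((y - rows / 2) ** 2 + (x - cols / 2) ** 2)
    H = (dist <= cutoff).astype(float)  # ideal low pass: 1 inside the cutoff radius
    F = F * H                           # filtering = multiplication in frequency domain
    F = np.fft.ifftshift(F)             # shift the DFT back from center to corners
    return np.real(np.fft.ifft2(F))     # inverse DFT: back to the spatial domain

image = np.random.rand(64, 64)          # stand-in for a grayscale image
blurred = ideal_lowpass_filter(image, cutoff=10)
```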

DIP – Introduction to Color Spaces

Introduction to Color Spaces

In this tutorial, we are going to talk about color spaces.

What are color spaces?

Color spaces are different types of color modes, used in image processing and in signals and systems for various purposes. Some of the common color spaces are:

RGB
CMYK
Y'UV
YIQ
Y'CbCr
HSV

RGB

RGB is the most widely used color space, and we have already discussed it in past tutorials. RGB stands for red, green, and blue. The RGB model states that each color image is actually formed of three different images: a red image, a green image, and a blue image. A normal grayscale image can be defined by only one matrix, but a color image is actually composed of three different matrices.

One color image matrix = red matrix + green matrix + blue matrix

Applications of RGB: the common applications of the RGB model are cathode ray tube (CRT) displays, liquid crystal displays (LCD), plasma or LED displays such as televisions, and computer monitors or large scale screens.

CMYK

RGB to CMY conversion: the conversion from RGB to CMY is done as follows. Consider a color image, meaning you have three different arrays of red, green, and blue. To convert it into CMY, subtract each channel from the maximum number of levels minus 1 (for an 8-bit image, 255). Each matrix is subtracted and its respective CMY matrix is filled with the result (a code sketch appears at the end of this section).

Y'UV

Y'UV defines a color space in terms of one luma (Y') and two chrominance (UV) components. The Y'UV color model is used in the following composite color video standards:

NTSC (National Television System Committee)
PAL (Phase Alternating Line)
SECAM (Séquentiel couleur à mémoire, French for "sequential color with memory")

Y'CbCr

The Y'CbCr color model contains Y', the luma component, while Cb and Cr are the blue-difference and red-difference chroma components. It is not an absolute color space. It is mainly used for digital systems; its common applications include JPEG and MPEG compression. Y'UV is often used as a term for Y'CbCr, but they are different formats: the main difference is that the former is analog while the latter is digital.
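As a small illustration of the RGB to CMY rule described above (a sketch assuming 8-bit channels, i.e., 256 levels):

```python
import numpy as np

def rgb_to_cmy(rgb, levels=256):
    """Convert RGB values to CMY by subtracting each channel from levels - 1."""
    return (levels - 1) - rgb

pixel = np.array([255, 0, 0])   # pure red
print(rgb_to_cmy(pixel))        # [  0 255 255]: no cyan, full magenta and yellow
```

Magenta and yellow inks together absorb green and blue light, which is why their combination reproduces red on paper.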

DIP – Fourier Series and Transform

Fourier Series and Transform

In the last tutorial of frequency domain analysis, we discussed that the Fourier series and the Fourier transform are used to convert a signal into the frequency domain.

Fourier

Fourier was a French mathematician; in 1822 he published the work that gave us the Fourier series and the Fourier transform, which convert a signal into the frequency domain.

Fourier series

The Fourier series states that periodic signals can be represented as a sum of sines and cosines, each multiplied by a certain weight. It further states that periodic signals can be broken down into further signals with the following properties:

The signals are sines and cosines.
The signals are harmonics of each other.

Pictorially, the last signal in such a decomposition figure is the sum of all the signals above it. This was Fourier's idea.

How it is calculated

As we have seen in the frequency domain, in order to process an image in the frequency domain we first need to convert it into the frequency domain, and we have to take the inverse of the output to convert it back into the spatial domain. That is why both the Fourier series and the Fourier transform have two formulas: one for the conversion, and one for converting back to the spatial domain.

For a periodic signal f(t) with period T, the Fourier series (synthesis) formula is:

f(t) = Σ (n = -∞ to ∞) c_n e^(j 2π n t / T)

and the coefficients (the analysis, or inverse, direction) are calculated as:

c_n = (1/T) ∫ over one period of f(t) e^(-j 2π n t / T) dt

Fourier transform

The Fourier transform states that non-periodic signals whose area under the curve is finite can also be represented as integrals of sines and cosines after being multiplied by a certain weight. The Fourier transform has many wide applications, including image compression (e.g., JPEG compression), filtering, and image analysis.

Difference between Fourier series and transform

Although both the Fourier series and the Fourier transform were given by Fourier, the difference between them is that the Fourier series is applied to periodic signals, while the Fourier transform is applied to non-periodic signals.

Which one is applied to images?

Now the question is which one is applied to images, the Fourier series or the Fourier transform. The answer lies in what images are. Images are non-periodic, and since they are non-periodic, the Fourier transform is used to convert them into the frequency domain.

Discrete Fourier transform

Since we are dealing with images, in fact digital images, we work with the discrete Fourier transform. Consider the Fourier term of a sinusoid: it includes three things.

Spatial frequency
Magnitude
Phase

The spatial frequency relates to the rate at which brightness varies across the image. The magnitude of the sinusoid relates to the contrast, the difference between the maximum and minimum pixel intensity. The phase carries the positional information, i.e., where the structures of the image sit.

The formula for the two dimensional discrete Fourier transform is given below:

F(u,v) = Σ (x = 0 to M-1) Σ (y = 0 to N-1) f(x,y) e^(-j 2π (ux/M + vy/N))

The discrete Fourier transform is actually the sampled Fourier transform, so it contains samples that denote an image. In the above formula, f(x,y) denotes the image and F(u,v) denotes its discrete Fourier transform. The formula for the two dimensional inverse discrete Fourier transform is given below:

f(x,y) = (1/(MN)) Σ (u = 0 to M-1) Σ (v = 0 to N-1) F(u,v) e^(j 2π (ux/M + vy/N))

The inverse discrete Fourier transform converts the Fourier transform back to the image.

Now we will look at an image, calculate its FFT magnitude spectrum, then the shifted FFT magnitude spectrum, and finally the log of that shifted spectrum.
(Figures: the original image; its Fourier transform magnitude spectrum; the shifted Fourier transform; and the log of the shifted magnitude spectrum.)
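The same sequence can be reproduced with NumPy and Matplotlib. A minimal sketch, assuming a grayscale image stored in a file named einstein.png (a hypothetical filename used here for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

image = plt.imread("einstein.png")      # hypothetical grayscale input file
if image.ndim == 3:
    image = image.mean(axis=2)          # collapse channels to grayscale if needed

F         = np.fft.fft2(image)          # 2D discrete Fourier transform
magnitude = np.abs(F)                   # FFT magnitude spectrum
shifted   = np.fft.fftshift(magnitude)  # move the zero-frequency term to the center
log_spec  = np.log1p(shifted)           # log scaling makes faint components visible

panels = [("Original", image), ("Magnitude", magnitude),
          ("Shifted", shifted), ("Log of shifted", log_spec)]
for i, (title, img) in enumerate(panels, start=1):
    plt.subplot(1, 4, i)
    plt.imshow(img, cmap="gray")
    plt.title(title)
    plt.axis("off")
plt.show()
```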

DIP – Concept of Dithering

Concept of Dithering

In the last two tutorials, on quantization and contouring, we saw that reducing the gray levels of an image reduces the number of colors required to denote it. If the gray levels are reduced to 2, the resulting image has little spatial resolution and is not very appealing.

Dithering

Dithering is the process by which we create the illusion of colors that are not actually present. It is done by the random arrangement of pixels. For example, consider an image containing only black and white pixels. Its pixels can be arranged in an order that forms another image. Note that only the arrangement of the pixels has changed, not their quantity.

Why dithering?

Why do we need dithering? The answer lies in its relation to quantization.

Dithering with quantization

When we perform quantization to the last level (level 2), the resulting image is not very clear. Looking at such a two-level image of Einstein, for instance, the left arm and the back lose their detail, and the picture carries little information about the subject. To turn this image into one that gives more detail, we have to perform dithering.

Performing dithering

First of all, we work on thresholding. Dithering is usually used to improve the results of thresholding. During thresholding, sharp edges appear where the gradients in the image are smooth. In thresholding, we simply choose a constant value: all the pixels above that value are treated as 1 and all the values below it are treated as 0. The image obtained after thresholding changes little here, since its values are already 0 and 1, or black and white.

Now we perform some random dithering on it, i.e., a random rearrangement of pixels. We get an image that shows slightly more detail, but its contrast is very low. So we do some more dithering to increase the contrast. Finally, we mix the concepts of random dithering with thresholding, and we get a still better image. All of these images were obtained just by rearranging the pixels of an image; the rearrangement can be random or can follow some measure. (A code sketch of thresholding versus random dithering follows below.)
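Here is a minimal NumPy sketch of the two ideas just contrasted (our own illustration, assuming 8-bit gray values): plain thresholding against a constant, and random dithering, which compares each pixel against noise so that flat gray areas become a mixture of black and white dots.

```python
import numpy as np

def threshold(image, t=128):
    """Plain thresholding: pixels above t become white (255), the rest black (0)."""
    return np.where(image > t, 255, 0)

def random_dither(image):
    """Random dithering: compare each pixel against random noise instead of a
    constant, so mid-gray regions turn into a mix of black and white dots."""
    noise = np.random.randint(0, 256, size=image.shape)
    return np.where(image > noise, 255, 0)

gray = np.full((4, 4), 128)   # a flat mid-gray patch
print(threshold(gray))        # a single value everywhere: the gray level is lost
print(random_dither(gray))    # roughly half white, half black: gray is simulated
```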

DIP – Pixels, Dots and Lines Per Inch

Pixels, Dots and Lines Per Inch

In the previous tutorial on spatial resolution, we gave a brief introduction to PPI, DPI, and LPI. Now we are going to discuss each of them formally.

Pixels per inch

Pixel density, or pixels per inch, is a measure of the spatial resolution of devices such as tablets and mobile phones. The higher the PPI, the higher the quality. To understand how it is calculated, let's compute the PPI of a mobile phone.

Calculating the pixels per inch (PPI) of the Samsung Galaxy S4: the Samsung Galaxy S4 has a PPI, or pixel density, of 441. But how is this calculated? First we use the Pythagorean theorem to calculate the diagonal resolution in pixels:

c = sqrt(a^2 + b^2)

where a and b are the height and width resolutions in pixels and c is the diagonal resolution in pixels. For the Samsung Galaxy S4, the resolution is 1080 x 1920 pixels, so putting those values into the equation gives:

c = 2202.90717

Now we calculate the PPI:

PPI = c / diagonal size in inches

The diagonal size of the Samsung Galaxy S4 is 5.0 inches, which can be confirmed from its specifications.

PPI = 2202.90717 / 5.0
PPI = 440.58
PPI = 441 (approx)

That means the pixel density of the Samsung Galaxy S4 is 441 PPI. (A code sketch of this calculation appears at the end of this section.)

Dots per inch

DPI is often confused with PPI, but there is a difference between the two. DPI, or dots per inch, is a measure of the spatial resolution of printers. In the case of printers, DPI means how many dots of ink are printed per inch when an image is printed out. Remember, it is not necessary that each pixel per inch is printed by one dot per inch; many dots per inch may be used to print one pixel. The reason is that most color printers use the CMYK model, in which the available colors are limited. The printer has to mix these colors to make the color of each pixel, whereas a computer can represent hundreds of thousands of colors directly. The higher the DPI of the printer, the higher the quality of the printed document or image. Typically, some laser printers have a DPI of 300 and some have 600 or more.

Lines per inch

While DPI refers to dots per inch, LPI refers to lines of dots per inch. The resolution of a halftone screen is measured in lines per inch. The following table shows the lines per inch capacity of some printers.

Printer                          LPI
Screen printing                  45-65 lpi
Laser printer (300 dpi)          65 lpi
Laser printer (600 dpi)          85-105 lpi
Offset press (newsprint paper)   85 lpi
Offset press (coated paper)      85-185 lpi
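The PPI calculation above is easy to verify in code. A minimal sketch (pixels_per_inch is our own helper, not a library function):

```python
import math

def pixels_per_inch(width_px, height_px, diagonal_inches):
    """PPI = diagonal resolution in pixels / diagonal size in inches."""
    diagonal_px = math.sqrt(width_px ** 2 + height_px ** 2)  # Pythagorean theorem
    return diagonal_px / diagonal_inches

# Samsung Galaxy S4: 1080 x 1920 pixels on a 5.0 inch diagonal
print(pixels_per_inch(1080, 1920, 5.0))         # 440.58...
print(round(pixels_per_inch(1080, 1920, 5.0)))  # 441
```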

DIP – Quick Guide

DIP – Quick Guide

Digital Image Processing Introduction

Introduction

Signal processing is a discipline in electrical engineering and mathematics that deals with the analysis and processing of analog and digital signals, and with storing, filtering, and other operations on signals. These signals include transmission signals, sound or voice signals, image signals, and others. Of all these signals, the field that deals with signals whose input is an image and whose output is also an image is image processing. As its name suggests, it deals with processing images. It can be further divided into analog image processing and digital image processing.

Analog image processing

Analog image processing is done on analog signals. It involves processing two dimensional analog signals. In this type of processing, images are manipulated by electrical means, by varying the electrical signal. A common example is the television image. Digital image processing has dominated over analog image processing with the passage of time due to its wider range of applications.

Digital image processing

Digital image processing deals with developing a digital system that performs operations on a digital image.

What is an image?

An image is nothing more than a two dimensional signal. It is defined by the mathematical function f(x,y), where x and y are the two coordinates, horizontal and vertical. The value of f(x,y) at any point gives the pixel value at that point of the image. An image that you view on your computer screen is nothing but a two dimensional array of numbers ranging between 0 and 255, for example:

128  30 123 232
123 231 123  77
 89  80 255 255

Each number represents the value of the function f(x,y) at a point; here the values 128, 30, and 123 each represent an individual pixel value. The dimensions of the picture are actually the dimensions of this two dimensional array.

Relationship between a digital image and a signal

If an image is a two dimensional array, what does it have to do with a signal? To understand that, we need to first understand what a signal is.

Signal

In the physical world, any quantity measurable through time, over space, or over any higher dimension can be taken as a signal. A signal is a mathematical function that conveys some information. A signal can be one dimensional, two dimensional, or higher dimensional. A one dimensional signal is measured over time; the common example is a voice signal. Two dimensional signals are measured over other physical quantities; the example of a two dimensional signal is a digital image. We will look in more detail in the next tutorial at how one dimensional, two dimensional, and higher dimensional signals are formed and interpreted.

Relationship

Anything that conveys information or broadcasts a message between two observers in the physical world is a signal. That includes speech (the human voice) and images. When we speak, our voice is converted into a sound wave, a signal that varies with respect to time as it travels to the person we are speaking to. Likewise, in the way a digital camera works, acquiring an image involves the transfer of a signal from one part of the system to another.
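Before looking at how such an image is formed, here is a minimal NumPy sketch of the array view just described (the values are the example matrix above):

```python
import numpy as np

# A grayscale digital image is just a 2D array of intensities between 0 and 255.
image = np.array([[128,  30, 123, 232],
                  [123, 231, 123,  77],
                  [ 89,  80, 255, 255]], dtype=np.uint8)

print(image[0, 0])   # 128: the value of f(x, y) at the first pixel
print(image.shape)   # (3, 4): the dimensions of the image are those of the array
```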
How a digital image is formed

Capturing an image with a camera is a physical process. Sunlight is used as a source of energy, and a sensor array is used for the acquisition of the image. When sunlight falls upon an object, the amount of light reflected by that object is sensed by the sensors, and a continuous voltage signal is generated according to the amount of sensed light. In order to create a digital image, we need to convert this data into digital form. This involves sampling and quantization (discussed later on). The result of sampling and quantization is a two dimensional array, or matrix, of numbers, which is nothing but a digital image.

Overlapping fields

Machine/computer vision: machine vision or computer vision deals with developing a system in which the input is an image and the output is some information. For example: developing a system that scans a human face and opens a lock.

Computer graphics: computer graphics deals with the formation of images from object models, rather than images captured by a device. For example: object rendering, i.e., generating an image from an object model.

Artificial intelligence: artificial intelligence is more or less the study of putting human intelligence into machines. It has many applications in image processing, for example developing computer-aided diagnosis systems that help doctors interpret X-ray and MRI images, highlighting conspicuous sections to be examined by the doctor.

Signal processing: signal processing is an umbrella term, and image processing lies under it. The amount of light reflected by an object in the physical (3D) world passes through the lens of the camera, becomes a 2D signal, and hence results in image formation. This image is then digitized using methods of signal processing, and the resulting digital image is manipulated in digital image processing.

Signals and Systems Introduction

This tutorial covers the basics of signals and systems necessary for understanding the concepts of digital image processing. Before going into detailed concepts, let's first define some simple terms.

Signals

In electrical engineering, the fundamental quantity representing some information is called a signal. It does not matter whether the information is analog or digital. In mathematics, a signal is a function