
This revised and expanded new edition of an internationally successful classic presents an accessible introduction to the key methods in digital image processing for both practitioners and teachers. Emphasis is placed on practical application, presenting precise algorithmic descriptions at an unusually high level of detail, while highlighting direct connections between the mathematical foundations and concrete implementation.
The text is supported by practical examples and carefully constructed chapter-ending exercises drawn from the authors' years of teaching experience, including easily adaptable Java code and completely worked-out examples. Source code, test images, and additional instructor materials are also provided at an associated website. Digital Image Processing is the definitive textbook for students, researchers, and professionals in search of critical analysis and modern implementations of the most important algorithms in the field, and is also eminently suitable for self-study.

### 1. Digital Images

For a long time, using a computer to manipulate a digital image (i.e., digital image processing) was something performed by only a relatively small group of specialists who had access to expensive equipment. Usually this combination of specialists and equipment was only to be found in research labs, and so the field of digital image processing has its roots in the academic realm. Now, however, the combination of a powerful computer on every desktop and the fact that nearly everyone has some type of device for digital image acquisition, be it their cell phone camera, digital camera, or scanner, has resulted in a plethora of digital images, and with that, digital image processing has become as common as word processing for many people. It was not that many years ago that digitizing a photo and saving it to a file on a computer was a time-consuming task. This is perhaps difficult to imagine given today’s powerful hardware and operating system level support for all types of digital media, but it is always sobering to remember that “personal” computers in the early 1990s were not powerful enough to even load into main memory a single image from a typical digital camera of today. Now powerful hardware and software packages have made it possible for amateurs to manipulate digital images and videos just as easily as professionals.
Wilhelm Burger, Mark J. Burge

### 2. ImageJ

Until a few years ago, the image-processing community was a relatively small group of people who either had access to expensive commercial image-processing tools or, out of necessity, developed their own software packages. Usually such home-brew environments started out with small software components for loading and storing images from and to disk files. This was not always easy because often one had to deal with poorly documented or even proprietary file formats. An obvious (and frequent) solution was to simply design a new image file format from scratch, usually optimized for a particular field, application, or even a single project, which naturally led to a myriad of different file formats, many of which did not survive and are forgotten today [163, 168]. Nevertheless, writing software for converting between all these file formats in the 1980s and early 1990s was an important business that occupied many people. Displaying images on computer screens was similarly difficult, because there was only marginal support from operating systems, APIs, and display hardware, and capturing images or videos into a computer was close to impossible on common hardware. It thus may have taken many weeks or even months before one could do just elementary things with images on a computer and finally do some serious image processing.

### 3. Histograms and Image Statistics

Histograms are used to depict image statistics in an easily interpreted visual format. With a histogram, it is easy to detect certain types of problems in an image; for example, it is simple to judge whether an image is properly exposed by visual inspection of its histogram. In fact, histograms are so useful that modern digital cameras often provide a real-time histogram overlay on the viewfinder (Fig. 3.1) to help prevent taking poorly exposed pictures. It is important to catch errors like this at the image capture stage because poor exposure results in a permanent loss of information, which cannot be recovered later using image-processing techniques. In addition to their usefulness during image capture, histograms are also used later to improve the visual appearance of an image and as a “forensic” tool for determining what type of processing has previously been applied to an image. The final part of this chapter shows how to calculate simple image statistics from the original image, its histogram, or the so-called integral image.
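As a minimal illustration of the idea, the histogram of an 8-bit grayscale image is simply a table of 256 counts, and statistics such as the mean can then be computed from the histogram alone. The sketch below is in Python with illustrative function names; the book's own implementations are in Java/ImageJ.

```python
def histogram(img, levels=256):
    """Count how often each intensity value occurs in a grayscale image."""
    h = [0] * levels
    for row in img:
        for a in row:
            h[a] += 1
    return h

def mean_from_histogram(h):
    """The mean intensity can be derived from the histogram alone."""
    n = sum(h)
    return sum(i * c for i, c in enumerate(h)) / n
```

Note that the image itself is no longer needed once the histogram exists, which is exactly why histograms are convenient carriers of image statistics.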

### 4. Point Operations

Point operations perform a modification of the pixel values without changing the size, geometry, or local structure of the image. Each new pixel value $$b\;=\;I^\prime(u,v)$$ depends exclusively on the previous value $$a\;=\;I(u,v)$$ at the same position and is thus independent of any other pixel value, in particular of any of its neighboring pixels.
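Since each result depends only on the single value $$a$$, a point operation is fully specified by a mapping function applied per pixel. A minimal Python sketch (function names and the particular mapping are illustrative, not the book's code):

```python
def point_op(img, f):
    """Apply the mapping function f to every pixel independently."""
    return [[f(a) for a in row] for row in img]

def stretch(a, gain=1.5, bias=10):
    """Example mapping: linear contrast/brightness change, clamped to 8 bits."""
    return max(0, min(255, round(gain * a + bias)))
```

Inversion, thresholding, and gamma correction all fit this same pattern; only the mapping function changes.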

### 5. Filters

The essential property of point operations (discussed in the previous chapter) is that each new pixel value only depends on the original pixel at the same position. The capabilities of point operations are limited, however. For example, they cannot accomplish the task of sharpening or smoothing an image (Fig. 5.1). This is what filters can do. They are similar to point operations in the sense that they also produce a 1:1 mapping of the image coordinates, that is, the geometry of the image does not change.
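Smoothing is the simplest example of such a filter: each new pixel is computed from a small neighborhood rather than a single position. The following Python sketch of a 3×3 box filter (with borders handled by clamping coordinates) is illustrative only; the book develops filters in Java/ImageJ.

```python
def box_filter_3x3(img):
    """Smooth by averaging each pixel with its 8 neighbors (borders clamped)."""
    H, W = len(img), len(img[0])
    out = [[0] * W for _ in range(H)]
    for v in range(H):
        for u in range(W):
            s = 0
            for dv in (-1, 0, 1):
                for du in (-1, 0, 1):
                    s += img[min(max(v + dv, 0), H - 1)][min(max(u + du, 0), W - 1)]
            out[v][u] = s // 9
    return out
```

A point operation could never produce this result, because the output at each position uses values from neighboring pixels.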

### 6. Edges and Contours

Prominent image “events” originating from local changes in intensity or color, such as edges and contours, are of high importance for the visual perception and interpretation of images. The perceived amount of information in an image appears to be directly related to the distinctiveness of the contained structures and discontinuities. In fact, edge-like structures and contours seem to be so important for our human visual system that a few lines in a caricature or illustration are often sufficient to unambiguously describe an object or a scene. It is thus no surprise that the enhancement and detection of edges has been a traditional and important topic in image processing as well. In this chapter, we first look at simple methods for localizing edges and then attend to the related issue of image sharpening.
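The simplest edge localization measures local intensity change. As a minimal sketch (in Python, with central differences; the names are illustrative and this is not the book's code), edge strength and orientation at an interior pixel can be estimated as:

```python
import math

def gradient(img, u, v):
    """Edge strength and orientation from central differences (interior pixels only)."""
    dx = (img[v][u + 1] - img[v][u - 1]) / 2
    dy = (img[v + 1][u] - img[v - 1][u]) / 2
    return math.hypot(dx, dy), math.atan2(dy, dx)
```

Practical edge operators (Sobel, Prewitt, etc.) refine this idea by combining the difference with local smoothing to reduce noise sensitivity.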

### 7. Corner Detection

Corners are prominent structural elements in an image and are therefore useful in a wide variety of applications, including following objects across related images (tracking), determining the correspondence between stereo images, serving as reference points for precise geometrical measurements, and calibrating camera systems for machine vision applications. Thus corner points are not only important in human vision but they are also “robust” in the sense that they do not arise accidentally in 3D scenes and, furthermore, can be located quite reliably under a wide range of viewing angles and lighting conditions.

### 8. Finding Simple Curves: The Hough Transform

In Chapter 6 we demonstrated how to use appropriately designed filters to detect edges in images. These filters compute both the edge strength and orientation at every position in the image. In the following sections, we explain how to decide (e.g., by using a threshold operation on the edge strength) if a curve is actually present at a given image location. The result of this process is generally represented as a binary edge map. Edge maps are considered preliminary results, since with an edge filter’s limited (“myopic”) view it is not possible to accurately ascertain if a point belongs to a true edge. Edge maps created using simple threshold operations contain many edge points that do not belong to true edges (false positives), and, on the other hand, many edge points are not detected and hence are missing from the map (false negatives).
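Starting from such a binary edge map, the Hough transform copes with missing and spurious edge points by letting every edge point vote for all line parameters $$(\theta, r)$$ it could lie on; collinear points accumulate votes in the same cell. A compact Python sketch (the function name and the discretization of the accumulator are illustrative choices, not the book's implementation):

```python
import math

def hough_lines(edge_points, width, height, n_theta=180, n_r=100):
    """Let each edge point vote for all line parameters (theta, r) it could lie on."""
    r_max = math.hypot(width, height)
    acc = [[0] * n_r for _ in range(n_theta)]   # the accumulator array
    for u, v in edge_points:
        for t in range(n_theta):
            theta = t * math.pi / n_theta
            r = u * math.cos(theta) + v * math.sin(theta)   # in [-r_max, r_max]
            ri = round((r + r_max) / (2 * r_max) * (n_r - 1))
            acc[t][ri] += 1
    return acc
```

Peaks in the accumulator then indicate lines supported by many edge points, even if individual points are missing or wrong.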

### 9. Morphological Filters

In the discussion of the median filter in Chapter 5 (Sec. 5.4.2), we noticed that this type of filter can somehow alter 2D image structures. Figure 9.1 illustrates once more how corners are rounded off, holes of a certain size are filled, and small structures, such as single dots or thin lines, are removed. The median filter thus responds selectively to the local shape of image structures, a property that might be useful for other purposes if it can be applied not just randomly but in a controlled fashion. Altering the local structure in a predictable way is exactly what “morphological” filters can do, which we focus on in this chapter.
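The two basic morphological operations on binary images, dilation and erosion, can be sketched in a few lines of Python (a 3×3 square structuring element is assumed here; the names are illustrative and this is not the book's Java code):

```python
def dilate(img):
    """Binary dilation with a 3x3 square structuring element."""
    H, W = len(img), len(img[0])
    return [[int(any(img[v + dv][u + du]
                     for dv in (-1, 0, 1) for du in (-1, 0, 1)
                     if 0 <= v + dv < H and 0 <= u + du < W))
             for u in range(W)] for v in range(H)]

def erode(img):
    """Erosion via duality: the complement of dilating the complement."""
    inv = [[1 - a for a in row] for row in img]
    return [[1 - a for a in row] for row in dilate(inv)]
```

Combining the two yields opening and closing, which remove small structures or fill small holes in the controlled fashion described above.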

### 10. Regions in Binary Images

In a binary image, pixels can take on exactly one of two values. These values are often thought of as representing the “foreground” and “background” in the image, even though these concepts often are not applicable to natural scenes. In this chapter we focus on connected regions in images and how to isolate and describe such structures.
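One standard way to isolate connected regions is flood-fill labeling: scan the image, and whenever an unlabeled foreground pixel is found, flood its entire 4-connected region with a new label. A minimal Python sketch (illustrative, not the book's implementation):

```python
def label_regions(img):
    """Label 4-connected foreground regions by iterative flood filling."""
    H, W = len(img), len(img[0])
    labels = [[0] * W for _ in range(H)]
    current = 0
    for v0 in range(H):
        for u0 in range(W):
            if img[v0][u0] == 1 and labels[v0][u0] == 0:
                current += 1                      # start a new region
                stack = [(u0, v0)]
                while stack:
                    u, v = stack.pop()
                    if 0 <= u < W and 0 <= v < H and img[v][u] == 1 and labels[v][u] == 0:
                        labels[v][u] = current
                        stack += [(u + 1, v), (u - 1, v), (u, v + 1), (u, v - 1)]
    return labels
```

The explicit stack avoids the deep recursion that a naive recursive flood fill would require on large regions.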

### 11. Automatic Thresholding

Although techniques based on binary image regions have been used for a very long time, they still play a major role in many practical image processing applications today because of their simplicity and efficiency. To obtain a binary image, the first and perhaps most critical step is to convert the initial grayscale (or color) image to a binary image, in most cases by performing some form of thresholding operation, as described in Chapter 4, Sec. 4.1.4.
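One simple automatic scheme (an isodata-style iteration, shown here as a hedged Python sketch rather than any particular method from the book) repeatedly places the threshold midway between the means of the two classes it induces:

```python
def isodata_threshold(img):
    """Iterate q to the midpoint between the means of the two classes it induces."""
    pixels = [a for row in img for a in row]
    q = sum(pixels) / len(pixels)               # start at the overall mean
    while True:
        lo = [a for a in pixels if a <= q]
        hi = [a for a in pixels if a > q]
        if not lo or not hi:
            return q
        q_new = (sum(lo) / len(lo) + sum(hi) / len(hi)) / 2
        if abs(q_new - q) < 0.5:
            return q_new
        q = q_new

def threshold(img, q):
    """Binarize: foreground = 1 where the pixel value exceeds q."""
    return [[1 if a > q else 0 for a in row] for row in img]
```

On a clearly bimodal image this converges quickly to a threshold between the two modes.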

### 12. Color Images

Color images are involved in every aspect of our lives, where they play an important role in everyday activities such as television, photography, and printing. Color perception is a fascinating and complicated phenomenon that has occupied the interests of scientists, psychologists, philosophers, and artists for hundreds of years [211, 217]. In this chapter, we focus on those technical aspects of color that are most important for working with digital color images. Our emphasis will be on understanding the various representations of color and correctly utilizing them when programming. Additional color-related issues, such as colorimetric color spaces, color quantization, and color filters, are covered in subsequent chapters.

### 13. Color Quantization

The task of color quantization is to select and assign a limited set of colors for representing a given color image with maximum fidelity. Assume, for example, that a graphic artist has created an illustration with beautiful shades of color, for which he applied 150 different crayons. His editor likes the result but, for some technical reason, instructs the artist to draw the picture again, this time using only 10 different crayons. The artist now faces the problem of color quantization—his task is to select a “palette” of the 10 best suited from his 150 crayons and then choose the most similar color to redraw each stroke of his original picture.
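The second half of the artist's task, replacing each original color by its most similar palette entry, is easy to sketch once a palette has been chosen (Python, with squared Euclidean distance in RGB as an assumed similarity measure; names are illustrative):

```python
def nearest_color(c, palette):
    """Palette entry with minimum squared Euclidean distance to c in RGB space."""
    return min(palette, key=lambda p: sum((x - y) ** 2 for x, y in zip(c, p)))

def quantize(img, palette):
    """Replace every pixel by its most similar palette color."""
    return [[nearest_color(c, palette) for c in row] for row in img]
```

The hard part of color quantization, which this chapter addresses, is choosing the palette itself so that the overall error stays small.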

### 14. Colorimetric Color Spaces

In any application that requires precise, reproducible, and device-independent presentation of colors, the use of calibrated color systems is an absolute necessity. For example, color calibration is routinely used throughout the digital print workflow, but also in digital film production, professional photography, image databases, etc. One may have experienced how difficult it is, for example, to render a good photograph on a color laser printer, and even the color reproduction on monitors largely depends on the particular manufacturer and computer system.

### 15. Filters for Color Images

Color images are everywhere and filtering them is such a common task that it does not seem to require much attention at all. In this chapter, we describe how classical linear and nonlinear filters, which we covered before in the context of grayscale images (see Ch. 5), can be either used directly or adapted for the processing of color images. Often color images are treated as stacks of intensity images and existing monochromatic filters are simply applied independently to the individual color channels. While this is straightforward and performs satisfactorily in many situations, it does not take into account the vector-valued nature of color pixels as samples taken in a specific, multi-dimensional color space. As we show in this chapter, the outcome of filter operations depends strongly on the working color space and the variations between different color spaces may be substantial. Although this may not be apparent in many situations, it should be of concern if high-quality color imaging is an issue.
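The channel-wise approach described above can be sketched in a few lines of Python (illustrative names; any scalar filter from the grayscale chapters could be passed in as `filt`):

```python
def filter_per_channel(img, filt):
    """Split an RGB image into scalar channels, filter each independently, rejoin."""
    H, W = len(img), len(img[0])
    channels = [[[px[c] for px in row] for row in img] for c in range(3)]
    filtered = [filt(ch) for ch in channels]
    return [[tuple(filtered[c][v][u] for c in range(3)) for u in range(W)]
            for v in range(H)]
```

The chapter's point is precisely that this convenient scheme ignores the vector nature of color pixels, which genuinely vector-valued filters take into account.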

### 16. Edge Detection in Color Images

Edge information is essential in many image analysis and computer vision applications and thus the ability to locate and characterize edges robustly and accurately is an important task. Basic techniques for edge detection in grayscale images are discussed in Chapter 6. Color images contain richer information than grayscale images and it appears natural to assume that edge detection methods based on color should outperform their monochromatic counterparts. For example, locating an edge between two image regions of different hue but similar brightness is difficult with an edge detector that only looks for changes in image intensity. In this chapter, we first look at the use of “ordinary” (i.e., monochromatic) edge detectors for color images and then discuss dedicated detectors that are specifically designed for color images.

### 17. Edge-Preserving Smoothing Filters

Noise reduction in images is a common objective in image processing, not only for producing pleasing results for human viewing but also to facilitate easier extraction of meaningful information in subsequent steps, for example, in segmentation or feature detection. Simple smoothing filters, such as the Gaussian filter and the filters discussed in Chapter 15, effectively perform low-pass filtering and thus remove high-frequency noise. However, they also tend to suppress high-rate intensity variations that are part of the original signal, thereby destroying image structures that are visually important. The filters described in this chapter are “edge preserving” in the sense that they change their smoothing behavior adaptively depending upon the local image structure. In general, maximum smoothing is performed over “flat” (uniform) image regions, while smoothing is reduced near or across edge-like structures, typically characterized by high intensity gradients.
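A classic example of this adaptive behavior is the bilateral filter, sketched below in Python for a small 3×3 neighborhood (a simplified illustration under assumed parameters, not the book's implementation): each output pixel is a weighted mean whose weights fall off with both spatial distance and intensity difference, so pixels on the far side of a strong edge contribute almost nothing.

```python
import math

def bilateral(img, sigma_d=1.0, sigma_r=25.0):
    """3x3 weighted mean; weights decay with spatial distance (sigma_d)
    and with intensity difference (sigma_r), so strong edges are preserved."""
    H, W = len(img), len(img[0])
    out = [[0.0] * W for _ in range(H)]
    for v in range(H):
        for u in range(W):
            num = den = 0.0
            for dv in (-1, 0, 1):
                for du in (-1, 0, 1):
                    vv = min(max(v + dv, 0), H - 1)   # clamp at the borders
                    uu = min(max(u + du, 0), W - 1)
                    w = (math.exp(-(du * du + dv * dv) / (2 * sigma_d ** 2))
                         * math.exp(-((img[vv][uu] - img[v][u]) ** 2) / (2 * sigma_r ** 2)))
                    num += w * img[vv][uu]
                    den += w
            out[v][u] = num / den
    return out
```

In flat regions the range weights are all close to 1 and the filter degenerates to ordinary Gaussian-like smoothing; across an edge the range weights vanish and the edge survives.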

### 18. Introduction to Spectral Techniques

The following three chapters deal with the representation and analysis of images in the frequency domain, based on the decomposition of image signals into sine and cosine functions using the well-known Fourier transform. Students often consider this a difficult topic, mainly because of its mathematical flavor and because its practical applications are not immediately obvious. Indeed, most common operations and methods in digital image processing can be sufficiently described in the original signal or image space without even mentioning spectral techniques. This is the reason why we pick up this topic relatively late in this text.

### 19. The Discrete Fourier Transform in 2D

The Fourier transform is defined not only for 1D signals but for functions of arbitrary dimension. Thus, 2D images are nothing special from a mathematical point of view.

### 20. The Discrete Cosine Transform (DCT)

The Fourier transform and the DFT are designed for processing complex-valued signals, and they always produce a complex-valued spectrum even in the case where the original signal was strictly real-valued. The reason is that neither the real nor the imaginary part of the Fourier spectrum alone is sufficient to represent (i.e., reconstruct) the signal completely. In other words, the corresponding cosine (for the real part) or sine functions (for the imaginary part) alone do not constitute a complete set of basis functions.
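The DCT sidesteps this by choosing cosine basis functions with shifted arguments that do form a complete basis for (implicitly symmetrically extended) real signals, so a real signal yields a purely real spectrum. A minimal Python sketch of the 1D DCT-II (a direct transcription of the standard formula, not the book's code):

```python
import math

def dct(g):
    """1D DCT-II: a real-valued spectrum from a real-valued signal."""
    N = len(g)
    return [math.sqrt(2 / N) * (1 / math.sqrt(2) if m == 0 else 1.0)
            * sum(g[u] * math.cos(math.pi * m * (2 * u + 1) / (2 * N)) for u in range(N))
            for m in range(N)]
```

For a constant signal, all of the energy ends up in the single DC coefficient, and no imaginary part is ever needed.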

### 21. Geometric Operations

Common to all the filters and point operations described so far is the fact that they may change the intensity function of an image but the position of each pixel, and thus the geometry of the image, remains the same. The purpose of geometric operations, which are discussed in this chapter, is to deform an image by altering its geometry. Typical examples are shifting, rotating, or scaling images, as shown in Fig. 21.1. Geometric operations are frequently needed in practical applications, for example, in virtually any modern graphical computer interface.

### 22. Pixel Interpolation

Interpolation is the process of estimating the intermediate values of a sampled function or signal at continuous positions or the attempt to reconstruct the original continuous function from a set of discrete samples. In the context of geometric operations this task arises from the fact that discrete pixel positions in one image are generally not mapped to discrete raster positions in the other image under some continuous geometric transformation $$T$$ (or $$T^{-1}$$, respectively).
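The most common compromise between quality and cost is bilinear interpolation, which mixes the four pixels surrounding the continuous target position. A small Python sketch (assumes non-negative coordinates inside the image; illustrative, not the book's implementation):

```python
def bilinear(img, x, y):
    """Intensity at continuous position (x, y), mixed from the 4 surrounding pixels."""
    H, W = len(img), len(img[0])
    u, v = int(x), int(y)                       # upper-left neighbor
    u1, v1 = min(u + 1, W - 1), min(v + 1, H - 1)
    a, b = x - u, y - v                         # fractional offsets in [0, 1)
    A, B = img[v][u], img[v][u1]
    C, D = img[v1][u], img[v1][u1]
    return (A * (1 - a) + B * a) * (1 - b) + (C * (1 - a) + D * a) * b
```

At integer positions the result reduces to the original pixel value, as any sensible interpolator should.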

### 23. Image Matching and Registration

When we compare two images, we are faced with the following basic question: when are two images the same or similar, and how can this similarity be measured? Of course one could trivially define two images $$I_1, I_2$$ as being identical when all pixel values are the same (i.e., the difference $$I_1 - I_2$$ is zero). Although this kind of definition may be useful in specific applications, such as for detecting changes in successive images under constant lighting and camera conditions, simple pixel differencing is usually too inflexible to be of much practical use. Noise, quantization errors, small changes in lighting, and minute shifts or rotations can all create large numerical pixel differences for pairs of images that would still be perceived as perfectly identical by a human viewer.
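The simplest numerical distance of this kind is the sum of squared pixel differences, sketched here in Python (the name is illustrative):

```python
def ssd(I1, I2):
    """Sum of squared pixel differences; zero exactly when the images are identical."""
    return sum((a - b) ** 2
               for r1, r2 in zip(I1, I2)
               for a, b in zip(r1, r2))
```

Even a one-pixel shift of an otherwise identical image can make this value large, which is precisely the brittleness the chapter sets out to overcome.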

### 24. Non-Rigid Image Matching

The correlation-based registration methods described in Chapter 23 are rigid in the sense that they provide for translation as the only form of geometric transformation and positioning is limited to whole pixel units. In this chapter we look at methods that are capable of registering a reference image under (almost) arbitrary geometric transformations, such as changes in rotation, scale, and affine distortion, and of doing so to sub-pixel accuracy.

### 25. Scale-Invariant Feature Transform (SIFT)

Many real applications require the localization of reference positions in one or more images, for example, for image alignment, removing distortions, object tracking, 3D reconstruction, etc. We have seen that corner points can be located quite reliably and independent of orientation. However, typical corner detectors only provide the position and strength of each candidate point; they do not provide any information about its characteristic or “identity” that could be used for matching. Another limitation is that most corner detectors only operate at a particular scale or resolution, since they are based on a rigid set of filters.

### 26. Fourier Shape Descriptors

Fourier descriptors are an interesting method for modeling 2D shapes that are described as closed contours. Unlike polylines or splines, which are explicit and local descriptions of the contour, Fourier descriptors are global shape representations, that is, each component stands for a particular characteristic of the entire shape. If one component is changed, the whole shape will change. The advantage is that it is possible to capture coarse shape properties with only a few numeric values, and the level of detail can be increased (or decreased) by adding (or removing) descriptor elements. In the following, we describe what is called “cartesian” (or “elliptical”) Fourier descriptors, how they can be used to model the shape of closed 2D contours and how they can be adapted to compare shapes in a translation-, scale-, and rotation-invariant fashion.
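The core construction is to treat each contour point $$(x_k, y_k)$$ as a complex number $$z_k = x_k + i\,y_k$$ and take the DFT of that sequence. A minimal Python sketch (illustrative, not the book's implementation):

```python
import cmath, math

def fourier_descriptors(contour):
    """DFT of the complex contour samples z_k = x_k + i*y_k.
    G[0] is the centroid; dropping it makes the descriptor translation-invariant."""
    N = len(contour)
    z = [complex(x, y) for x, y in contour]
    return [sum(z[k] * cmath.exp(-2j * math.pi * m * k / N) for k in range(N)) / N
            for m in range(N)]
```

Translating the contour changes only the zero-frequency coefficient, which hints at how the invariances mentioned above are obtained: translation, scale, and rotation each affect the coefficients in a simple, removable way.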