About Scale Space Edges:

NOTE: The following description is an incomplete draft. Last modified 2/1/02.

Abstract:

An edge detector that locates and characterizes scale space edges in the luminosity of a 2-D image is described. The edge definition, developed by Lindeberg (1998), adaptively selects appropriate scale(s) at each point in the image. The sparse scale space search and edge following algorithm is designed to flexibly limit computational resource needs and preserve edge connectivity for image segmentation purposes.  The implementation generates a list of edge segments (ordered sets of edge points), nodes (intersections of edge segments) and edge point metrics. No smoothing or threshold parameters need to be preselected as salience decisions are made in a post processing step.

Goal:

The primary goal of this image edge detector is to delineate paths that correspond with the physical boundaries and other features of the image's subject.  Several factors make this correspondence difficult to achieve;

1) Physical boundaries of interest may result in low or variable image contrast.

2) Noise, unresolved transitions, and artifacts of the imaging process introduce non-physical image edges, and displacements/breaks in detected edges.

3) Details of the physical boundaries and imaging methods result in edges with different/variable characteristic widths (scales).  Edge information from an unknown range of scales is necessary for a complete description.  At a single scale, edge information may be discontinuous or not accurately localized.

4) Closely spaced roughly parallel image edges interfere to some degree, depending on the edge operator extent and shape.

5) High curvature edges, compared to the edge operator extent, can result in poor localization.

6) There is no intrinsic information in the image allowing decisions as to which edge segments are to be grouped together (are due to the same physical cause), or which segments are of interest (due to noise, shading, texture, effects of the imaging process, or other image transitions deemed secondary).

An end use of the edge descriptors is segmentation, so continuity of edge segments is of primary importance.  Multiple edge segments must be associated and connected, in a rational way, to define a region.  The more continuous the edge segments are (provided the continuity is not artificially introduced), the simpler and more robust the segmentation process will be . Continuity of physical boundaries is the norm (materials rarely blend smoothly), so maximal continuity is a valid a priori assumption in most instances of edge detection.

A secondary goal is to design the implementation of edge operators and the search algorithm so that they can be tailored to particular tasks.  For example, derivative estimation should be adaptable to various imaging models, known properties of the physical object can inform the search strategy, and the implementation should  be adaptable to a 3-D (spatial) edge detector.
 

Edge definition:

The algorithm implements an edge definition developed in Lindeberg (1998) that results in "automatic scale selection":

i)   The gradient magnitude is a local maximum (in the direction of the gradient), and
ii)  A normalized measure of the strength of the edge response (for example, the gradient weighted by the scale) is locally maximum over scales.

The first condition is a well established technique (Canny, 1986).  Accounting for edge information over a range of scales (multi scale edge detection) can be approached in two ways;  appropriate scale(s) at each point can be evaluated, or edge information over a range of scales can be combined.  The above definition  takes the first approach, where appropriate scale(s) is taken to mean the scale(s) at which maximal information about the image is present.  In this sense it is an adaptive filtering technique -- the edge operator is chosen based on the local image structure.

As Lindeberg shows, this definition can be interpreted as the intersection of maximal surfaces in scale space, and, to the extent that the gradient and strength operators are smooth through scale space, their intersection (edge segments) will be also be smooth and continuous.

This definition has several appealing qualities, particularly with respect to the difficulties mentioned above:

- Edges are locally continuous across scales (edge widths can vary along segments).  While many natural images are primarily composed of sharp (unresolved, small scale) edges, even these vary in scale near the resolution limits due to the imaging process.

- Auxiliary information (gradient, scale of maximal response, strength of edge response) is available that can be used to characterize the salience of edge segments.  This information can also be used to group edge segments for image segmentation purposes.

- The "maximal over scales" condition favors the integrity and continuity of edge points that have a strong response over a range of adjacent scales, and the edges' persistence and stability over scales can be directly estimated.  This is particularly valuable for segmentation purposes, as edges that persist over a range of scales are more likely to correspond with physical boundaries (Marr, 1981?).

- Good localization can be achieved by defining the "normalized strength of response" appropriately.  This is achieved by normalizing the strength metric such that the maximal response occurs at an appropriately small scale, and therefore a small edge operator extent.

- A maximal amount of information is encoded in a minimal number of edge points.  This provides a rational basis for assigning salience measures (estimate of degree of correspondence of image edge points with physical boundaries) of segments, or portions of segments.

Another attractive feature of the definition is that it can naturally be adapted for other feature extraction tasks that depend on derivative operators and scale.  Lindeberg has demonstrated this with ridge, blob, and junction detectors (Lindeberg 1997).

Elder et. al. (1998) uses different "automatic scale selection" algorithm that calculates a local "minimum reliable scale", which varies with the noise characteristics across an image, and achieves some of the same benefits.
 

Edge operator implementation:

Scale Space Edges implements the above Lindeberg definition using a gaussian blurred representation of scale space, estimates gradient maxima locations using second derivative (in the gradient direction) zero crossings, and defines the normalized strength of edge response as ((scale^1/2 )*gradient magnitude).  Third derivatives can be used for the edge response, but smoothness considerations at small scales favors a first derivative measure.  A different normalization scale power could also be used, but 1/2 has a natural interpretation with respect to an ideal gaussian step edge.

Oriented first and second, and optionally third, order Discrete Difference of Gaussian (DoG) filters are used as derivative estimation operators.  The DoG filters operate on a small set of resampled and gaussian blurred images ("sample planes") to mimic derivative of gaussian operations at an arbitrary scale: the standard deviation of the DoG filter and the filter used to generate the sample plane combine to match the width of the filter.

This derivative estimation method is designed to minimize resource requirements while maintaining smoothness across scale space.  A bank of filters, an orthogonal pair (first order)  or one for each orientation increment (second order), at each scale increment between [1 2], is precalculated.  A scale space sample plane, with a gaussian blur matched to the filters, is precalculated for each power of two in scale (1, 2, 4, ... 2^k_max). Derivatives at large scales ( >2 ) are estimated with a small scale filter intercalated with zeros (effectively).  The filters have wide support to satisfy smoothness in space conditions, and matching of filter/sample plane standard deviation improves smoothness across scales.

The net result is that each derivative estimation takes a fixed number of processor cycles, and only the filter bank and a small set of sample planes, not a full scale space representation, needs to be accommodated in memory.  This is particularly relevant for a planned extension to volume images (3-D spatial), where memory limitations make a full representation scheme less practical.
 

Algorithm:

Lindeberg used a global algorithm to find the edge points (locations where the defined surfaces intersect), requiring finding derivatives at every point and at every scale, and the resulting edge points were grouped into segments using a connected components algorithm.  But the edge definition naturally allows edge following algorithms that don't visit every point of scale space and use the gradient information to extrapolate adjacent edge point locations.

All edges, due to noise, imaging artifacts, and the subject, are identified.  Edge information suitable for discriminating the source of edges, salience of edge points, and segment smoothing, is derived for use in post processing.  This allows for discrimination based on connectivity of segments, as well as local edge metrics.

The Scale Space Edges implementation iteratively climbs the scale space gradient until an edge point is located and then iteratively steps, perpendicular the the gradient, along a segment, repeating the gradient climb, to fully extract an ordered edge segment.  (See Appendix for pseudo code.) The initial points are sparsely distributed in scale space and the necessary derivatives are estimated as needed, so the total computational demand can be reduced and tailored.

The advantage of using this algorithm, compared to a global search algorithm, is in its use of computational resources, flexibility of choosing the search space, and flexibility of specifying the details of its edge finding/following.  It also groups and orders edge points, reducing post processing requirements.

The search strategy is informed by geometric constraints on edge segments.  In general gradient maxima don't co-occur within a factor of two or more in scale, and gradient maxima don't decay within a factor of two in scale.  This means that it is sufficient to search at a single scale (on a scale plane, following each identified edge segment throughout scale space) within every factor of two in scale.  Also, at a single scale (on a scale plane), edge segments do not approach each other to within their width (the scale), in the gradient direction. These two conditions governing the separation of segments in scale space greatly reduce the number of initial points, and total points, that need to be examined to guarantee finding all edge segments/points.  For a search of an n x n image, ~2*n^2 points need examination, regardless of the scale resolution or range of scales.  A full scale space contains a many times more points: a scale space with a scale range of [1 2^5] and scale resolution of 2^(1/10), for example, contains 5*10*n^2 points.

While edge following algorithms can be complex (they can be implemented with a variety of differing details regarding  segment tracking, identifying visited neighbors, stepping along segment etc.), and are often avoided because of this complexity, in this case the flexibility in designing the details augments it's behavior.  While the edge operator is designed to be smooth  across scale space, edge segments may be artificially discontinuous near very low second order slopes (low third derivatives), high edge segment curvatures, and/or scale transitions.  Heuristic strategies for locating edge points not directly adjacent to a current edge point improves connectivity of segments.  Also, edge segments commonly end near "Y" and "T" junctions and the edge following framework can be readily adapted to detect and complete these intersections.

The flexibility in choosing the pattern of initial points allows selection of the order in which segments are located:

- Large scale features, often defined by strong small scale edges but extending into large scales, can be quickly located with a very sparse search at large scales (and subsequently tracked to small scales).  These are exactly the kind of primary image transitions that dominate visual attention in static images.

- Information about the grouping of small scale features can be gleaned by the continuity of the corresponding features over a limited range of higher scales.

- While this edge definition limits artifactual edges due to interference between edges, some geometries, particularly "staircase" edges, result in large scale edges that are (mostly) accounted for by the small scale "steps".  (These are not artifacts, as the large scale edges accurately reflect the intensity transition, but they duplicate information at small scales.) By searching small scales first, these segment can be recognized and eliminated if desired.

Different derivative operators can be easily accommodated, but the algorithm depends on their smoothness in space and scale.  Modifications to the shape and 2-D extent of the filters can maintain this smoothness while improving derivative estimation properties.  For example the filters could be "windowed" in the gradient direction to reduce interference from parallel intensity transitions.

A disadvantage of the algorithm is that the derivatives can't be calculated as efficiently as recursive (IIR) filtering techniques that can be used on the full scale space.  This limit to efficiency extends to potential parallelized implementations, although the efficient use of multiple processors is not precluded. Also, each edge point requires many derivative calculations (convolutions with a wide support filter), and, as implemented, these are often duplicated.  The duplication can be reduced, but at the expense of memory resources.
 

Results, Discussion:

... testing and validation needs to be performed with rigorous criteria ...

... simple thresholding based on integrated strength and length of segments, and scale properties appears to work well, but using more sophisticated cleaning  algorithms (e.g. hysterisis thresholds), edge smoothing, and information based on segment connectivity will improve selection of salient segments for particular purposes ...

Several improvements to the implementation are evident.  For example, to approximate accurate filters at small (subpixel) scales and locations, the image is oversampled using a chosen interpolator and the full support filters for scales between [1 2] are used.  While this is expedient and shouldn't impact the quality of edge extraction, better methods are feasible.  In particular, a set of filters designed for subpixel locations and tuned for a particular interpolation method would decrease memory requirements (for sample scale planes) considerably, and processing time to some degree, at the expense of a larger filter bank and added complexity.  Also many optimizations, both with respect to processing time and tracking behavior, can be made.

....
 

References:

T. Lindeberg  "Feature detection with automatic scale selection", Int. J. of Computer Vision, vol 30, number 2, pp 77-154, 1998.

Marr, D. "Vision" W.H. Freeman and Company, San Francisco, 1981?. Chapter 2.

Marr, D. and E. Hildreth "Theory of Edge Detection." Proc. R. Soc. Lond. B 207. 1980. p197.

J.H. Elder and S.W. Zucker "Local Scale Control for Edge Detection and Blur Estimation"
IEEE Transactions on Pattern Analysis and Machine Intelligence. col.20, No. 7. July 1998.
 

Appendix:

Algorithm pseudo code:

For each point of a sparse grid in scale space {

While not finished finding an edge segment in both directions {
While edge point not found {
Follow gradient (at current scale) until it reaches a maximum

If at border of image, already marked as found edge point, or no neighboring maximum {

Finished finding edge segment in this direction
}

Move to adjacent scale if it has stronger response

If at gradient maximum {

Edge point found, mark scale space map
}
}

Step to neighbors, in both directions (perpendicular to gradient)

}
}
 
 

Mark Dow, January 2002.  Feel free to email me.