Tyler C. Folsom

Abstract: Hubel and Wiesel received the Nobel Prize for describing how individual cells in the cat brain respond to light. Other researchers have derived mathematical models that describe brain cells as linear filters. My work investigates how these filters could be used in machine vision.

The simple cells of cat visual cortex act predominantly as linear filters, and their behavior with respect to visual stimuli can be characterized by their impulse responses, which are called "receptive fields" [1], [2]. Idealized cortical receptive fields can be represented as odd or even oriented filters, shown in Figure 1. Hubel and Wiesel [3] found that these brain cells are sensitive to bars of light at various orientations. A particular cell might respond strongly to a vertically oriented bar and not at all to a horizontal bar. Some researchers have suggested that the odd filters are edge detectors and the even filters are bar detectors. For example, the H90 even filter of Figure 1 gives a strong response when correlated with the centered vertical bar at the left.

The filter gives no response when correlated with a uniform image, a horizontal bar, or a centered vertical edge.
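The behavior described above can be checked with a small sketch. It is not the filter of Figure 1: as an assumption, the even filter is approximated by a separable, zero-mean kernel on a square patch (the names `even_filter` and `respond` are hypothetical), and the stimuli are idealized binary images.

```python
import numpy as np

def even_filter(half=10, sigma=3.0):
    """Separable even kernel tuned to vertical bars (an assumed stand-in for H90)."""
    x = np.arange(-half, half + 1, dtype=float)
    gauss = np.exp(-x**2 / (2 * sigma**2))
    profile = (1 - x**2 / sigma**2) * gauss  # even cross-section, positive at the center
    profile -= profile.mean()                # zero sum, so uniform input gives no response
    return np.outer(gauss, profile)          # rows index y, columns index x

def respond(kernel, patch):
    """Correlate the kernel with an equally sized patch, giving one output value."""
    return float(np.sum(kernel * patch))

half = 10
coords = np.arange(-half, half + 1)
xx, yy = np.meshgrid(coords, coords)
patches = {
    "vertical bar": (np.abs(xx) <= 2).astype(float),
    "uniform": np.ones_like(xx, dtype=float),
    "horizontal bar": (np.abs(yy) <= 2).astype(float),
    "vertical edge": np.where(xx > 0, 1.0, np.where(xx < 0, 0.0, 0.5)),
}
kernel = even_filter(half)
responses = {name: respond(kernel, patch) for name, patch in patches.items()}
for name, value in responses.items():
    print(f"{name:15s} {value: .6f}")
```

Under these assumptions only the centered vertical bar produces a nonzero response; the other three stimuli cancel because the kernel is even and sums to zero along the horizontal direction.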

However, the output of a single cortical filter is ambiguous, because it depends on the type of input and on its contrast, position, orientation, and size. To deduce anything about the input, we must examine the outputs of several cortical filters. The H90 filter can produce identical results when correlated with any of the following:

An algorithm that resolves this ambiguity, by using steerable quadrature filters, is called "quadrature disambiguation". Assume that, at the scale of interest, the portion of the image within the receptive field consists of an edge or a bar (perhaps with contrast = 0). We should like to know:

• Is it an edge or a bar?
• Which part is dark and which light?
• The orientation of the feature.
• The position of the feature.
• The feature contrast.
• The width of a bar.

The operation of the algorithm is illustrated in Figure 1. A circular portion of an image is correlated with even and odd steerable filters at several orientations [4]. In the example, five filters are used. Since the filters are steerable, their response to a stimulus, at any orientation, can be interpolated from the responses at these orientations.

(1)

(2)
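One steering rule that is consistent with the numbers in this section can be sketched as follows. It assumes (this is not stated in the text) that the even-filter response varies with orientation as a + b cos 2θ + c sin 2θ, so that three samples spaced 60° apart determine it. The function name `steer_even` is hypothetical. Applied to the three H responses reported for the larger filters (-57,437 at 90°, -36,696 at 30°, -3,265 at 150°), it gives a value within a few units of the -64,010 quoted for 70.2°. The corresponding rule for the odd filters is not reproduced here.

```python
import math

def steer_even(theta_deg, samples):
    """Interpolate an even-filter response at theta_deg from three samples.

    samples maps orientation in degrees to response, for three orientations
    spaced 60 degrees apart. Assumes (not from the source) that the response
    has the angular form a + b*cos(2*theta) + c*sin(2*theta).
    """
    total = 0.0
    for angle_deg, response in samples.items():
        delta = math.radians(theta_deg - angle_deg)
        weight = (1.0 + 2.0 * math.cos(2.0 * delta)) / 3.0
        total += weight * response
    return total

# Even-filter responses of the larger filters, as quoted in the text.
large_even = {90: -57437, 30: -36696, 150: -3265}
h_steered = steer_even(70.2, large_even)
print(round(h_steered))
```

At any of the three sampled orientations the weights reduce to 1, 0, 0, so the function returns the measured response unchanged.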

Since the filters are in quadrature, we can consider the magnitude of their response to a stimulus:

(3)

The only variable in the above equation is θ. We can take the derivative with respect to θ, set it to zero, and solve numerically. In the example, we find that the maximum response will occur when the even and odd filters are oriented at 70.2°. We then use the steering equations (1) and (2) to compute the responses of even and odd filters at this orientation.
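The search for the orientation of maximum response can be sketched numerically. This sketch swaps the derivative-based solution described above for a plain grid search, and it takes the quadrature magnitude to be the root sum of squares of the even and odd responses, which is an assumption. The function `peak_orientation` and the two response functions are hypothetical; the coefficients are made up for illustration and are not the data of Figure 1.

```python
import math

def peak_orientation(even_at, odd_at, step_deg=0.1):
    """Return (theta, magnitude) maximizing the quadrature magnitude on [0, 180)."""
    best_theta, best_mag = 0.0, -1.0
    steps = int(round(180.0 / step_deg))
    for i in range(steps):
        theta = i * step_deg
        mag = math.hypot(even_at(theta), odd_at(theta))
        if mag > best_mag:
            best_theta, best_mag = theta, mag
    return best_theta, best_mag

# Hypothetical steered responses, chosen so that the peak lies at 70 degrees.
def even_at(theta_deg):
    return 50.0 + 100.0 * math.cos(2.0 * math.radians(theta_deg - 70.0))

def odd_at(theta_deg):
    return 20.0 * math.cos(math.radians(theta_deg - 70.0))

theta_peak, magnitude_peak = peak_orientation(even_at, odd_at)
print(theta_peak, magnitude_peak)
```

A grid search is slower than solving for the zero of the derivative, but it cannot settle on a minimum or a secondary peak, which makes it a convenient cross-check.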

The next stage is to compute the phase of the steered filters:

(4)

The phase gives the position of an edge relative to the center of the receptive field as:

Position of an edge = k D (φ ± π/2),

where φ is the phase, D is the diameter of the receptive field, and k is a constant.

Thus, if we assume that the stimulus is an edge, the algorithm has determined that it is oriented at 70.2° and located at 7.12 pixels from the center of the circle. This position is shown in Figure 1. We know how the magnitude of filter response rolls off as an edge moves farther from the center. Using this information and the magnitude M, we can compute what the filter response would have been had this edge been centered in the receptive field. This gives the contrast as 38% (where mid-gray to full black is 100%).

Suppose that we do not know whether the stimulus was an edge or a bar. Then, we select a portion of the image with the same center as that in Figure 1, but with twice the diameter. When we correlate it with five filters at this scale, we obtain:

G′ 90° = 125,607

G′ 0° = 42,330

H′ 90° = -57,437

H′ 30° = -36,696

H′ 150° = -3,265

We apply the steering Equations (1) and (2) and find:

G′ 70.2° = 131,041

H′ 70.2° = -64,010

The phase is found as:

φ′ = 2.03
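The phase value can be reproduced from the two steered responses. The sketch below takes G as the odd response and H as the even one (the text calls H90 an even filter) and assumes the phase is the four-quadrant arctangent of odd over even, in radians. That convention is an inference: it is the one that yields 2.03 from the values quoted at 70.2°. The root-sum-of-squares magnitude is likewise an assumption.

```python
import math

g_steered = 131041   # odd-filter response at 70.2 degrees (from the text)
h_steered = -64010   # even-filter response at 70.2 degrees (from the text)

# Assumed convention: phase = atan2(odd, even), measured in radians.
phase = math.atan2(g_steered, h_steered)

# Assumed quadrature magnitude: root sum of squares of the two responses.
magnitude = math.hypot(g_steered, h_steered)

print(round(phase, 2), round(magnitude))
```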

The positions predicted for the stimulus from the phase will vary based on the scale of the filter and whether it is an edge, dark bar, or light bar.

Center of a dark bar = k D φ

Center of a light bar = k D (φ ± π)

We compute the positions predicted by the small and large filters in Table 1.

Table 1. Feature positions from different filter sizes.

Predicted position of:   Small filters   Large filters
Edge                      7.12            6.57
Dark bar                 12.20           16.87
Light bar                -2.34           -8.12

The predictions from the two filter sizes agree closely only for an edge (7.12 versus 6.57 pixels); the bar predictions differ by more than four pixels. Thus, we conclude that the stimulus is an edge located at 7.12 pixels from the center of the receptive field.
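The comparison in Table 1 can be written out as a short sketch. The selection rule used here, picking the hypothesis whose two predicted positions differ least, is an assumption about how "agree" might be made precise; the text does not state a threshold or a metric.

```python
# Predicted positions in pixels from Table 1: (small filters, large filters).
predictions = {
    "edge": (7.12, 6.57),
    "dark bar": (12.20, 16.87),
    "light bar": (-2.34, -8.12),
}

# Assumed criterion: absolute difference between the two scales.
disagreement = {
    name: abs(small - large) for name, (small, large) in predictions.items()
}
best = min(disagreement, key=disagreement.get)

for name, gap in disagreement.items():
    print(f"{name:10s} differs by {gap:.2f} pixels")
print("most consistent hypothesis:", best)
```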

If the stimulus had been a bar, the ratio M/M′ could be used to determine the bar width.

By processing the responses of cortical filters, we have been able to characterize a portion of an image as follows:

• An edge,
• oriented at 70.2°,
• dark side to left, light side to right,
• contrast is 38% of mid-gray to full black,
• position is 7.12 pixels from center.

This is a correct analysis of the region. The ripples in the water that would confuse many edge detectors are completely ignored. The quadrature disambiguation algorithm is designed to reject clutter. More details can be found in [5] and [6].