[ptx] Please help me on SIFT detection by David Lowe's IJCV paper

Tue Mar 2 03:27:55 GMT 2004

Hi Fang,

On Tue, Mar 02, 2004 at 10:56:52AM +0800, Xianyong Fang wrote:

> I have implemented the SIFT detection algorithm from David Lowe's IJCV paper
> 'Distinctive Image Features from Scale-Invariant Keypoints', but I cannot
> find many initial keypoints (that is, the peaks in the DoG images). I also find
> that the initial keypoints do not always occur at corners and edges.

Great to see a different implementation, I am very curious about it :)

> According to my understanding, the resampled Gaussian image is used as the
> first scale image of the next octave, am I right? I build three DoG images in
> each octave (two DoG images give the same problem), but find that only a few

The DoG generation is not explained in detail in Lowe's paper. He gives
specifics on how to speed the process up, but I found the description of the
overall scale-space processing rather generic. After some fiddling I arrived at
what I think Lowe meant: searching three DoG maps for peaks means you need five
DoG maps and search only the middle three, so that each searched map has a
neighbour above and below. That in turn means you have to calculate 5+1
Gaussian-blurred maps, since each DoG is the difference of two adjacent ones.
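
To make the counting concrete, here is a minimal sketch in Python (not taken
from my C# code). The function name octave_layout and the starting sigma of 1.6
are only illustrative assumptions; s is the number of DoG maps searched per
octave.

```python
def octave_layout(s, sigma):
    """Return the blur levels of one octave for s searched DoG maps.

    Hypothetical helper for illustration only, not part of any library.
    """
    k = 2.0 ** (1.0 / s)  # after s steps the blur has doubled
    # s + 3 Gaussian-blurred maps: sigma, k*sigma, ..., k^(s+2)*sigma
    gaussian_sigmas = [sigma * k ** i for i in range(s + 3)]
    # s + 2 DoG maps, D_i = G_(i+1) - G_i, labelled by the lower blur level
    dog_sigmas = gaussian_sigmas[:-1]
    # Only maps 1..s are searched, so each has a neighbour above and below
    search_layers = list(range(1, s + 1))
    return gaussian_sigmas, dog_sigmas, search_layers


gaussians, dogs, layers = octave_layout(3, 1.6)
print(len(gaussians), len(dogs), layers)
```

With s = 3 this gives six Gaussian maps, five DoG maps, and search layers 1 to 3.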

From ScaleSpace.cs of my implementation:
    // Generate DoG maps. The maps are organized like this:
    //    0: D(sigma)
    //    1: D(k * sigma)
    //    2: D(k^2 * sigma)
    //   ...
    //    s: D(k^s * sigma) = D(2 * sigma)
    //  s+1: D(k * 2 * sigma)
    //
    // So, we can start peak searching at 1 to s, and have a DoG map in
    // each direction.

Then the DoG map at the s'th place can be recycled for the next octave by
simply halving its dimensions.
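
As a rough sketch of the halving step: Lowe's paper resamples the Gaussian
image with twice the initial sigma by taking every second pixel in each row and
column. The Python below assumes the map is stored as a plain list of rows; the
name halve is made up for this example.

```python
def halve(image):
    """Keep every second row and every second column of a 2-D list."""
    return [row[::2] for row in image[::2]]


# A 4x4 map becomes a 2x2 map made of the pixels at even coordinates.
full = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
small = halve(full)
```

No extra smoothing is applied here, on the assumption that the map already
carries a blur of 2 * sigma before it is subsampled.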

> (actually fewer than 10) initial key points can be detected. But if I compare
> the sample points in each DoG image independently (that is, a position is
> selected as an initial key point only when its value in a DoG image is the
> maximum or minimum among its eight neighbors in that same DoG image), I can
> find many more initial key points (more than 100). So I don't know why I get
> so few key points with the original method mentioned in Lowe's paper. Have I
> made a mistake somewhere?

I use strict "lower" and "higher" comparisons and test against all three
layers: the same layer, the layer above, and the layer below.
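
A minimal sketch of that test in Python (not my actual C# code): it assumes the
DoG maps of one octave are stored as dog[layer][y][x], and that the caller only
passes interior positions of a layer that has a map above and below. The name
is_strict_extremum is hypothetical.

```python
def is_strict_extremum(dog, layer, y, x):
    """True if dog[layer][y][x] is strictly above or strictly below
    all 26 neighbours in the same layer, the layer above and the layer below."""
    value = dog[layer][y][x]
    neighbours = [
        dog[layer + dl][y + dy][x + dx]
        for dl in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dl, dy, dx) != (0, 0, 0)
    ]
    is_max = all(value > n for n in neighbours)
    is_min = all(value < n for n in neighbours)
    return is_max or is_min
```

Because a point has to beat 26 neighbours instead of 8, and ties are rejected,
this test passes fewer points than a per-layer comparison does.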

> I also cannot understand why we can select initial key points from the DoG
> images by the original method. In my mind, the middle DoG image's intensity
> should lie between those of the upper and lower DoG images. So if we compare
> the value in the middle DoG image with all its neighboring values (including
> those in the upper and lower DoG images), it is hard to find a maximum or
> minimum position.

I know too little about this to say anything here :-/ It just works for me.
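
For what it is worth, a small numeric check suggests the middle map does not
have to lie between its neighbours. The Python sketch below is only an
idealized assumption, not an explanation from Lowe's paper: the image is a
single unit-mass Gaussian blob of standard deviation t, so the blurred centre
value has the closed form 1 / (2 * pi * (t^2 + sigma^2)). The starting sigma of
1.6 and the blob size are chosen just for the example.

```python
import math


def dog_center_response(t, sigma, k):
    """DoG value at the centre of a unit-mass Gaussian blob of std t,
    i.e. (blob blurred with k*sigma) minus (blob blurred with sigma)."""
    wide = 1.0 / (2.0 * math.pi * (t * t + (k * sigma) ** 2))
    narrow = 1.0 / (2.0 * math.pi * (t * t + sigma * sigma))
    return wide - narrow


s = 3
k = 2.0 ** (1.0 / s)
sigma0 = 1.6            # assumed starting blur
t = sigma0 * k ** 2.5   # blob size chosen so the response peaks at layer 2

# Centre response in DoG maps 0 .. s+1
responses = [dog_center_response(t, sigma0 * k ** i, k) for i in range(s + 2)]
strongest = min(range(len(responses)), key=lambda i: responses[i])
```

In this setup the response is negative everywhere, goes to zero for very small
and very large blur, and is most negative at sigma = t / sqrt(k). The middle
map (index 2) is therefore lower than both the map above and the map below it.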

> Xianyong

ciao,
Sebastian

-- 
nowozin at cs.tu-berlin.de --- http://user.cs.tu-berlin.de/~nowozin/