[ptx] SIFT algorithm question

alexandre jenny alexandre.jenny at le-geo.com
Fri Jan 2 13:54:27 GMT 2004


Yes, that's right. I made the same observation, but as you know, people
always hide things when publishing an algorithm. If you look at Lowe's
latest paper, "Distinctive image features from scale-invariant
keypoints", you can see that the DoG (difference of Gaussians) stage is a
little bit different, and it gives much more detail for implementing this
stage. That paper also has many new improvements to the overall algorithm.

Anyway, here is the source code for the SIFT extractor I rewrote in MATLAB:
http://www.le-geo.com/temp/sift6.m
It seems to work, but I haven't had time to check the last stage of the
algorithm, which is the keypoint descriptor construction.

BTW, one thing could be very cool: Dr. Lowe has released a Linux binary
of his keypoint detector at http://www.cs.ubc.ca/~lowe/keypoints/
If someone has Linux (I don't), it would be great to compare the results
from his software with mine, to see whether I'm doing it wrong.

For Pablo: I cannot provide the other parts of the algorithm besides SIFT,
as I haven't written them yet ;-)

Bye
  Alexandre


> -----Original Message-----
> From: ptx-bounces at email-lists.org
> [mailto:ptx-bounces at email-lists.org] On behalf of Sebastian Nowozin
> Sent: Friday, 2 January 2004 12:50
> To: ptx at email-lists.org
> Subject: [ptx] SIFT algorithm question
> 
> 
> 
> Hi guys (especially the math freaks ;)
> 
> 
> I started implementing the SIFT algorithm, and while I
> understand the main concept, I have some detailed questions about
> the "Key localization" section of the Lowe99 paper
> (http://citeseer.nj.nec.com/lowe99object.html).
>
> There is a description of how to build the image pyramid
> which I do not quite understand, and I think the text is a bit
> ambiguous:
> 
>   "The input image is first convolved with the Gaussian
> function using sigma = sqrt(2) to give an image A. This is
> then repeated a second time with a further incremental
> smoothing of sigma = sqrt(2) to give a new image, B, which
> now has an effective smoothing of sigma = 2. The difference of
> Gaussian function is obtained by subtracting image B from A,
> resulting in a ratio of 2/sqrt(2) = sqrt(2) between the two
> Gaussians.
>   To generate the next pyramid level, we resample the already
> smoothed image B using bilinear interpolation with a pixel
> spacing of 1.5 in each direction. ..."
> 
> 
> From this and the previous paragraph I think the pyramid always
> contains the same image, just scaled by a factor of 1.5. Within each
> pyramid level the image is processed into a difference map
> between a sigma=sqrt(2) and a sigma=2 Gaussian-smoothed
> image. To save the time needed to compute the sqrt(2)-smoothed image
> at each level, he recommends resampling the already smoothed
> image. Is this observation correct?
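
The reading described above can be written down as a small sketch of one
pyramid level. This is only one interpretation of the quoted passage, not
Lowe's code and not the MATLAB file linked above. All function names are made
up for illustration, and truncating the kernel at about 3 sigma and clamping
at the image borders are assumptions that the quoted text does not specify.
Plain Python lists are used so that it runs without extra libraries.

```python
import math

SIGMA = math.sqrt(2.0)  # smoothing used for both passes in the quoted text


def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian, truncated at about 3 sigma (an assumption)."""
    radius = int(math.ceil(3.0 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_rows(img, kernel):
    """Convolve every row, clamping indices at the borders (an assumption)."""
    radius = len(kernel) // 2
    width = len(img[0])
    return [[sum(kernel[j] * row[min(max(x + j - radius, 0), width - 1)]
                 for j in range(len(kernel)))
             for x in range(width)]
            for row in img]


def transpose(img):
    return [list(col) for col in zip(*img)]


def blur(img, sigma):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    rows_done = convolve_rows(img, kernel)
    return transpose(convolve_rows(transpose(rows_done), kernel))


def resample_bilinear(img, spacing=1.5):
    """Sample the image on a grid with the given pixel spacing."""
    height, width = len(img), len(img[0])
    new_h = int((height - 1) / spacing) + 1
    new_w = int((width - 1) / spacing) + 1
    out = []
    for i in range(new_h):
        y = i * spacing
        y0 = int(y)
        y1 = min(y0 + 1, height - 1)
        fy = y - y0
        row = []
        for j in range(new_w):
            x = j * spacing
            x0 = int(x)
            x1 = min(x0 + 1, width - 1)
            fx = x - x0
            top = (1.0 - fx) * img[y0][x0] + fx * img[y0][x1]
            bottom = (1.0 - fx) * img[y1][x0] + fx * img[y1][x1]
            row.append((1.0 - fy) * top + fy * bottom)
        out.append(row)
    return out


def pyramid_level(img):
    """Return (DoG image, input image for the next level)."""
    a = blur(img, SIGMA)   # image A: sigma = sqrt(2)
    b = blur(a, SIGMA)     # image B: effective sigma = 2
    dog = [[pa - pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]
    return dog, resample_bilinear(b, 1.5)
```

The effective sigma of 2 for image B follows because Gaussian variances add
under repeated convolution: sqrt(sqrt(2)^2 + sqrt(2)^2) = sqrt(4) = 2.
Calling pyramid_level repeatedly on the returned second image would give the
successive levels, each one smaller by a factor of 1.5.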
> 
> If I implement it like this, all I get (after normalization) 
> are rather thick-lined images. (I put up a screenshot of the 
> pyramid created from an image at 
> http://cs.tu-berlin.de/~nowozin/try1.png).
> 
> Does anyone interpret this part of Lowe's paper more correctly
> than I do, and can tell me what I am doing wrong?
> 
> 
> Also, it is unclear to me how he thresholds or limits his
> minima/maxima to a certain number; followed through consistently
> to the end, he would have only one remaining feature (on
> arriving at the top image of the pyramid, which has only one pixel).
> 
> 
> bye,
> Sebastian
> 
> -- 
> nowozin at cs.tu-berlin.de --- http://user.cs.tu-berlin.de/~nowozin/
> 


