build_vocabulary.m
- Get the images.
- Extract SIFT features from the images.
- Get the descriptors from the extracted features.
- Cluster the descriptors. Clustering groups similar features across the images, and each cluster becomes a visual word.
- Obtain the dictionary of visual words.
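The clustering step can be illustrated with a minimal k-means sketch in plain MATLAB. This is a toy example, not the code in build_vocabulary.m: the 2-D descriptors, the value of `vocab_size`, the fixed initial centers, and the iteration count are all assumptions made so the result is easy to follow (real SIFT descriptors are 128-dimensional, and a library k-means would normally be used).

```matlab
% Toy descriptors, one per column (hypothetical data; real SIFT is 128-D).
descriptors = [0 0 1 1 10 10 11 11;
               0 1 0 1 10 11 10 11];
vocab_size = 2;                        % number of visual words (assumed)
num_desc = size(descriptors, 2);

% Deterministic initialisation for the demo: pick two of the descriptors.
centers = descriptors(:, [1 5]);

for iter = 1:10
    % Squared Euclidean distance from each center (row) to each descriptor (column).
    D = zeros(vocab_size, num_desc);
    for k = 1:vocab_size
        delta = descriptors - repmat(centers(:, k), 1, num_desc);
        D(k, :) = sum(delta .^ 2, 1);
    end
    [~, assignment] = min(D, [], 1);   % nearest center for every descriptor

    % Move each center to the mean of the descriptors assigned to it.
    for k = 1:vocab_size
        members = descriptors(:, assignment == k);
        if ~isempty(members)
            centers(:, k) = mean(members, 2);
        end
    end
end

vocab = centers';                      % one visual word per row
```

Storing one visual word per row matches the way the vocabulary is transposed (`vocab'`) before the distance computation in the spatial pyramid notes.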
get_bag_of_sifts.m
- Extract SIFT features from the image.
- Get the descriptor for each point.
- Match the feature descriptors against the vocabulary of visual words (vocab.mat).
- Build the histogram from the feature descriptors. Each feature corresponds to a visual word in the dictionary, and the histogram records how often each visual word occurs in the image.
- The most frequent visual words in the histogram characterize the image, and the histogram is what is used to predict its class.
visual word -> a set of numbers representing a feature
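The matching and histogram steps above can be sketched on toy data as follows. The vocabulary and descriptors are hypothetical 2-D values, and the final normalisation (dividing by the number of features) is an assumption that the notes do not state; it is a common way to make histograms comparable between images with different feature counts.

```matlab
% Toy vocabulary: one visual word per row (hypothetical values).
vocab = [0 0;
         10 10];
% Toy descriptors of one image: one descriptor per column.
features = [1 0  9 11 10;
            0 1 10 10  9];

num_words = size(vocab, 1);
num_feat = size(features, 2);

% Squared distance from every visual word (row) to every feature (column).
D = zeros(num_words, num_feat);
for k = 1:num_words
    delta = features - repmat(vocab(k, :)', 1, num_feat);
    D(k, :) = sum(delta .^ 2, 1);
end

% Index of the closest visual word for each feature.
[~, ind] = min(D, [], 1);

% Count how many features were assigned to each visual word.
counts = accumarray(ind(:), 1, [num_words 1])';

% Assumed normalisation: make the bins sum to 1.
histogram_norm = counts / sum(counts);
```

Here the first two features fall on the first visual word and the remaining three on the second, so `counts` holds 2 and 3.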
spatial_pyramid.m
- Get the images.
- Extract SIFT features from the images.
- Get the descriptors from the extracted features.
- For each extracted feature, find the closest visual word (minimum distance) in the already computed vocabulary:
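D = vl_alldist2(vocab',features);  % pairwise distances: one row per visual word, one column per feature
[~,ind] = min(D);                  % ind(i) is the index of the closest visual word for feature i
- Construct a histogram with those values. It will be the histogram of SIFT features for Level 0 of the pyramid.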
- Create a matrix covering all levels of the pyramid:
  - Each level has a number of quadrants.
  - Each quadrant is represented by a histogram of its SIFT features.
  - Within each level, those histograms are concatenated into a row for the pyramid. This results in a larger histogram.
- Apply the appropriate weight to each level.
- useful lecture: https://youtu.be/iGZpJZhqEME
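The pyramid construction described above might look like the following sketch on toy data. Everything here is illustrative rather than the contents of spatial_pyramid.m: the image size, feature positions, word indices and `max_level` are made up, and the weights follow the scheme commonly used for spatial pyramid matching (1/2^L for level 0 and 1/2^(L-l+1) for level l), which is an assumption since the notes only say "appropriate weight".

```matlab
% Toy inputs (all hypothetical).
img_w = 4; img_h = 4;              % image width and height
x   = [0.5 3.5 0.5 3.5 1.5];       % x position of each feature
y   = [0.5 0.5 3.5 3.5 1.5];       % y position of each feature
ind = [1   2   1   2   1];         % closest visual word of each feature
num_words = 2;
max_level = 1;                     % pyramid levels 0..max_level

pyramid = [];
for level = 0:max_level
    cells = 2 ^ level;             % cells per side at this level

    % Assumed weighting: coarser levels count less than finer ones.
    if level == 0
        w = 1 / 2 ^ max_level;
    else
        w = 1 / 2 ^ (max_level - level + 1);
    end

    % 1-based cell index of each feature, clamped to the grid.
    cx = min(floor(x / img_w * cells) + 1, cells);
    cy = min(floor(y / img_h * cells) + 1, cells);

    for r = 1:cells
        for c = 1:cells
            in_cell = (cx == c) & (cy == r);
            % Histogram of visual words for the features inside this cell.
            h = zeros(1, num_words);
            for k = 1:num_words
                h(k) = sum(ind(in_cell) == k);
            end
            % Append the weighted cell histogram to the pyramid row.
            pyramid = [pyramid, w * h];
        end
    end
end
```

With two levels and two visual words the row has 2 * (1 + 4) = 10 entries: the first two are the weighted Level 0 histogram, and the remaining eight are the four quadrant histograms of Level 1.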