
So far I have implemented a basic CBIR system using RGB histograms. Now I am trying to generate average precision and ranking curves. Is my formula for average precision correct? And how do I calculate average rankings?

Code:
% Dir: parent directory location for images folder c1, c2, c3
% inputImage: \c1\1.ppm
% For example to get P-R curve execute: CBIR('D:\visionImages','\c2\1.ppm');
function [  ] = demoCBIR( Dir,inputImage)
% Dir='D:\visionImages';
% inputImage='\c3\1.ppm';
tic;
S=strcat(Dir,inputImage);
Inp1=imread(S);
num_red_bins = 8;
num_green_bins = 8;
num_blue_bins = 8;
num_bins = num_red_bins*num_green_bins*num_blue_bins;

A = imcolourhist(Inp1, num_red_bins, num_green_bins, num_blue_bins);%input image histogram
srcFiles = dir(strcat(Dir,'\*.ppm'));  % images are stored as .ppm (see examples above)
B = zeros(num_bins, 100); % histograms of the other 100 images in category 1
ptr=1;
for i = 1 : length(srcFiles)
    filename = strcat(Dir,'\',srcFiles(i).name);
    I = imread(filename); % read database image
    B(:,ptr) = imcolourhist(I, num_red_bins, num_green_bins, num_blue_bins); 
    ptr=ptr+1;                                                   
end

% histogram intersection: 0.5*(a + b - |a - b|) equals min(a,b) elementwise
a = size(A,2); b = size(B,2); 
K = zeros(a, b);
for i = 1:a
  Va = repmat(A(:,i),1,b);
  K(i,:) = 0.5*sum(Va + B - abs(Va - B));
end


  sims=K;
  relevant_IDs = 1:100; % IDs of the 100 relevant images for directory 1

 num_relevant_images = numel(relevant_IDs);

 [sorted_sims, locs] = sort(sims, 'descend');
 locations_final = arrayfun(@(x) find(locs == x, 1), relevant_IDs);
 locations_sorted = sort(locations_final);
 precision = (1:num_relevant_images) ./ locations_sorted;
 recall = (1:num_relevant_images) / num_relevant_images;
 % generate Avg precision
 avgprec = sum(precision)/num_relevant_images; % average precision = mean of the precision values at each relevant rank
 plot(avgprec, 'b.-');
 xlabel('Category ID');
 ylabel('Average Precision');
 title('Average Precision Plot');
 axis([0 10 0 1.05]);
end 
rayryeng
  • why don't you use the standard trec_eval? – Debasis Oct 15 '14 at 23:19
    @Debasis - Though `trec_eval` is a standard for comparing precision and recall, I believe this is a homework assignment, or some sort of assignment that has to be done in MATLAB, so offloading it to `trec_eval` isn't an option. In addition, you have to format your inputs in a very specific way. I was also the author of the code the OP is using to calculate precision and recall in MATLAB – rayryeng Oct 16 '14 at 17:09

2 Answers


Yup that's correct. You simply add up all of your precision values and average them. This is the very definition of average precision.

Average precision is a single number (often reported as a percentage) that summarizes the overall performance of an image retrieval system. The higher the value, the better the performance. Precision-Recall graphs give you more granular detail on how the system is performing, but average precision is useful when you are comparing many image retrieval systems together. Instead of plotting many PR graphs to compare the overall performance of many retrieval systems, you can just have a table that compares all of the systems with a single number specifying the performance of each - namely, the average precision.
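As a concrete sketch with made-up numbers: suppose 3 relevant images are retrieved at ranks 1, 3 and 6. The precision at those ranks is 1/1, 2/3 and 3/6, and the average precision is their mean:

```matlab
% Hypothetical example: 3 relevant images retrieved at ranks 1, 3 and 6
ranks = [1 3 6];
precision_at_hits = (1:numel(ranks)) ./ ranks; % [1.0000 0.6667 0.5000]
ap = mean(precision_at_hits);                  % 0.7222
```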

Also, it doesn't make any sense to plot the average precision. When average precision is reported in scientific papers, there is no plot... just a single value! The only way I could see you plotting this is with a bar graph, where the y-axis denotes the average precision while the x-axis denotes which retrieval system you are comparing. The higher the bar, the better the accuracy. However, a table showing all of the different retrieval systems, each with its average precision, is more than suitable. This is what is customarily done in most CBIR research papers.
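If you did want a visual comparison, a minimal sketch of such a bar graph could look like this (the system names and AP values below are made up):

```matlab
% Hypothetical average precisions for three retrieval systems
systems   = {'RGB hist', 'HSV hist', 'SIFT BoW'}; % made-up system names
ap_values = [0.72 0.78 0.85];                     % made-up AP values
bar(ap_values);
set(gca, 'XTick', 1:numel(systems), 'XTickLabel', systems);
ylabel('Average Precision');
ylim([0 1]);
title('Comparison of Retrieval Systems');
```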


To address your other question: you calculate the average rank using the average precision. Compute the average precision for every retrieval system you are testing, then sort the systems by it. Systems with higher average precision are ranked higher.
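Using the same made-up AP values as before, ranking is just a descending sort:

```matlab
% Rank hypothetical systems by average precision (highest first)
ap_values = [0.72 0.78 0.85];             % made-up AP values
[sorted_ap, rank_order] = sort(ap_values, 'descend');
% rank_order(1) is the index of the best-performing system (here, 3)
```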

rayryeng

This is what we use to compute average precision. There should be a randomization step because, if you are giving discrete scores to images, ties can inflate your results when the ground-truth images happen to sort to the top.

function ap = computeAP(label, score, gt)
    % Randomly permute the images first so that ties in score are broken
    % arbitrarily rather than in ground-truth order
    rand_index = randperm(length(label));
    label2 = label(rand_index);
    score = score(rand_index);
    [~, sids] = sort(score, 'descend');  % rank images by similarity score
    label2 = label2(sids);
    ids = find(label2 == gt);            % ranks at which relevant images appear
    ap = 0;
    for j = 1:length(ids)
        % precision at the j-th hit, divided by the number of relevant images
        ap = ap + j / (ids(j) * length(ids));
    end
    fprintf('%f \n', ap);
end
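For example, with hypothetical labels and scores, scoring a query whose relevant images carry label 1:

```matlab
% Hypothetical usage: 5 database images; those with label 1 are relevant
label = [1 0 1 0 1];             % ground-truth class of each image
score = [0.9 0.8 0.7 0.4 0.6];   % similarity of each image to the query
ap = computeAP(label, score, 1); % relevant ranks are 1, 3, 4 -> ap = 0.8056
```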
Bharat