
I'm using Indri with TrecEval and I'm wondering whether F-measure, precision, and recall can be used with ranked retrieval results.

If so, what would the F-measure (and the others) mean? Are those values meaningful in some way, for example for evaluating whether the queries are a good match for the corpus?

I know that MAP is meant for evaluating ranked results, but I'm wondering whether F-measure and the like may be useful for something else. I'm confused here; I did some research, but there is something I don't get.

Thanks for your help.

Alais

1 Answer


Precision, Recall, and F1 are set-based measures. This means that they score a set of documents, not a ranking.

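To apply them to a ranking, we typically evaluate these sorts of measures at fixed cutoffs, treating the top k documents as the retrieved set: k = 5, 10, 20, 50, 100, 500, 1000. Plotting the values across cutoffs gives a curve that shows how quality changes as you move down the ranking.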

Or you can report precision/recall at a single cutoff such as 20, i.e. roughly within the first two pages of results for most interfaces. F1 isn’t used much for IR, as our ranking measures (AP, NDCG, etc.) already balance precision and recall.

F1@20 will give you a number representing the harmonic mean of recall and precision within the best 20 documents according to your ranker.
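To make the cutoff idea concrete, here is a minimal sketch in Python. It is not how trec_eval is implemented; the function name `precision_recall_f1_at_k`, the document IDs, and the relevance judgments are all made up for illustration. It assumes precision is divided by k even when fewer than k documents were retrieved, and it combines the two values into F1 as their harmonic mean.

```python
def precision_recall_f1_at_k(ranking, relevant, k):
    """Set-based precision, recall and F1 over the top-k documents of a ranking.

    ranking  -- list of document IDs, best first (hypothetical data)
    relevant -- set of document IDs judged relevant for the query
    k        -- cutoff; the top-k documents are treated as the retrieved set
    """
    top_k = ranking[:k]
    hits = sum(1 for doc in top_k if doc in relevant)
    # Assumption: divide by k, even if the ranking is shorter than k.
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    # F1 is the harmonic mean of precision and recall (0 when there are no hits).
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Made-up example: 10 retrieved documents, 4 relevant ones (one never retrieved).
ranking = ["d3", "d7", "d1", "d9", "d4", "d2", "d8", "d5", "d6", "d0"]
relevant = {"d3", "d1", "d5", "d42"}

for k in (5, 10):
    p, r, f = precision_recall_f1_at_k(ranking, relevant, k)
    print(f"k={k}: P={p:.3f} R={r:.3f} F1={f:.3f}")
```

Evaluating the same function at several cutoffs gives the points of the curve mentioned above: precision tends to fall and recall to rise as k grows.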

John Foley
  • First of all, thanks for your answer. If I understand correctly, Precision basically asks "How many relevant documents do I have among my retrieved documents?", whereas MAP is more like "Are these relevant documents ranked in the first pages?" Or maybe it is P@10 that says that? – Alais Mar 18 '18 at 19:44
  • MAP asks "What is the average precision at the recall points?" - which is roughly -- how early are all the relevant documents ranked? P@10 asks "How many relevant documents did I find in the top 10"? R@10 asks "What fraction of the total relevant documents have I found in the top 10". – John Foley Mar 19 '18 at 20:01
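
The "average precision at the recall points" described in the comment above can be sketched as follows. This is an illustrative version only, not trec_eval's code: the function names and the toy data are hypothetical, and it assumes relevant documents that are never retrieved contribute a precision of zero.

```python
def average_precision(ranking, relevant):
    """Average of the precision values taken at each rank where a relevant
    document appears (the "recall points")."""
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    # Assumption: unretrieved relevant documents count as precision 0,
    # so we divide by the total number of relevant documents.
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """MAP: the mean of average precision over queries.

    runs -- list of (ranking, relevant) pairs, one per query
    """
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy data: relevant documents at ranks 1 and 3 out of 4 retrieved.
ap = average_precision(["a", "b", "c", "d"], {"a", "c"})
print(round(ap, 4))
```

Moving a relevant document to an earlier rank raises the precision at its recall point, which is why AP rewards rankings that put relevant documents early.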