I have a dictionary called ret_docs of retrieved documents (values) keyed by query (keys) that looks like this:
{'q1': ['d51', 'd874', 'd486', 'd329', 'd114'],
'q2': ['d51', 'd1147', 'd12', 'd100', 'd114'],
'q3': ['d707', 'd144', 'd542', 'd329', 'd395'],
'q4': ['d189', 'd575', 'd179', 'd1182', 'd160'],
'q5': ['d730', 'd329', 'd1066', 'd14', 'd163'],
'q6': ['d798', 'd97', 'd99', 'd927', 'd1195'],
'q7': ['d189', 'd1347', 'd423', 'd1040', 'd174'],
'q8': ['d234', 'd122', 'd189', 'd160', 'd197'],
'q9': ['d270', 'd45', 'd97', 'd123', 'd193'],
'q10': ['d493', 'd302', 'd1199', 'd949', 'd1214']
}
And a dictionary of relevance judgements called reljudges, listing the relevant documents for each query:
{'q1': ['d184',
'd29',
'd31',
'd12',
'd51',
'd102',
'd13',
'd14',
'd15',
'd57',
'd378',
'd859',
'd185',
'd30',
'd37',
'd52',
'd142',
'd195',
'd875',
'd56',
'd66',
'd95',
'd462',
'd497',
'd858',
'd876',
'd879',
'd880'],
'q2': ['d12',
'd15',
'd184',
'd858',
'd51',
'd102',
'd202',
'd14',
'd52',
'd380',
'd746',
'd859',
'd948',
'd285',
'd390',
'd391',
'd442',
'd497',
'd643',
'd856',
'd857',
'd877',
'd864',
'd658'],
'q3': ['d5', 'd6', 'd90', 'd91', 'd119', 'd144', 'd181', 'd399'],
'q4': ['d236', 'd166'],
'q5': ['d552', 'd401', 'd1297', 'd1296'],
'q6': ['d99', 'd115', 'd257', 'd258'],
'q7': ['d20', 'd56', 'd57', 'd58', 'd19'],
'q8': ['d48',
'd122',
'd20',
'd58',
'd196',
'd354',
'd360',
'd197',
'd999',
'd1112',
'd1005'],
'q9': ['d21', 'd22', 'd550'],
'q10': ['d259', 'd405', 'd302', 'd436', 'd437', 'd438', 'd998', 'd1011']
}
I have written up this code to calculate the "precision @ N":
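# Keep only the top n documents per query; n == -1 (or n larger than the
# ranking) means "use the full ranking".
if n != -1 and n <= len(list(ret_docs.values())[0]):
    ret_docs = {key: value[:n] for key, value in ret_docs.items()}

# Collect, per query, the retrieved documents that are judged relevant.
rel_ret = {}
for key, val in ret_docs.items():
    rel_ret[key] = []
    for i in val:
        if i in reljudges[key]:
            rel_ret[key].append(i)

# Precision = (relevant retrieved) / (retrieved), per query.
prec = {k: len(rel_ret[k]) / len(ret_docs[k]) for k in ret_docs.keys() & reljudges}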
Which outputs a dict that looks like this (dummy values):
{
'q1' : 0.1,
'q2' : 0.3,
...,
'q10': 0.2
}
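The computation above can also be wrapped in a function. This is only a minimal sketch: the name precision_at_n and the n=-1 default are assumptions for illustration, and the sample data is a two-query subset of the dictionaries above (the q1 judgements are truncated for brevity).

```python
def precision_at_n(ret_docs, reljudges, n=-1):
    """Return {query: precision@n}. n == -1 means "use the full ranking"."""
    prec = {}
    # Only score queries that have both a ranking and relevance judgements.
    for query in ret_docs.keys() & reljudges.keys():
        ranked = ret_docs[query] if n == -1 else ret_docs[query][:n]
        relevant = set(reljudges[query])  # set gives O(1) membership tests
        hits = sum(1 for doc in ranked if doc in relevant)
        # Guard against an empty ranking (e.g. n == 0).
        prec[query] = hits / len(ranked) if ranked else 0.0
    return prec


# Two queries from the data above; q1's judgements are truncated.
ret_docs = {
    'q1': ['d51', 'd874', 'd486', 'd329', 'd114'],
    'q3': ['d707', 'd144', 'd542', 'd329', 'd395'],
}
reljudges = {
    'q1': ['d184', 'd29', 'd31', 'd12', 'd51'],
    'q3': ['d5', 'd6', 'd90', 'd91', 'd119', 'd144', 'd181', 'd399'],
}

p5 = precision_at_n(ret_docs, reljudges, n=5)
p1 = precision_at_n(ret_docs, reljudges, n=1)
print(p5)
print(p1)
```

Slicing with value[:n] already returns the whole list when n exceeds its length, so no separate length check is needed in this version.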
I need to use these same ret_docs and reljudges to calculate the average precision and the mean average precision. My understanding is that average precision is the average of the precisions obtained at several values of n, but how do I know which values of n to use, or is this not even necessary? Average precision would be a dictionary just like the prec dictionary, except holding the average precision for each query instead. I know that mean average precision would just be something like:
import numpy as np

mean_avg_pre = np.array(list(avg_pre.values())).mean()
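Under one common definition, no n has to be chosen by hand: average precision takes the precision at every rank where a relevant document appears, sums those values, and divides by the total number of relevant documents for the query. The sketch below assumes that definition (some variants divide by the number of relevant documents actually retrieved instead). The function names are hypothetical, and the sample data is a three-query subset of the dictionaries above.

```python
def average_precision(ranked, relevant):
    """AP for one query: sum of precision@k at each relevant rank k,
    divided by the total number of relevant documents (assumed definition)."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant)


def mean_average_precision(ret_docs, reljudges):
    """Return ({query: AP}, MAP) over queries present in both dicts."""
    avg_pre = {
        q: average_precision(ret_docs[q], reljudges[q])
        for q in ret_docs.keys() & reljudges.keys()
    }
    mean_avg_pre = sum(avg_pre.values()) / len(avg_pre) if avg_pre else 0.0
    return avg_pre, mean_avg_pre


# Three queries copied from the data above.
ret_docs = {
    'q3': ['d707', 'd144', 'd542', 'd329', 'd395'],
    'q4': ['d189', 'd575', 'd179', 'd1182', 'd160'],
    'q8': ['d234', 'd122', 'd189', 'd160', 'd197'],
}
reljudges = {
    'q3': ['d5', 'd6', 'd90', 'd91', 'd119', 'd144', 'd181', 'd399'],
    'q4': ['d236', 'd166'],
    'q8': ['d48', 'd122', 'd20', 'd58', 'd196', 'd354', 'd360',
           'd197', 'd999', 'd1112', 'd1005'],
}

avg_pre, mean_avg_pre = mean_average_precision(ret_docs, reljudges)
print(avg_pre)
print(mean_avg_pre)
```

For q8, relevant documents appear at ranks 2 and 5, so the sum is 1/2 + 2/5 and the result is that sum divided by the 11 judged-relevant documents. A query such as q4, with no relevant document retrieved, contributes 0 to the mean.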