metrics to rank text files

Question

I have a set of text files in a particular domain. I need to rank the files based on some metric.

Please suggest a few metrics that could be used to rank my text files (term frequency, size, frequency of use, etc.). I would then like to use text mining techniques to rank the files based on one of these metrics.

Please explain more clearly what you're trying to do and which language you're using, and paste any code you've already written along with the errors and questions it raises. — Pabluez, Dec 20 '11 at 20:25
I have a set of files in a particular domain and I need to rank them by different metrics. I have to come up with the metrics the ranking can be based on, and I am on the lookout for candidates. — siddharth, Dec 21 '11 at 04:53
I aim to find the best measure for ranking files in a particular domain. I want the computer to work like an expert scholar and rank the files from a repository. I haven't started coding, as I am unable to move forward without solving this issue. — siddharth, Dec 21 '11 at 05:02

score 0 · Accepted Answer · answered Dec 23 '11 at 03:30

The major issue I had come across was how to rank the documents: by their relevance or by some other metric.

I have now come to the conclusion that ranking documents by their content (relevance) gives better results.

I am using a vector-based approach to rank documents against the search words given in the query. I am not sure it is the best approach, but it gives results with average accuracy.
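For anyone landing here, this is a minimal sketch of what a vector-based ranking can look like. It is not the exact code I use; it assumes TF-IDF weights with a smoothed IDF and cosine similarity between the query and each document, and the function names (`tokenize`, `rank_documents`) and the sample file names are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_documents(docs, query):
    """Return (name, score) pairs, best match first.

    docs maps a document name to its text. The score is the cosine
    similarity between TF-IDF vectors (an assumed weighting scheme).
    """
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    def tfidf(tokens):
        # Raw term count times a smoothed IDF, so weights stay positive
        # even for terms that occur in every document.
        counts = Counter(tokens)
        return {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in counts.items()
        }

    def cosine(a, b):
        dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return dot / (norm_a * norm_b)

    query_vec = tfidf(tokenize(query))
    scores = [
        (name, cosine(query_vec, tfidf(tokens)))
        for name, tokens in tokenized.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data, only to show the call.
docs = {
    "a.txt": "vector space models rank documents by cosine similarity",
    "b.txt": "file size and modification date are simple metadata",
    "c.txt": "term frequency counts how often a term appears in documents",
}
ranking = rank_documents(docs, "rank documents")
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Documents that share no term with the query get a score of 0, so purely content-based ranking says nothing about them; metadata such as size or frequency of use would have to be combined separately.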

siddharth

I'm still not certain what you're trying to accomplish from your question, but I get a better sense from your answer here. This might be helpful: it is an answer to a slightly different (maybe) question, but it may still help. http://stackoverflow.com/a/2278780/321143 – Ellie Kesselman Dec 23 '11 at 03:41
