2

I am building a blog aggregator like Techmeme that finds the most popular posts from several blogs. Unlike Techmeme, I first aggregate blog posts from a variety of RSS feeds, then save the headlines and relevant URLs in a database. After that, I have to find which blog posts are the most popular.

To determine the top blog post headlines, I track Facebook and Twitter share counts for every post of every blog and rank the posts by their share counts. But that isn't the best solution, because some bloggers can cheat by inflating their counts with fraudulent shares.

So my question is: what criteria could I use to determine the most popular posts? What would be a better algorithm for ranking blog posts?

Community
  • 1
  • 1
  • Google Trends gives a daily unique visitor count. However, it doesn't look like there is any kind of official API for it. Not really sure how well it would work with blog posts, since I figure they likely aren't navigated to from a Google search. http://trends.google.com/websites – Danny Mar 05 '12 at 16:50
  • But there isn't data for all blogs or blog posts; it only covers globally popular ones. Since my project is local, not global, this tool doesn't help me :( –  Mar 05 '12 at 16:56

3 Answers

2

Since the term 'popular' in this context is vague, I would define the popularity of posts according to my own criteria. Combine all the suggested answers and build a reasonable reputation system for the blog posts. For instance, I would do something like this:

  • facebook share x 2
  • twitter share x 3
  • pagerank of the domain x 2
  • 50 000 / global alexa rating
  • and so on

Finally, you can sum all of these up and compare. Moreover, you can develop further criteria that take into account the size of the post, the number of images within it, etc.
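Here is a minimal sketch of that weighted sum in Python, assuming the share counts and domain metrics have already been fetched and stored with each post; the field names and the weights are just the illustrative values from above, not recommendations:

```python
def popularity_score(post):
    """Combine several signals into a single popularity score."""
    score = 0.0
    score += post.get("facebook_shares", 0) * 2
    score += post.get("twitter_shares", 0) * 3
    score += post.get("domain_pagerank", 0) * 2
    alexa_rank = post.get("alexa_rank")
    if alexa_rank:  # lower Alexa rank means more traffic
        score += 50_000 / alexa_rank
    return score

posts = [
    {"title": "Post A", "facebook_shares": 120, "twitter_shares": 80,
     "domain_pagerank": 5, "alexa_rank": 25_000},
    {"title": "Post B", "facebook_shares": 300, "twitter_shares": 10,
     "domain_pagerank": 2, "alexa_rank": 400_000},
]

# Rank posts by the combined score, highest first.
for post in sorted(posts, key=popularity_score, reverse=True):
    print(post["title"], round(popularity_score(post), 1))
```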

seferov
  • 4,111
  • 3
  • 37
  • 75
  • How do you decide the multiplicative factor for shares/likes etc. I mean, why (Facebook share x 2) and not (Facebook share x 30) – Jayesh Dec 22 '13 at 08:54
  • @Jayesh I just made them up for the sake of the example. It is up to you (the importance you give to each signal). – seferov Dec 22 '13 at 21:59
  • Thanks @Ferhad I just wanted to understand what's the process to get there. Is it always that you start with random weights and try to adjust the error over a period or is there any definitive way to get these? – Jayesh Dec 23 '13 at 04:36
0

It may be possible to estimate the joint distribution of shares across different sources. It's hard to detect fraud from any single (marginal) metric, but it's much harder to fake a holistic, "organic" profile across all of them.
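A rough sketch of how that could look, assuming each post record already carries its Facebook and Twitter share counts; the log-ratio z-score used here is just one illustrative way to flag posts whose profile deviates from the crowd:

```python
import math
import statistics

def suspicious_posts(posts, threshold=3.0):
    """Flag posts whose Facebook/Twitter share ratio is an outlier."""
    log_ratios = [
        math.log((p["facebook_shares"] + 1) / (p["twitter_shares"] + 1))
        for p in posts
    ]
    mean = statistics.mean(log_ratios)
    stdev = statistics.stdev(log_ratios) or 1.0  # avoid division by zero
    flagged = []
    for post, log_ratio in zip(posts, log_ratios):
        z = (log_ratio - mean) / stdev
        if abs(z) > threshold:  # shares on one network wildly out of line
            flagged.append((post["title"], round(z, 2)))
    return flagged
```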

phs
  • 10,687
  • 4
  • 58
  • 84
0

How about using a variation of PageRank?

Here are more details: http://pr.efactory.de/e-pagerank-algorithm.shtml and http://en.wikipedia.org/wiki/PageRank
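For illustration, a bare-bones power-iteration PageRank over a link graph of blog posts, assuming you have already extracted which posts link to which; this follows the standard algorithm described at those links, simplified by ignoring dangling-node mass:

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each post URL to the list of URLs it links to."""
    nodes = set(links) | {u for outs in links.values() for u in outs}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Base (teleport) probability for every node.
        new_rank = {n: (1 - damping) / len(nodes) for n in nodes}
        for node, outs in links.items():
            if not outs:
                continue  # dangling node: its mass is dropped for simplicity
            share = damping * rank[node] / len(outs)
            for target in outs:
                new_rank[target] += share
        rank = new_rank
    return rank

graph = {
    "blog-a/post-1": ["blog-b/post-2"],
    "blog-b/post-2": ["blog-a/post-1", "blog-c/post-3"],
    "blog-c/post-3": [],
}
for url, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
    print(url, round(score, 3))
```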

Andrew
  • 7,619
  • 13
  • 63
  • 117