I'm trying to replicate Reddit's hot algorithm for sorting my posts. Here's my function:

def hot(self):
    s = self.upvotes
    baseScore = log(max(s, 1))
    now = datetime.now()

    timeDiff = (now - self.post.date).days
    if (timeDiff > 1):
        x = timeDiff - 1
        baseScore = baseScore * exp(-8 * x * x)
    print('Final:', baseScore) #always prints 0
    return baseScore

Basically, exp(-8 * x * x) always makes the number 0, so I'm curious how I'm supposed to make this algorithm work.

Any idea?

Zorgan
  • Well `exp(- very_big_number)` is always (very close to) `0`. – Willem Van Onsem Jun 01 '18 at 12:36
  • As @WillemVanOnsem is saying... your posts are just not very hot... – Olivier Melançon Jun 01 '18 at 12:37
  • Counting in whole days is maybe a bit too aggressive: the best case is 1 (0 days), and then you jump to about 3e-4. Maybe use hours instead to make the decay more gradual? – CoMartel Jun 01 '18 at 12:39
  • In case the post is two days old, you have already reduced it to 0.03% of its value. So if your `baseScore` were 10k (which is unreachable, since you use a `log`), only about 3 would remain. For three days the factor is `exp(-32)`, roughly 1.3e-14, which is effectively `0`. So you "cool down" the posts too fast, I guess. – Willem Van Onsem Jun 01 '18 at 12:43
  • To get your algorithm to work, you'll want to vary your exponential function so that the dropoff is not too steep given your time intervals. Your linked question gives a good explanation of how to do that. Tune the argument of `exp()` so that you can get the granularity you desire, since they are using `timeDiff = (now - track.uploaded).toWeeks` rather than days – C.Nivs Jun 01 '18 at 12:49

2 Answers

The problem comes from this line:

baseScore = baseScore * exp(-8 * x * x)

Since x only takes values in days, it will always be an integer. Now if x == 0, then you get exp(-8 * x * x) == 1, but as soon as x == 1, then it gets very close to 0. Bottom line: your function is not continuous.

What you want is to gradually decrease the hotness of a post. This can be achieved by letting x take values between 0 and 1. One way would be to take your time delta in minutes and thus allow for fractional days.

timeDiff = (now - self.post.date).total_seconds() / 60 / 1440  # timedelta has no .minutes; convert seconds to minutes, then to fractional days

Then posts would stay hot for a few hours.
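
To see how gradual the cooling becomes with fractional days, here is a minimal sketch. `decay_factor` is a hypothetical helper, not part of the original code; it only isolates the `exp(-8 * x * x)` multiplier from the question and assumes the same "no cooling during the first day" rule.

```python
from math import exp

def decay_factor(age_days):
    """Multiplier from the question's formula for an age in (fractional) days."""
    if age_days <= 1:
        return 1.0  # no cooling during the first day, as in the question
    x = age_days - 1
    return exp(-8 * x * x)

# Print the multiplier at a few ages expressed in hours.
for hours in (24, 27, 30, 36, 48):
    print(hours, "h ->", round(decay_factor(hours / 24), 4))
```

With these numbers the multiplier falls from 1.0 at 24 hours to about 0.61 at 30 hours and about 0.14 at 36 hours, instead of jumping straight to `exp(-8)`.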

Olivier Melançon

In short: you have created an extreme ice storm in which posts simply freeze to death after 48 hours.

There is nothing "wrong" with your algorithm, but you let the scores "cool down" too fast.

Imagine that a post is two days old, so the if clause is triggered. In that case x = 1, and the exp(..) evaluates to:

>>> exp(-8)
0.00033546262790251185

That's right. 0.00033..., or 0.03%. So that means if your post got 10 000 votes, the base score is 9.21, and after this multiplication, only:

>>> log(10000) * exp(-8)
0.003089724985059729

Yes, the cooling scheme should ensure that eventually everything cools down, but not by putting the posts into an ice storm.

You can, for example, remove the 8 * factor. Then, on the second day, we multiply the score by ~0.37, or 36.79%. You can experiment a bit with the factor or with other parts of the cooling scheme, and thus let the posts cool down nicely.
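
A quick way to experiment with the factor is to tabulate the multiplier for a few candidate values. This is only a sketch: `cooling_factor` and its `rate` parameter are illustrative names, with `rate=8` reproducing the question's formula and `rate=1` the variant without the 8.

```python
from math import exp

def cooling_factor(x, rate):
    """Multiplier exp(-rate * x^2), where x is the number of days past the first."""
    return exp(-rate * x * x)

# Compare how quickly each candidate rate cools a post over x = 1, 2, 3.
for rate in (8, 1, 0.1):
    factors = [cooling_factor(x, rate) for x in (1, 2, 3)]
    print("rate", rate, ["%.2e" % f for f in factors])
```

The smaller the rate, the longer a post keeps a meaningful share of its score.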

Another aspect is that the time is quite discretized: you count the number of whole days. That means that as long as the second day is not entirely over, the factor is 1, but from the moment the second day is over, the "temperature" of the post makes a gigantic drop. You could use the number of seconds and divide by 86400 instead:

timeDiff = (now - self.post.date).total_seconds() / 86400  # continuum
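
Putting the pieces together, one possible standalone version of the function could look like the sketch below. It is written as a plain function rather than a model method, and the `now` and `rate` parameters are assumptions added here to make it easy to test and tune; the original uses `self.upvotes` and `self.post.date`.

```python
from datetime import datetime, timedelta
from math import exp, log

SECONDS_PER_DAY = 86400

def hot(upvotes, posted_at, now=None, rate=8.0):
    """Log-scaled score that cools continuously once the post is over a day old."""
    now = now or datetime.now()
    base_score = log(max(upvotes, 1))
    # Fractional days, so the score does not jump at day boundaries.
    age_days = (now - posted_at).total_seconds() / SECONDS_PER_DAY
    if age_days > 1:
        x = age_days - 1
        base_score *= exp(-rate * x * x)
    return base_score

# Score the same 100-upvote post at several ages, using a fixed reference time.
now = datetime(2018, 6, 1, 12, 0)
for hours in (12, 30, 36, 48):
    print(hours, "h old:", hot(100, now - timedelta(hours=hours), now=now))
```

Passing `now` explicitly keeps the result reproducible; lowering `rate` slows the cooling.
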
Willem Van Onsem