11

While displaying the download status in a window, I have information like:

1) Total file size (f)

2) Downloaded file size (f')

3) Current download speed (s)

A naive time-remaining calculation would be (f-f')/s, but this value is way too shaky (6m remaining / 2h remaining / 5m remaining! déjà vu?! :)

Is there a calculation that is both more stable and not extremely wrong (e.g. showing 1h even when the download is about to complete)?

Abhishek
  • I do not believe that smooth averaging would cut it; I have watched my Ubuntu update download half a gig worth of files, and I noticed a very clear pattern: larger files average a much faster speed (up to 5x or sometimes even 10x) than smaller files. If you make assumptions about the protocol being used (e.g. ftp, http), you should be able to deduce why this variation occurs (not easy, given all the details involved, but not impossible either). Basically, as time goes by, I would try to improve the function that predicts the download speed and time as a function of size. – Hamish Grubijan Nov 06 '10 at 17:38

4 Answers

14

We solved a similar problem in the following way. We weren't interested in how fast the download was over the entire time, just roughly how long it was expected to take based on recent activity but, as you say, not so recent that the figures would be jumping all over the place.

The reason we weren't interested in the entire time frame was that a download could do 1M/s for half an hour then jump to 10M/s for the next ten minutes. That first half hour drags down the average speed quite severely, despite the fact that you're now honkin' along at quite a pace.

We created a circular buffer with each cell holding the amount downloaded in a 1-second period. The circular buffer size was 300, allowing for 5 minutes of historical data, and every cell was initialized to zero.

We also maintained a total (the sum of all entries in the buffer, so also initially zero) and the count (zero, obviously).

Every second, we would figure out how much data had been downloaded since the last second and then:

  • subtract the current cell's value from the total.
  • put the current figure into that cell and advance the cell pointer.
  • add that current figure to the total.
  • increase the count if it wasn't already 300.
  • update the figure displayed to the user, based on total / count.

Basically, in pseudo-code:

def init (sz):
    buffer = new int[sz]
    for i = 0 to sz - 1:
        buffer[i] = 0 
    total = 0
    count = 0
    index = 0
    maxsz = sz

def update (kbps):
    total = total - buffer[index] + kbps
    buffer[index] = kbps
    index = (index + 1) % maxsz
    if count < maxsz:
        count = count + 1
    return total / count

You can change the resolution (1 second) and history (300) to suit your situation, but we found 5 minutes was long enough to smooth out the irregularities while still adjusting to more permanent changes in a timely fashion.
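In runnable form, the pseudo-code above might look like this Python sketch (the class and method names are my own; units are whatever you feed in per second, e.g. the answer's kbps):

```python
class SpeedAverager:
    """Rolling average of per-second download figures over a fixed window.

    A runnable version of the answer's pseudo-code; size=300 gives the
    5 minutes of 1-second history described above.
    """

    def __init__(self, size=300):
        self.buffer = [0] * size   # amount downloaded in each 1-second slot
        self.total = 0             # sum of all slots currently in the window
        self.count = 0             # slots holding real data (caps at size)
        self.index = 0             # next slot to overwrite
        self.maxsz = size

    def update(self, amount_this_second):
        # Evict the oldest sample and insert the newest in one step.
        self.total = self.total - self.buffer[self.index] + amount_this_second
        self.buffer[self.index] = amount_this_second
        self.index = (self.index + 1) % self.maxsz
        if self.count < self.maxsz:
            self.count += 1
        return self.total / self.count   # average per second over the window
```

Calling `update` once per second with that second's downloaded amount returns the smoothed speed; divide the remaining bytes by it for the displayed estimate.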

paxdiablo
  • shouldn't it be: _count = **min** (maxsz, count + 1)_ ? – Felix May 18 '11 at 16:01
  • @phix23: good catch. I can't believe nobody's noticed that for over a _year._ Anyhoo, I've modified it to work correctly and removed the dependency on a min function altogether. – paxdiablo May 18 '11 at 23:11
  • 1
    I can't believe it either; maybe I'm the first who really used this code – Felix May 18 '11 at 23:34
10

Smooth s (exponential moving avg. or similar).
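A minimal sketch of exponential smoothing (the smoothing factor `alpha` is a tuning choice, not prescribed by this answer; lower values are smoother but react more slowly):

```python
def make_ema(alpha=0.1):
    """Exponential moving average: value = alpha*sample + (1-alpha)*value."""
    state = {"value": None}

    def update(sample):
        if state["value"] is None:
            state["value"] = sample   # seed with the first speed sample
        else:
            state["value"] = alpha * sample + (1 - alpha) * state["value"]
        return state["value"]

    return update

smooth = make_ema(alpha=0.5)
# seconds_remaining = (f - f_downloaded) / smooth(current_speed)
```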

Jonathan Graehl
4

I prefer using the average speed over the last 10 seconds and dividing the remaining size by that. Dividing by the current speed is way too unstable, while dividing by the average over the whole download cannot handle permanent speed changes (like another download starting).
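A sketch of this approach, assuming one speed sample per second (the function names are illustrative; `collections.deque` with `maxlen` drops the oldest sample automatically):

```python
from collections import deque

def make_recent_eta(window=10):
    """ETA from the average speed over the last `window` one-second samples."""
    samples = deque(maxlen=window)   # bytes downloaded in each recent second

    def eta(bytes_this_second, bytes_remaining):
        samples.append(bytes_this_second)
        avg_speed = sum(samples) / len(samples)
        return bytes_remaining / avg_speed   # seconds remaining

    return eta
```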

Cem Kalyoncu
0

Why not compute the download speed as an average over the whole download, that is:

s = f' / elapsed time

That way it would smooth out over time.
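As a worked sketch of this formula (the function and parameter names are my own, and elapsed time is passed in rather than measured):

```python
def whole_download_eta(elapsed_seconds, downloaded, total):
    """ETA using the average speed over the entire download so far:
    s = downloaded / elapsed, remaining time = (total - downloaded) / s."""
    speed = downloaded / elapsed_seconds       # s = f' / elapsed time
    return (total - downloaded) / speed        # seconds remaining
```

For example, 30 MB downloaded in 60 seconds gives s = 0.5 MB/s, so a 90 MB file would show 120 seconds remaining.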

Drew Hall
  • 3
    Think of an hour-long download at 100kb/s where you've downloaded 50% of it. Then you start another download and the speed drops to 50kb/s; your time indicator will be wrong until the end of the download. It will say 30 min remaining when it will clearly take an hour to download the rest. – Cem Kalyoncu Nov 24 '09 at 06:54
  • 2
    Actually, you cannot predict at all how long the second download will take. That second download might be for 50KB total. In that case, you'd show 1 hour remaining for all of 8 seconds, after which it rebounds to 30 minutes. Predicting is hard, especially when it comes to the future :P – MSalters Dec 02 '09 at 10:26