1

How could I estimate the instantaneous throughput ? For example, in a way similar to what the browser does when downloading a file. It's not just a mean throughput, but rather the an instantaneous estimation, maybe with a 'moving average'. I'm looking for the algorithm, but you can specify it in c++. Ideally, it would not involve a thread (i.e., being continuously refreshed, say every second) but rather be only evaluated when the value is asked.

NaomiJO
  • 313
  • 1
  • 3
  • 13
  • There's no sane definition of "instantaneous throughput" in packet networks. It's almost-always zero except when it's infinity. You almost certainly want a mean throughput, but over a fixed timeperiod (e.g. average over last minute). – MSalters Jul 31 '12 at 08:17
  • okay. is this how firefox or other downloading programs work ? – NaomiJO Aug 03 '12 at 13:48
  • If they're lazy. Smarter programs use a more complex Moving Average function, which uses both recent and older averages but gives more weight to the recent averages. – MSalters Aug 03 '12 at 14:29

2 Answers2

2

You can use an exponential moving average, as explained here, but I'll repeat the formula:

accumulator = (alpha * new_value) + (1.0 - alpha) * accumulator

To achieve an estimation, suppose you intend to query the computation every second, but you want an average over the last minute. Then, here would be one way to get that estimate:

struct AvgBps {
    double rate_;            // The average rate
    double last_;            // Accumulates bytes added until average is computed
    time_t prev_;            // Time of previous update
    AvgBps () : rate_(0), last_(0), prev_(time(0)) {}
    void add (unsigned bytes) {
        time_t now = time(0);
        if (now - prev_ < 60) {       // The update is within the last minute
            last_ += bytes;           // Accumulate bytes into last
            if (now > prev_) {        // More than a second elapsed from previous
                // exponential moving average
                // the more time that has elapsed between updates, the more
                // weight is assigned for the accumulated bytes
                double alpha = (now - prev_)/60.0;
                rate_ = (1 -alpha) * last_ + alpha * rate_;
                last_ = 0;            // Reset last_ (it has been averaged in)
                prev_ = now;          // Update prev_ to current time
            }
        } else {                      // The update is longer than a minute ago
            rate_ = bytes;            // Current update is average rate
            last_ = 0;                // Reset last_
            prev_ = now;              // Update prev_
        }
    }
    double rate () {
        add(0);                       // Compute rate by doing an update of 0 bytes
        return rate_;                 // Return computed rate
    }
};

You should actually use a monotonic clock instead of time.

jxh
  • 69,070
  • 8
  • 110
  • 193
  • That's probably what I'm looking for, thanks a lot. I don't know in advanced how many times per second I will need to query the throughput value, and it won't be at fix time intervals. Would it be possible to compute the value just when it is accessed, as you did, but knowing that there is no predefined access times ? Could you also comment or explain your code more, please? I'm not really good at C++ and it's hard to read for a newbie, even though it's well written. – NaomiJO Aug 03 '12 at 13:53
  • @NaomiJO: The code doesn't care how often you query or how often you update. But if the update is longer than 60 seconds from the previous update, it just takes the current update as the rate over the last minute. – jxh Aug 03 '12 at 13:59
  • Thank you very much for the inline annotations, let me read that very carefully and I'll get back to you if I have any more questions. – NaomiJO Aug 03 '12 at 14:58
  • This algorithm is incorrect. Let's say you `add(1<<20)` at time=0 and then call `rate()` at time=1. You'll get a rate of 17476.3 B/s. Not what I'd expect, but okay, let's accept that the algorithm won't start producing correct results until after the first minute has elapsed. Rather than calling `rate()` at time=1, call it at time=59. You'll get about 1031100 B/s. The average went *up* despite no further calls to `add`?! In fact, the return value from `rate()` at time=59 depends on how often you had called `rate()` in the preceding minute. Counterintuitive at least — and probably wrong. – Matt Whitlock Mar 06 '21 at 21:19
  • @MattWhitlock I thought I had replied earlier, but ... Thanks for your input! Upon reading your comment, I examined my code and discovered I had applied the weight incorrectly, and have fixed it. – jxh May 05 '22 at 15:44
0

You probably want a boxcar average.

Just keep the last n values, and average them. For each subsequent block, subtract out the oldest and add in the most recent. Note that for floating point values, you may get some aggregated error, in which case you might want to recalculate the total from scratch every m values. For integer values of course, you don't need something like that.

user1277476
  • 2,871
  • 12
  • 10
  • That's an interesting idea. I don't get your remark regarding floating point values. Could you give a more detailed explanation of the problem that might occur. And what do you mean by 'subsequent block' (how is it refreshed when no packets are being sent ? Is it refreshed every x milliseconds ? Ideally, I would rather have something that is only refreshed when (1) a packet is sent or (2) the value needs to be accessed. This could take the form of a internal variable that keeps a record of the last time it was accessed) – NaomiJO Aug 03 '12 at 13:45
  • Floating point arithmetic is subject to slight errors of precision. Integer arithmetic is exact, assuming that you're only dealing with integers. http://docs.python.org/tutorial/floatingpoint.html I'd probably recompute from scratch every n blocks if using floating point. – user1277476 Aug 03 '12 at 16:36