0

I have created this graph that simply shows the number of deaths against the days since the first death.

What I think would be interesting was to show the rate of change for each country but I'm struggling as to how to do this with the data.

An example of one country's data is below:

{
  "gb": {
    "data": [
      // last 4 records
      {
        "x": 26,
        "date": "31/03/2020",
        "y": 1793,
        "country": "GBR",
        "delta": 382
      },
      {
        "x": 27,
        "date": "01/04/2020",
        "y": 2357,
        "country": "GBR",
        "delta": 564
      },
      {
        "x": 28,
        "date": "02/04/2020",
        "y": 2926,
        "country": "GBR",
        "delta": 569
      },
      {
        "x": 29,
        "date": "03/04/2020",
        "y": 3611,
        "country": "GBR",
        "delta": 685
      }
    ]
  }

The delta is simply the difference from the day before.

I do remember from maths that the rate of change was dy/dx.

My first step would be to get the equation of a curve given.

I know how to do the rest but how can I get the equation of a line given a set of coordinates in javascript?

dagda1
  • 26,856
  • 59
  • 237
  • 450

2 Answers2

1

Real world data is messy. It would be difficult to find a function that would perfectly fit your data (and if you did new data points would certainly not fit your function). You can approximate your data with what is called regression. The simplest regression algorithm attempts to find a straight line that best fits your data (linear regression).

An example of linear regression from https://en.wikipedia.org/wiki/Linear_regression

An example of linear regression from https://en.wikipedia.org/wiki/Linear_regression. We are looking for the equation of the red line.

My approach would be to find a straight line that fits your data (using linear regression) and use its rate of change as a sort of an approximate average rate of change for your data. (Also note, functions which are not a straight line do not have a constant rate of change. That is, the rate of change is different at every point).

Now the spread of viruses does not exactly follow a straight line - it is exponential. Luckily plotting an exponential function on a logarithmic scale produces a straight line. So the first step would be to convert the data from (x, y) to (x, log y).

exponential growth of coronavirus from worldometers.info logarithmic scale from worldometers.info

Here the same data from worldometers.info of the spread of coronavirus after being converted to logarithmic scale.

Now that our data looks more or less like a straight line we can use linear regression to find the equation of the straight line that best fits the data.

A simple approach in javascript can be found here: Linear Regression in Javascript

function linearRegression(y,x) {
        var lr = {};
        var n = y.length;
        var sum_x = 0;
        var sum_y = 0;
        var sum_xy = 0;
        var sum_xx = 0;
        var sum_yy = 0;

        for (var i = 0; i < y.length; i++) {

            sum_x += x[i];
            sum_y += y[i];
            sum_xy += (x[i]*y[i]);
            sum_xx += (x[i]*x[i]);
            sum_yy += (y[i]*y[i]);
        } 

        lr['slope'] = (n * sum_xy - sum_x * sum_y) / (n*sum_xx - sum_x * sum_x);
        lr['intercept'] = (sum_y - lr.slope * sum_x)/n;
        lr['r2'] = Math.pow((n*sum_xy - sum_x*sum_y)/Math.sqrt((n*sum_xx-sum_x*sum_x)*(n*sum_yy-sum_y*sum_y)),2);

        return lr;
}

The function takes an array of y coordinates (in our case the Math.log of your original coordinates) and an array of x coordinates and returns the object lr. lr['slope'] contains the rate of change of the line - that would be an approximated average rate of change for your data.

This would at least be my approach. People more familiar with statistics and machine learning can certainly offer something better. Hope it helps.

domondo
  • 926
  • 7
  • 9
0

There is a d3-regression library which does that for a common type of curves

bumbeishvili
  • 1,342
  • 14
  • 27