I need to resample time series in Node.js, so I would like to know whether there is a JavaScript tool that works similarly to pandas in Python.

Let's say I have data that looks like this example:

[{
    "time": "28-09-2018 21:29:04",
    "value1": 1280,
    "value2": 800
},
{   
    "time": "28-09-2018 21:38:56",
    "value1": 600,
    "value2": 700
},
{
    "time": "29-09-2018 10:40:00",
    "value1": 1100,
    "value2": 300
},
{
    "time": "29-09-2018 23:50:48",
    "value1": 140,
    "value2": 300
}]

In Python I would put this data into a pandas dataframe and then resample it into a new dataframe with a different sample rate. In this example to daily data:

import pandas
df = pandas.DataFrame(...)
df_days = df.resample('1440min').apply({'value1':'sum', 'value2':'sum'}).fillna(0)

So my new data would look something like this:

[{
    "time": "28-09-2018 00:00:00",
    "value1": 1880,
    "value2": 1500
},
{   
    "time": "29-09-2018 00:00:00",
    "value1": 1240,
    "value2": 600
}]

What is, in general, the best way to do this in Node.js / JavaScript?

sunwarr10r
  • Possible duplicate of [Python Pandas equivalent in JavaScript](https://stackoverflow.com/questions/30610675/python-pandas-equivalent-in-javascript) – Maor Refaeli Oct 03 '18 at 10:27
  • have you found a way to solve this? – user299791 Jun 04 '19 at 13:21
  • Yes, I export the data as an excel sheet and then execute a python script, which loads the data and performs resampling on it. I can post the code if you like. Unfortunately I couldn't find a native nodejs solution. – sunwarr10r Jun 04 '19 at 13:49
  • @baermathias I'm trying to do something similar, would you mind posting the code? – koper89 Apr 04 '20 at 15:27

2 Answers

I don't think you need a node.js/JS library for this task. What you want to achieve can be done with a reduce function.

var a = [{
    "time": "28-09-2018 21:29:04",
    "value1": 1280,
    "value2": 800
},
{   
    "time": "28-09-2018 21:38:56",
    "value1": 600,
    "value2": 700
},
{
    "time": "29-09-2018 10:40:00",
    "value1": 1100,
    "value2": 300
},
{
    "time": "29-09-2018 23:50:48",
    "value1": 140
}];

var b = Object.values(a.reduce((container, current) => {
  var date = current['time'].substring(0, 10);
  if (!container[date])
    container[date] = {time: date + ' 00:00:00', value1: current['value1'] || 0, value2: current['value2'] || 0};
  else {
    container[date]['value1'] += current['value1'] || 0;
    container[date]['value2'] += current['value2'] || 0;
  }
  return container;
}, {}));

This function creates an object keyed by date and aggregates the values into it. You need to check whether the date already exists in that object. The || 0 handles elements where a property is missing, so nothing breaks (a missing value would otherwise turn the sum into NaN), and Object.values extracts the values into an array. Since your dates are strings I treated them as strings; if they are Date objects, you have to adjust the line where date is declared.
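The same idea can be written as a reusable function that sums any list of fields. This is only a sketch: the name resampleDaily and its fields parameter are made up for illustration, and it assumes "time" is always a "DD-MM-YYYY HH:mm:ss" string as in the question.

```javascript
// Hypothetical helper: sums the given numeric fields per calendar day.
// Assumes record.time is a "DD-MM-YYYY HH:mm:ss" string.
function resampleDaily(records, fields) {
  const buckets = new Map(); // a Map keeps the days in first-seen order
  for (const record of records) {
    const day = record.time.substring(0, 10);
    if (!buckets.has(day)) {
      const empty = { time: day + " 00:00:00" };
      for (const field of fields) empty[field] = 0;
      buckets.set(day, empty);
    }
    const bucket = buckets.get(day);
    for (const field of fields) {
      bucket[field] += record[field] || 0; // a missing value counts as 0
    }
  }
  return Array.from(buckets.values());
}

const sample = [
  { time: "28-09-2018 21:29:04", value1: 1280, value2: 800 },
  { time: "28-09-2018 21:38:56", value1: 600, value2: 700 },
  { time: "29-09-2018 10:40:00", value1: 1100, value2: 300 },
  { time: "29-09-2018 23:50:48", value1: 140 },
];
const daily = resampleDaily(sample, ["value1", "value2"]);
```

Like the reduce version, this only produces rows for days that actually occur in the input; days without data are skipped rather than filled with 0.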

Side note: as always, you can reference a property in JS with ['value1'] or with .value1. I stuck with the bracket syntax because it is more familiar from Python, which was mentioned in the question.

Of course, this is just an example of daily resampling. If you need a larger or smaller interval, you have to manipulate the dates. Say we want to emulate a 12-hour resample; you would write:

var resample = 12;
var b = Object.values(a.reduce((container, current) => {
  var date = new Date(current['time'].replace(/(\d+)-(\d+)-(\d+) (\d+):(\d+):(\d+)/, '$3-$2-$1T$4:$5:$6'));
  date.setHours(Math.floor(date.getHours() / resample) * resample);
  date.setMinutes(0);
  date.setSeconds(0);
  if (!container[date.toString()])
    container[date.toString()] = {time: date, value1: current['value1'] || 0, value2: current['value2'] || 0};
  else {
    container[date.toString()]['value1'] += current['value1'] || 0;
    container[date.toString()]['value2'] += current['value2'] || 0;
  }
  return container;
}, {}));

The regex replace is needed because the dates are not in ISO format. You could use a library for that, such as moment, but I wanted to show that it is possible to do everything in plain JS.
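If the regex feels opaque, the string can also be split into its parts and passed to the Date constructor directly. A minimal sketch, assuming the "DD-MM-YYYY HH:mm:ss" layout from the question; parseDayFirst is a hypothetical name, not part of any library.

```javascript
// Hypothetical helper: parses "DD-MM-YYYY HH:mm:ss" into a local-time Date.
function parseDayFirst(text) {
  const [datePart, timePart = "00:00:00"] = text.trim().split(" ");
  const [day, month, year] = datePart.split("-").map(Number);
  const [hours, minutes, seconds] = timePart.split(":").map(Number);
  // Months are zero-based in the Date constructor.
  const date = new Date(year, month - 1, day, hours, minutes, seconds);
  // The constructor silently rolls over invalid values (e.g. day 31 in
  // September becomes October 1), so verify that the date round-trips.
  if (
    date.getFullYear() !== year ||
    date.getMonth() !== month - 1 ||
    date.getDate() !== day
  ) {
    throw new RangeError("Invalid date: " + text);
  }
  return date;
}

const parsed = parseDayFirst("28-09-2018 21:29:04");
```

The round-trip check is a design choice: without it, a typo such as "31-09-2018" would quietly end up in the wrong bucket.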

Keep one thing in mind when using JS dates: in the browser the timezone is the client's, and on a server it is the server's. If the timestamps carry no timezone, I don't think there should be problems, because everything is handled in the local timezone.
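If you would rather not depend on the machine's timezone at all, one option is to treat the timestamps as UTC and bucket them by epoch milliseconds. The sketch below does that and also emits empty buckets with 0, roughly what resample plus fillna(0) gives in pandas. The name resampleUtc is hypothetical, the UTC interpretation is an assumption about the data, and the buckets are aligned to the Unix epoch, which matches midnight-based bins only for widths that divide a day evenly.

```javascript
// Hypothetical helper: sums fields into fixed-width buckets, reading the
// "DD-MM-YYYY HH:mm:ss" timestamps as UTC (an assumption about the data).
function resampleUtc(records, fields, minutes) {
  const width = minutes * 60 * 1000; // bucket width in milliseconds
  const sums = new Map();
  for (const record of records) {
    const [d, mo, y, h, mi, s] = record.time.split(/[- :]/).map(Number);
    const stamp = Date.UTC(y, mo - 1, d, h, mi, s);
    const start = Math.floor(stamp / width) * width; // start of the bucket
    if (!sums.has(start)) sums.set(start, {});
    const bucket = sums.get(start);
    for (const field of fields) {
      bucket[field] = (bucket[field] || 0) + (record[field] || 0);
    }
  }
  if (sums.size === 0) return [];
  const starts = [...sums.keys()];
  const first = Math.min(...starts);
  const last = Math.max(...starts);
  const result = [];
  // Walk every bucket between the first and last one, so gaps show up as 0.
  for (let t = first; t <= last; t += width) {
    const bucket = sums.get(t) || {};
    const row = { time: new Date(t).toISOString() };
    for (const field of fields) row[field] = bucket[field] || 0;
    result.push(row);
  }
  return result;
}

const readings = [
  { time: "28-09-2018 21:29:04", value1: 1280, value2: 800 },
  { time: "28-09-2018 21:38:56", value1: 600, value2: 700 },
  { time: "29-09-2018 10:40:00", value1: 1100, value2: 300 },
  { time: "29-09-2018 23:50:48", value1: 140, value2: 300 },
];
const sixHourly = resampleUtc(readings, ["value1", "value2"], 360);
```

Because the bucket start is computed from the epoch value rather than with setHours, the result is the same no matter where the code runs.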

Ripper346

Simple approach

  1. A very simple Flask app that does the pandas processing for you.
  2. A simple jQuery AJAX call to use it.

HTML

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, minimum-scale=1.0, maximum-scale=1.0, user-scalable=no, viewport-fit=cover">
    <script src="https://code.jquery.com/jquery-3.5.1.min.js" integrity="sha256-9/aliU8dGd2tb6OSsuzixeV4y/faTqgFtohetphbbj0=" crossorigin="anonymous"></script>
</head>
<body>
    <main id="main">
        <section id="data-section">
            <h2>Data</h2>
            <div id="data"></div>
        </section>
    </main>
</body>
<script>
    function apicall(url, data) {
        $.ajax({
            type:"POST", url:url, data:{data:JSON.stringify(data)},
            success: (data) => { $("#data").text(JSON.stringify(data)); }
        });
    }
    const data = [{"time": "28-09-2018 21:29:04","value1": 1280,"value2": 800},{"time": "28-09-2018 21:38:56","value1": 600,"value2": 700},{"time": "29-09-2018 10:40:00","value1": 1100,"value2": 300},
            {"time": "29-09-2018 23:50:48","value1": 140,"value2": 300}];
    window.onload = function () {
        apicall("/handle_data", data);
    }
</script>
</html>

Flask App

import pandas as pd, json
from flask import Flask, request, render_template, Response

app = Flask(__name__)

@app.route('/')
@app.route('/home')
def home():
    return render_template('home.html')

@app.route('/handle_data', methods=["POST"])
def handle_data():
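    df = pd.DataFrame(json.loads(request.form.get("data")))
    # The timestamps are day-first, so state the format instead of letting pandas guess.
    df["time"] = pd.to_datetime(df["time"], format="%d-%m-%Y %H:%M:%S")
    df.set_index("time", inplace=True)
    df = df.resample('1440min').apply({'value1':'sum', 'value2':'sum'}).fillna(0)
    # reset_index() keeps the "time" column in the output; to_json serializes the timestamps.
    return Response(df.reset_index().to_json(orient="records", date_format="iso"),
                    mimetype="application/json")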

if __name__ == '__main__':
    app.run(debug=True, port=3000)
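
Since the question is about Node.js, the same endpoint can also be called from Node instead of from a browser page. This is a sketch under a few assumptions: Node.js 18 or later (for the global fetch), the Flask app above running on localhost:3000, and the hypothetical function names buildResampleBody and resampleViaService.

```javascript
// Builds the form-encoded body that the /handle_data route reads
// through request.form.get("data").
function buildResampleBody(records) {
  return new URLSearchParams({ data: JSON.stringify(records) });
}

// Assumes Node.js 18+ (global fetch) and the Flask app on localhost:3000.
async function resampleViaService(records, baseUrl = "http://localhost:3000") {
  const response = await fetch(baseUrl + "/handle_data", {
    method: "POST",
    // A URLSearchParams body is sent as application/x-www-form-urlencoded.
    body: buildResampleBody(records),
  });
  if (!response.ok) {
    throw new Error("Resample request failed with status " + response.status);
  }
  return response.json();
}
```

A caller would then write something like resampleViaService(data).then(console.log), with the Flask app started first.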


Rob Raymond
  • This is not an answer to the question. It may use javascript, but the question is for a JS library that is equivalent to pandas for aggregation. – Sid Kwakkel Jan 26 '21 at 17:23
  • @SidKwakkel it's a full stack answer. It's so simple to build and deploy to any cloud stack (AWS EB, GCloud, Azure). Effectively heterogeneous micro services. Homogeneous solutions quite often land up more complicated – Rob Raymond Jan 26 '21 at 20:29
  • Nice idea, this could even be a cloud function – Gavin Feb 27 '22 at 09:12