
I'm trying to build a hand gesture recognition system using an accelerometer and other sensors. To make it more accurate, I need to implement a rolling window that spans 30 rows of data. However, my current code only looks at one row of data at a time.

Here is my code so far:

import numpy as np
from sklearn import svm
from sklearn import tree
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix

import pandas as pd
df= pd.read_csv("A.csv", delimiter=',')

#########################################################################


from sklearn.model_selection import train_test_split
train, test = train_test_split(df, test_size = 0.2)


train_features = train[['F1','F2','F3','F4','F5','X','Y','Z','C1','C2']]
train_label = train['LABEL']

test_features = test[['F1','F2','F3','F4','F5','X','Y','Z','C1','C2']]
test_label = test['LABEL']

## SVM
model = svm.SVC(kernel='rbf', gamma=0.00000001, C=1)
model.fit(train_features.values, train_label.values)
model.score(train_features.values, train_label.values)
predicted_svm = model.predict(test_features.values)
print("svm")
print(accuracy_score(test_label, predicted_svm))
# print(testing_lang)  # testing_lang is not defined anywhere in this snippet
cn = confusion_matrix(test_label, predicted_svm)

My question is how to implement a rolling window on the dataframe. I want to maintain a rolling window that spans 30 rows, but I don't know which parts of this code I should modify.

Based on the pandas documentation, I tried adding the rolling call to the train_test_split step:

train, test = train_test_split(df.rolling(30, win_type='triang'), test_size = 0.2)

But an error comes up stating

Expected sequence or array-like, got <class 'pandas.core.window.Window'>

Basically, I want to create a rolling window because I want each prediction to be based on 30 rows collectively, not just on the latest row of data.

How do I implement it properly?

  • Using `df.rolling()` lets you apply a function to a rolling window across your dataframe (e.g. `df.rolling(30).mean()` returns a dataframe with a rolling mean of window size 30 for each column). It doesn't appear that the rolling method alone will do what you are looking for, however; you might want to check https://stackoverflow.com/questions/40954560/pandas-rolling-apply-custom. – B.C Aug 23 '19 at 14:54
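As a rough illustration of the comment above, one way to let a classifier "see" 30 rows at once is to turn each window into a single row of aggregated features. This is only a sketch under assumptions: the data here is synthetic, only the X, Y, Z columns are used, the mean and standard deviation are arbitrary choices of summary statistics, and each window is labelled with the LABEL of its last row.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for A.csv (assumption: the real file has more columns).
rng = np.random.default_rng(0)
feature_cols = ['X', 'Y', 'Z']
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=feature_cols)
df['LABEL'] = np.repeat([0, 1], 50)

window = 30
roll = df[feature_cols].rolling(window)

# One row per window: rolling mean and standard deviation of every feature.
rolled = pd.concat(
    [roll.mean().add_suffix('_mean'), roll.std().add_suffix('_std')],
    axis=1,
)

# Label each window with the label of its last row (an assumption).
rolled['LABEL'] = df['LABEL']

# The first window - 1 rows have no complete window, so they are NaN.
rolled = rolled.dropna()
```

The resulting `rolled` dataframe is an ordinary dataframe, so it can be passed to `train_test_split` where the `Window` object could not. Because consecutive windows overlap, a shuffled split puts nearly identical rows in both train and test; passing `shuffle=False` to `train_test_split` is one way to reduce that leakage.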

1 Answer


Check out `sklearn.model_selection.TimeSeriesSplit(n_splits=5, *, max_train_size=None)`. By default the training window is anchored at the beginning of the data and grows with each split, but if you set `max_train_size=30`, each split trains on at most the 30 most recent observations, which gives you a rolling training window for however many `n_splits` you choose.
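A minimal sketch of what that looks like, using a synthetic array of 100 samples in place of the real data (the sample count and `n_splits=5` are arbitrary assumptions):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Synthetic stand-in for the feature matrix: 100 time-ordered samples.
X = np.arange(100).reshape(-1, 1)

# Cap every training fold at the 30 most recent observations.
tscv = TimeSeriesSplit(n_splits=5, max_train_size=30)
splits = list(tscv.split(X))

for train_idx, test_idx in splits:
    # Training indices always precede the test indices in time.
    print("train", train_idx[0], "to", train_idx[-1],
          "| test", test_idx[0], "to", test_idx[-1])
```

Note that this limits how many rows each model is trained on; it does not by itself make a single prediction depend on 30 rows. Early folds can also hold fewer than 30 training rows when less data precedes the first test block.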

kevin_theinfinityfund