I was trying to automate the creation of "row window features" using the featuretools package, but I couldn't find an easy way to create them. By "row window features" I mean that, for each cutoff point, I want to create features that capture time patterns. For example:

[columns] 
COUNT(orders) in 0to1 days    
COUNT(orders) in 1to2 days
COUNT(orders) in 2to3 days 
COUNT(orders) in 0to1 months    
...

I understand that the training_window parameter in ft.dfs() limits the "time window" for the features, but that only sets a "lower bound". Is there an easy way to create this kind of feature?

Pablo

1 Answer

You can accomplish this by using multiple cutoff times to set the "upper bound" of each time window. However, the feature values for COUNT(orders) in 0to1 days, COUNT(orders) in 1to2 days, etc. will then be on the rows. You would then reshape the resulting dataframe to move them into the columns.
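
To illustrate the reshape step, here is a minimal pandas-only sketch. It does not call featuretools: the loop stands in for running the feature calculation once per cutoff time (each shifted back by one day, with a one-day window), and the toy `orders` table, the `customer_id` and `order_time` columns, and the fixed `cutoff` are all made up for the example.

```python
import pandas as pd

# Hypothetical toy data: one row per order.
orders = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "order_time": pd.to_datetime([
        "2018-12-14 10:00", "2018-12-13 09:00", "2018-12-13 18:00",
        "2018-12-14 23:00", "2018-12-12 12:00",
    ]),
})

cutoff = pd.Timestamp("2018-12-15")  # assumed cutoff point
n_windows = 3                        # 0to1, 1to2 and 2to3 days

rows = []
for k in range(n_windows):
    # Each window has its own "upper bound" (a shifted cutoff time)
    # and a one-day "lower bound" (the role training_window plays).
    upper = cutoff - pd.Timedelta(days=k)
    lower = upper - pd.Timedelta(days=1)
    in_window = orders[(orders["order_time"] > lower)
                       & (orders["order_time"] <= upper)]
    counts = in_window.groupby("customer_id").size()
    for customer_id in orders["customer_id"].unique():
        rows.append({
            "customer_id": customer_id,
            "window": f"COUNT(orders) in {k}to{k + 1} days",
            "count": int(counts.get(customer_id, 0)),
        })

# Long format: one row per (customer, window), as described above.
long_df = pd.DataFrame(rows)

# Reshape so that each window becomes its own column.
wide_df = long_df.pivot(index="customer_id", columns="window", values="count")
print(wide_df)
```

With a real featuretools result, only the final `pivot` call would carry over; how the long dataframe is assembled depends on how the cutoff times are passed in.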

Max Kanter

Thanks. Using something like `pd.DataFrame.shift()`? It would be really helpful to have that feature implemented in `featuretools`. – Pablo Dec 15 '18 at 18:17