
I am trying to use Featuretools to generate new features from only a few specified columns of the Titanic dataset. In my case I want to apply the transform primitives 'add_numeric' and 'multiply_numeric' to Age, Pclass and log10splitfare. I have followed the syntax given here to the best of my knowledge, but to no avail. The code below does not raise an error, but it also does not produce any additional columns. I also used this Stack Overflow link as a reference.

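import featuretools as ft

# ftdataset_cleaned is the cleaned Titanic DataFrame, which already has an 'index' column
es = ft.EntitySet(id='Titanic')
es.entity_from_dataframe(entity_id='data', dataframe=ftdataset_cleaned,
                         make_index=False, index='index')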

# Run deep feature synthesis with transformation primitives
feature_matrix, feature_defs = ft.dfs(entityset = es, target_entity = 'data',
                                      trans_primitives = ['add_numeric', 'multiply_numeric'],
                                      primitive_options= {('add_numeric', 'multiply_numeric'):{"include_entities": ['Age','PClass','log10SplitFare']}}
                                      )
Leo Torres

1 Answer


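The include_entities option expects entity names, not column names, which is most likely why your call produced no new features: 'Age', 'PClass' and 'log10SplitFare' are columns of the 'data' entity, not entities themselves. You can use the include_variables option instead to specify which columns in an entity to use for specific primitives: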

feature_matrix, feature_defs = ft.dfs(
    entityset=es,
    target_entity='data',
    trans_primitives=['add_numeric', 'multiply_numeric'],
    primitive_options={
        ('add_numeric', 'multiply_numeric'): {
            'include_variables': {'data': ['Age', 'PClass', 'log10SplitFare']}}})

This guide goes a little more in depth about the different ways you can control how primitives are applied.
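
To make the expected result concrete, here is a minimal sketch in plain Python (not Featuretools itself) of what restricting the two primitives to those three columns should amount to: one sum and one product for each unordered pair of the included columns, so six new features in total. The toy values, the extra SibSp column, the helper name pairwise_features, and the "A + B" / "A * B" naming are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical toy rows standing in for the cleaned Titanic data.
data = {
    "Age": [22.0, 38.0, 26.0],
    "PClass": [3, 1, 3],
    "log10SplitFare": [0.5, 1.5, 0.75],
    "SibSp": [1, 1, 0],  # numeric, but deliberately left out of `included`
}

# Columns the primitives are allowed to use (mirrors include_variables).
included = ["Age", "PClass", "log10SplitFare"]


def pairwise_features(columns, names):
    """Build a sum and a product feature for every unordered pair of `names`."""
    features = {}
    for a, b in combinations(names, 2):
        features[f"{a} + {b}"] = [x + y for x, y in zip(columns[a], columns[b])]
        features[f"{a} * {b}"] = [x * y for x, y in zip(columns[a], columns[b])]
    return features


new_features = pairwise_features(data, included)
for name, values in new_features.items():
    print(name, values)
```

If the feature_defs list returned by ft.dfs contains features involving columns outside the included set, the option is probably not being picked up, and the key names in primitive_options are the first thing to check.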