
How do I decide between Set features and Sequence features for a column, and what is the difference between them (with examples)?

I'm trying to use Ludwig to perform classification. My dataset looks something like this:

  • The letters here are placeholders only
  • For example, Feature1 for the word "alpha" could stand for its trigrams, such as ^al lph pha ha$
LABEL, Feature1, Feature2
X,     A B C,    D A E
X,     B C K,    K J L
Y,     A D C,    D A E
Y,     B D E,    J L R

        name: Feature1_trigrams
        type: set
        level: words
        encoder:
          representation: dense
          embedding_size: 10
          embeddings_on_cpu: false
          pretrained_embeddings: null
          embeddings_trainable: true
          dropout: false
          initializer: null
          regularize: true
          reduce_output: sqrt
          tied_weights: null
        cell_type: lstm
        bidirectional: true
        num_layers: 2
        reduce_output: null
        preprocessing:
          format: space

Should I be using Sequence instead?
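
To make the distinction I'm asking about concrete, here is a minimal plain-Python sketch. It does not use Ludwig at all and is not how Ludwig encodes features internally; it only illustrates the general idea that a set ignores token order and repeats while a sequence keeps both. The helper names `as_set` and `as_sequence`, and the two reordered variants of the first row, are hypothetical and only for illustration.

```python
# Plain-Python illustration only; no Ludwig code is involved.
# "A B C" is Feature1 from the first row of the sample data.
# The other two strings are hypothetical variants (reordered / with a repeat).
rows = ["A B C", "C B A", "A A B C"]


def as_set(text):
    """Set view: split on spaces, drop order and duplicate tokens."""
    return frozenset(text.split())


def as_sequence(text):
    """Sequence view: split on spaces, keep order and repeated tokens."""
    return tuple(text.split())


set_views = [as_set(r) for r in rows]
seq_views = [as_sequence(r) for r in rows]

# All three rows collapse to the same set of tokens...
all_sets_equal = len(set(set_views)) == 1
# ...but remain three different sequences.
all_seqs_distinct = len(set(seq_views)) == len(rows)

print("identical as sets:", all_sets_equal)
print("distinct as sequences:", all_seqs_distinct)
```

If the order of the trigrams carries meaning for the label, the sequence view keeps that information; if only the presence of a trigram matters, the set view discards nothing useful.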

gogasca
