How to determine when to use Set Features vs Sequence Features on a column, and what is the difference between them (with some examples)?
I'm trying to use Ludwig to perform classification. My dataset looks something like the one below:
- The letters here are for representational purposes only
- For example, a Feature1 value for the word "alpha" could stand for the character trigrams ^al alp lph pha ha$
LABEL, Feature1, Feature2
X, A B C, D A E
X, B C K, K J L
Y, A D C, D A E
Y, B D E, J L R
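As an illustration of the trigram encoding mentioned in the notes above, here is a minimal sketch in plain Python. The helper name `char_trigrams` and the `^`/`$` boundary markers are assumptions taken from the "alpha" example, not part of Ludwig.

```python
def char_trigrams(word, start="^", end="$"):
    """Return the character trigrams of `word`, padded with boundary markers."""
    padded = f"{start}{word}{end}"
    # Slide a window of width 3 over the padded string.
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


# Join the trigrams with spaces so the cell can later be split on whitespace.
cell = " ".join(char_trigrams("alpha"))
print(cell)
```

Each cell of Feature1 or Feature2 would then hold one such space-separated string.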
This is the feature definition I'm currently using:
name: Feature1_trigrams
type: set
level: words
encoder:
representation: dense
embedding_size: 10
embeddings_on_cpu: false
pretrained_embeddings: null
embeddings_trainable: true
dropout: false
initializer: null
regularize: true
reduce_output: sqrt
tied_weights: null
cell_type: lstm
bidirectional: true
num_layers: 2
reduce_output: null
preprocessing:
  format: space
Should I be using a sequence feature (type: sequence) instead?
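To make the distinction concrete, here is a minimal sketch in plain Python (this is not Ludwig's API, and the helper names `as_set` and `as_sequence` are hypothetical). It assumes a set feature treats a cell as an unordered collection of unique tokens, while a sequence feature keeps both token order and repeats.

```python
def as_set(cell):
    # Set view: order and repeated tokens are discarded; only membership matters.
    return frozenset(cell.split())


def as_sequence(cell):
    # Sequence view: token order and repeated tokens are preserved.
    return tuple(cell.split())


a, b = "A B C", "C B A"
print(as_set(a) == as_set(b))            # True: same tokens, order ignored
print(as_sequence(a) == as_sequence(b))  # False: order differs

c, d = "A A B", "A B"
print(as_set(c) == as_set(d))            # True: the repeated "A" collapses
print(as_sequence(c) == as_sequence(d))  # False: the repeat is kept
```

If the order of trigrams within a cell carries signal, the sequence view is the one that preserves it. If only the presence of a trigram matters, the set view is enough. Options such as cell_type, bidirectional, and num_layers appear to describe a recurrent encoder, which operates on ordered input, so they would most likely only be meaningful with a sequence feature.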