
I am trying to fine-tune the Universal Sentence Encoder and reuse the tuned encoder layer elsewhere.

import tensorflow as tf
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Dense, Dropout
import tensorflow_hub as hub

module_url = "universal-sentence-encoder"
model = Sequential([
    hub.KerasLayer(module_url, input_shape=[], dtype=tf.string, trainable=True, name="use"),
    Dropout(0.5, name="dropout"),
    Dense(256, activation="relu", name="dense"),
    Dense(len(y), activation="sigmoid", name="activation")
])

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, batch_size=256, epochs=30, validation_split=0.25)

This worked: the loss went down and accuracy was decent. Now I want to extract just the Universal Sentence Encoder layer, but its output is all nan. [screenshot of the nan embeddings omitted]

  1. Do you know how I can fix this nan issue? I expected to see an encoding of numeric values.
  2. Is it only possible to save the tuned_use layer as a full model, as this post recommends? Ideally, I want to save the tuned_use layer just like the original Universal Sentence Encoder, so that I can load and use it in exactly the same way: hub.KerasLayer(tuned_use_location, input_shape=[], dtype=tf.string).
E.K.

1 Answer


Hoping this will help someone: I ended up solving this by using universal-sentence-encoder-4 instead of universal-sentence-encoder-large-5. I spent quite a lot of time troubleshooting, which was tough because there was no issue with the input data and the model trained successfully. The nans might be due to an exploding-gradient issue, but I could not add gradient clipping or Leaky ReLU into the original architecture.

E.K.