The below python code displays sentence similarity, it uses Universal Sentence Encoder to achieve the same.
from absl import logging
import tensorflow as tf
import tensorflow_hub as hub
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import re
import seaborn as sns
module_url = "https://tfhub.dev/google/universal-sentence-encoder/4"
model = hub.load(module_url)
print ("module %s loaded" % module_url)
def embed(input):
return model(input)
def plot_similarity(labels, features, rotation):
corr = np.inner(features, features)
print(corr)
sns.set(font_scale=2.4)
plt.subplots(figsize=(40,30))
g = sns.heatmap(
corr,
xticklabels=labels,
yticklabels=labels,
vmin=0,
vmax=1,
cmap="YlGnBu",linewidths=1.0)
g.set_xticklabels(labels, rotation=rotation)
g.set_title("Semantic Textual Similarity")
def run_and_plot(messages_):
message_embeddings_ = embed(messages_)
plot_similarity(messages_, message_embeddings_, 90)
messages = [
"I want to know my savings account balance",
"Show my bank balance",
"Show me my account",
"What is my bank balance",
"Please Show my bank balance"
]
run_and_plot(messages)
The output is displayed as heatmap as shown below, also printing the values
I want to only focus on the sentences that seems quite similar, however the currently heatmap displays all the values.
So
Is there a way I can view heatmap with only values whose ranges is more than 0.6 and less than 0.999?
Is it possible to print the matching value pairs, which lie under given ranges, i.e. 0.6 and 0.99? Thanks, Rohit