Sentence similarity is a topic in Natural Language Processing that aims to compute a mathematical measure of the semantic or syntactic similarity between two or more sentences.
Questions tagged [sentence-similarity]
231 questions
27
votes
3 answers
How to build semantic search for a given domain
We are trying to solve a problem where we want to run a semantic search over our data set,
i.e., we have domain-specific data (for example, sentences about automobiles).
Our data is just a collection of sentences, and what we want is to give a…

Jickson
19
votes
2 answers
Is there a way to check similarity between two full sentences in Python?
I am making a project like this one:
https://www.youtube.com/watch?v=dovB8uSUUXE&feature=youtu.be
but I am having trouble because I need to check the similarity between sentences. For example:
if the user said: 'the person wear red T-shirt'…

Bemwa Malak
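One possible baseline for the question above is a purely lexical comparison: represent each sentence as a bag of word counts and take the cosine of the two count vectors. The following is a minimal sketch using only the standard library. `tokenize` and `bow_cosine` are hypothetical helper names, and the second sentence in the usage is invented for illustration because the excerpt is cut off. It measures word overlap only, so paraphrases with different wording will score low.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphanumeric runs; "T-shirt" becomes ["t", "shirt"].
    return re.findall(r"[a-z0-9]+", text.lower())


def bow_cosine(a, b):
    # Cosine similarity between the word-count vectors of two sentences.
    counts_a, counts_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence has no direction to compare
    return dot / (norm_a * norm_b)


# The first sentence is from the question; the second is an assumed example.
score = bow_cosine("the person wear red T-shirt",
                   "the person is wearing a red T-shirt")
print(f"lexical similarity: {score:.3f}")
```

If meaning rather than wording matters, the same function shape could wrap an embedding model instead of word counts.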
12
votes
2 answers
Sentence similarity using Keras
I'm trying to implement a sentence similarity architecture based on this work using the STS dataset. Labels are normalized similarity scores from 0 to 1, so it is treated as a regression model.
My problem is that the loss goes directly to NaN…

lila
10
votes
1 answer
word2vec, sum or average word embeddings?
I'm using word2vec to represent a small phrase (3 to 4 words) as a unique vector, either by adding each individual word embedding or by calculating the average of word embeddings.
From the experiments I've done I always get the same cosine…

David Batista
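The observation in the question above follows from cosine similarity being invariant to positive scaling: the average of n vectors is just their sum divided by n, so both point in the same direction. A small sketch with made-up 3-dimensional vectors (the numbers and helper names are assumptions for illustration, not real word2vec embeddings):

```python
import math


def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def vector_sum(vectors):
    # Component-wise sum of a list of vectors.
    return [sum(components) for components in zip(*vectors)]


def vector_mean(vectors):
    # Component-wise average: the sum scaled by 1 / n.
    n = len(vectors)
    return [s / n for s in vector_sum(vectors)]


# Hypothetical word vectors for a 3-word phrase and a 4-word phrase.
phrase_a = [[0.2, 0.1, 0.7], [0.5, 0.3, 0.1], [0.4, 0.9, 0.2]]
phrase_b = [[0.1, 0.8, 0.3], [0.6, 0.2, 0.2], [0.3, 0.3, 0.9], [0.7, 0.1, 0.4]]

cos_sum = cosine(vector_sum(phrase_a), vector_sum(phrase_b))
cos_mean = cosine(vector_mean(phrase_a), vector_mean(phrase_b))
print(cos_sum, cos_mean)  # equal up to floating-point rounding
```

The choice between sum and average only matters for measures that depend on vector length, such as Euclidean distance or a raw dot product.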
9
votes
2 answers
Siamese Network with LSTM for sentence similarity in Keras periodically gives the same result
I'm a newbie in Keras and I'm trying to solve the task of sentence similarity using a neural network in Keras.
I use word2vec as the word embedding, and then a Siamese Network to predict how similar two sentences are.
The base network for the Siamese Network is a…

MiVe93
8
votes
2 answers
Sentence similarity models not capturing opposite sentences
I have tried different approaches to sentence similarity, namely:
spaCy models: en_core_web_md and en_core_web_lg.
Transformers: using the packages sentence-similarity and sentence-transformers, I've tried models such as distilbert-base-uncased,…

Diego Miguel
7
votes
5 answers
What is the best way to get accurate text similarity in Python for comparing single words or bigrams?
I've got similar product data in both the products_a array and products_b array:
products_a = [{color: "White", size: "2' 3\""}, {color: "Blue", size: "5' 8\""} ]
products_b = [{color: "Black", size: "2' 3\""}, {color: "Sky blue", size: "5' 8\""}…

rom
6
votes
2 answers
How to determine if two sentences talk about similar topics?
Is there any algorithm or tool that would let me find associations between words?
For example: I have the following group of sentences:
(1)
"My phone is on the table"
"I cannot find the charger". # no…
user12907213
4
votes
3 answers
Finding the most similar sentences among all in Python
Suggestions, reference links, and code are appreciated.
I have a data set with more than 1500 rows. Each row contains a sentence. I am trying to find the best method for finding the most similar sentences among all of them.
What I have tried
I have tried…

vivek
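For a data set of the size mentioned above (around 1500 rows, so roughly 1.1 million pairs), a brute-force comparison of every pair is still feasible. The sketch below assumes Jaccard overlap of word sets as the similarity measure, which is a stand-in and not the method the asker tried. `token_set`, `jaccard`, and `most_similar_pair` are hypothetical names, and the sample rows are invented.

```python
import re
from itertools import combinations


def token_set(sentence):
    # Lowercased set of alphanumeric tokens.
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    # Size of the intersection over size of the union; 0.0 if both are empty.
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar_pair(sentences):
    # Compare every unordered pair once and keep the highest-scoring one.
    token_sets = [token_set(s) for s in sentences]
    best_score, best_pair = -1.0, None
    for i, j in combinations(range(len(sentences)), 2):
        score = jaccard(token_sets[i], token_sets[j])
        if score > best_score:
            best_score, best_pair = score, (i, j)
    return best_pair, best_score


# Invented sample rows standing in for the real data.
rows = [
    "the cat sat on the mat",
    "a cat sat on a mat",
    "stock prices fell sharply today",
]
pair, score = most_similar_pair(rows)
print(rows[pair[0]], "<->", rows[pair[1]], f"({score:.2f})")
```

Swapping `jaccard` for a cosine over TF-IDF or sentence-embedding vectors keeps the same pairwise loop while capturing more than exact word overlap.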
4
votes
0 answers
Siamese BiLSTM neural network with Manhattan distance gives very different similarity scores each time for the same test data
I'm applying Siamese Bidirectional LSTM (BiLSTM) using character-level sequences and embeddings for long texts. The embeddings model is Word2vec, the sequence length is None to handle variable sequence lengths (180-550), the batch size is 8 and the…

MManahi
4
votes
1 answer
Use spaCy to find the most similar sentences in a doc
I'm looking for a way to do something like most_similar() from Gensim, but using spaCy.
I want to find the most similar sentence in a list of sentences using NLP.
I tried to use similarity() from spaCy (e.g. https://spacy.io/api/doc#similarity)…

Heraknos
3
votes
4 answers
How to save a SetFit trainer locally after training
I am working on an HPC cluster with no internet access on the worker nodes, and the only option I have found for saving a SetFit trainer after training is to push it to the Hugging Face Hub. How do I save it locally to disk?
https://github.com/huggingface/setfit

Tanish Bafna
3
votes
1 answer
Sentence Transformers in Python: "[E1002] Span index out of range"
As a programming noob, I am trying to find similar sentences in several hundreds of newspaper articles. I have tried my code with a smaller text sample which has worked brilliantly. Now, with a larger text file (using the same code), I get the error…

Mathias
3
votes
1 answer
How to download and use the Universal Sentence Encoder instead of loading it from a URL
I am using the Universal Sentence Encoder to find sentence similarity. Below is the code that I use to load the model:
import tensorflow_hub as hub
model = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/3")
Here,…

Jithin P James
3
votes
1 answer
Sentence transformer: how to predict a new example
I am exploring sentence transformers and came across this page.
It shows how to train on custom data, but I am not sure how to predict. Suppose there are two new sentences such as 1) this is the third example, 2) this is the example number three. How…

user2543622