Questions tagged [sentence-similarity]

Sentence similarity is an area of Natural Language Processing that aims to compute a mathematical measure of semantic or syntactic similarity between two or more sentences.

231 questions
27
votes
3 answers

How to build semantic search for a given domain

There is a problem we are trying to solve where we want to do a semantic search on our set of data, i.e. we have domain-specific data (example: sentences talking about automobiles). Our data is just a bunch of sentences and what we want is to give a…
19
votes
2 answers

Is there a way to check similarity between two full sentences in Python?

I am making a project like this one here: https://www.youtube.com/watch?v=dovB8uSUUXE&feature=youtu.be but I am having trouble because I need to check the similarity between sentences. For example, if the user said: 'the person wear red T-shirt'…
Bemwa Malak
12
votes
2 answers

Sentence similarity using keras

I'm trying to implement a sentence similarity architecture based on this work using the STS dataset. Labels are normalized similarity scores from 0 to 1, so it is assumed to be a regression model. My problem is that the loss goes directly to NaN…
lila
10
votes
1 answer

word2vec, sum or average word embeddings?

I'm using word2vec to represent a small phrase (3 to 4 words) as a unique vector, either by adding each individual word embedding or by calculating the average of word embeddings. From the experiments I've done I always get the same cosine…
David Batista
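A note on the observation in this excerpt: cosine similarity ignores vector length, and the average of a set of word vectors is just their sum divided by the word count, so summing and averaging should give the same cosine. The following is a minimal sketch of that point using only the standard library. The toy 3-dimensional vectors are made up for illustration and are not real word2vec embeddings, and the helper names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def sum_vectors(vectors):
    """Element-wise sum of a list of word vectors."""
    return [sum(column) for column in zip(*vectors)]

def mean_vectors(vectors):
    """Element-wise mean: the sum scaled by 1 / number of words."""
    n = len(vectors)
    return [s / n for s in sum_vectors(vectors)]

# Made-up stand-ins for word embeddings (assumption: not real word2vec output).
phrase_a = [[0.2, 0.1, 0.7], [0.5, 0.3, 0.1], [0.4, 0.9, 0.2]]
phrase_b = [[0.1, 0.8, 0.3], [0.6, 0.2, 0.5], [0.3, 0.3, 0.9], [0.7, 0.1, 0.2]]

cos_sum = cosine(sum_vectors(phrase_a), sum_vectors(phrase_b))
cos_mean = cosine(mean_vectors(phrase_a), mean_vectors(phrase_b))

# The two values agree up to floating-point rounding, even though the
# phrases have different lengths (3 and 4 words).
print(math.isclose(cos_sum, cos_mean))
```

The choice between sum and average would only matter for measures that depend on vector magnitude, such as Euclidean distance or a raw dot product.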
9
votes
2 answers

Siamese Network with LSTM for sentence similarity in Keras gives periodically the same result

I'm a newbie in Keras and I'm trying to solve the task of sentence similarity using a NN in Keras. I use word2vec as word embedding, and then a Siamese Network to predict how similar two sentences are. The base network for the Siamese Network is a…
MiVe93
8
votes
2 answers

Sentence similarity models not capturing opposite sentences

I have tried different approaches to sentence similarity, namely: spaCy models: en_core_web_md and en_core_web_lg. Transformers: using the packages sentence-similarity and sentence-transformers, I've tried models such as distilbert-base-uncased,…
7
votes
5 answers

What is the best way to get accurate text similarity in python for comparing single words or bigrams?

I've got similar product data in both the products_a array and products_b array: products_a = [{color: "White", size: "2' 3\""}, {color: "Blue", size: "5' 8\""} ] products_b = [{color: "Black", size: "2' 3\""}, {color: "Sky blue", size: "5' 8\""}…
rom
6
votes
2 answers

How to determine if two sentences talk about similar topics?

I would like to ask you a question. Is there any algorithm/tool which can allow me to do some association between words? For example: I have the following group of sentences: (1) "My phone is on the table" "I cannot find the charger". # no…
user12907213
4
votes
3 answers

Finding most similar sentences among all in python

Suggestions, reference links, and code are appreciated. I have data with more than 1500 rows. Each row has a sentence. I am trying to find the best method to find the most similar sentences among all. What I have tried: I have tried…
vivek
4
votes
0 answers

Siamese BiLSTM neural network with Manhattan distance gives very different similarity scores each time for the same test data

I'm applying Siamese Bidirectional LSTM (BiLSTM) using character-level sequences and embeddings for long texts. The embeddings model is Word2vec, the sequence length is None to handle variable sequence lengths (180-550), the batch size is 8 and the…
4
votes
1 answer

Use spaCy to find most similar sentences in doc

I'm looking for a solution to use something like most_similar() from Gensim but using spaCy. I want to find the most similar sentence in a list of sentences using NLP. I tried to use similarity() from spaCy (e.g. https://spacy.io/api/doc#similarity)…
Heraknos
3
votes
4 answers

How to save a SetFit trainer locally after training

I am working on an HPC with no internet access on worker nodes, and the only option to save a SetFit trainer after training is to push it to the Hugging Face Hub. How do I go about saving it locally to disk? https://github.com/huggingface/setfit
3
votes
1 answer

Sentence Transformers in Python: "[E1002] Span index out of range"

As a programming noob, I am trying to find similar sentences in several hundreds of newspaper articles. I have tried my code with a smaller text sample which has worked brilliantly. Now, with a larger text file (using the same code), I get the error…
3
votes
1 answer

How to download and use the universal sentence encoder instead of loading it from url

I am using the universal sentence encoder to find sentence similarity. Below is the code that I use to load the model: import tensorflow_hub as hub model = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/3") Here,…
Jithin P James
3
votes
1 answer

sentence transformer how to predict new example

I am exploring sentence transformers and came across this page. It shows how to train on our custom data. But I am not sure how to predict. If there are two new sentences such as 1) this is the third example, 2) this is the example number three. How…
user2543622