Questions tagged [pairwise-distance]

32 questions
6
votes
2 answers

How does PyTorch calculate matrix pairwise distance? Why isn't the 'self' distance zero?

If this is a naive question, please forgive me. My test code looks like this: import torch from torch.nn.modules.distance import PairwiseDistance list_1 = [[1., 1.,],[1., 1.]] list_2 = [[1., 1.,],[2.,…
Alex Luya
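A note on the question above: the PyTorch documentation defines `PairwiseDistance` as the p-norm of `x1 - x2 + eps`, with `eps` defaulting to `1e-6`, and it compares rows one to one rather than building a full matrix. The following is a minimal pure-Python sketch of that documented formula, not PyTorch's actual implementation; the helper name `pairwise_distance` is hypothetical and the input reuses `list_1` from the excerpt.

```python
import math

def pairwise_distance(x1, x2, p=2.0, eps=1e-6):
    """Row-wise p-norm of (x1 - x2 + eps), following the documented formula.

    Hypothetical helper for illustration only; it is not the PyTorch API.
    """
    return [
        sum(abs(a - b + eps) ** p for a, b in zip(row1, row2)) ** (1.0 / p)
        for row1, row2 in zip(x1, x2)
    ]

list_1 = [[1.0, 1.0], [1.0, 1.0]]

# Distance of each row to itself: small but non-zero because of eps.
self_dist = pairwise_distance(list_1, list_1)
print(self_dist)  # each entry is roughly sqrt(2) * 1e-6

# With eps set to zero, the self distance is exactly zero.
print(pairwise_distance(list_1, list_1, eps=0.0))
```

If a full matrix of distances between all rows is what is wanted, `torch.cdist` is probably the closer fit.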
4
votes
4 answers

Pairwise similarity matrix between a set of vectors in PyTorch

Let's suppose that we have a 3D PyTorch tensor, where the first dimension represents the batch_size, as follows: import torch import torch.nn as nn x = torch.randn(32, 100, 25) That is, for each i, x[i] is a set of 100 25-dimensional vectors. I…
3
votes
1 answer

Pairwise Distance Dealing with NaNs

I have a pivot table from which I want to calculate the pairwise distance matrix between each day. As my dataset contains NaN values, sklearn pairwise distances yells at me when I use it. I'd like to know if there is any way to overcome this. The pivot…
3
votes
2 answers

Efficient implementation of pairwise distances computation between observations for mixed numeric and categorical data

I am working on a data science project in which I have to compute the Euclidean distance between every pair of observations in a dataset. Since I am working with very large datasets, I have to use an efficient implementation of pairwise distances…
3
votes
4 answers

What does sklearn's pairwise_distances with metric='correlation' do?

I've put different values into this function and observed the output, but I can't find a predictable pattern in what is being outputted. Then I tried digging through the function itself, but it's confusing because it can do a number of different…
tim_xyz
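A note on the question above: for `metric='correlation'`, scikit-learn hands the work to SciPy, whose correlation distance is one minus the Pearson correlation between two rows. Here is a minimal pure-Python sketch of that definition; the function name `correlation_distance` and the sample vectors are assumptions for illustration, not library code.

```python
import math

def correlation_distance(u, v):
    """1 - Pearson correlation of u and v (hypothetical helper)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    # Center both vectors on their means.
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    dot = sum(a * b for a, b in zip(du, dv))
    norm_u = math.sqrt(sum(a * a for a in du))
    norm_v = math.sqrt(sum(b * b for b in dv))
    return 1.0 - dot / (norm_u * norm_v)

# Perfectly correlated rows are at distance ~0,
# perfectly anti-correlated rows at distance ~2.
d_same = correlation_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
d_opposite = correlation_distance([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
print(d_same, d_opposite)
```

The result depends only on the shape of each row, not on its scale or offset, which may explain why the output looks unpredictable when raw values are changed.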
2
votes
3 answers

Generating a pairwise 'distance' matrix

I am trying to generate what I can best describe as a pairwise distance matrix from a data frame containing the distance between two neighboring points. These are not Euclidean distances; they are essentially the distance between points on a…
Aislin809
1
vote
1 answer

How can I loop a function through every combination of levels of a factor?

I have a dataset containing a set of variables and the coordinates describing their distributions in geographic space: set.seed(123) #example dataset: d <- data.frame(var=as.factor(rep(LETTERS[1:5],each=6)),x=runif(30),y=runif(30)) head(d) var…
eazyezy
1
vote
2 answers

Compute differences between all variable pairs in R

I have a dataframe with 4 columns. set.seed(123) df <- data.frame(A = round(rnorm(1000, mean = 1)), B = rpois(1000, lambda = 3), C = round(rnorm(1000, mean = -1)), D = round(rnorm(1000, mean = 0))) I would like to…
CyG
1
vote
1 answer

PySpark pairwise distance between rows

I am working with PySpark and wondering whether there is a way to compute pairwise distances between rows. For instance, there is a dataset like this. +--------------------+------------+--------+-------+-------+ | product| Mitsubishi | Toyota |…
1
vote
0 answers

How to get pairwise distance for two lists of vectors in python

Say I have two lists of vectors: l1 = [v1, v2, v3] l2 = [v4, v5, v6] I know that with scipy I can get pairwise distances within a list, but how do I do it for two lists so that the results look like this: [[d(v1, v4), d(v1, v5), d(v1,v6)], [d(v2, v4),…
Mario
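A note on the question above: the layout it describes, with one row per vector in `l1` and one column per vector in `l2`, can be sketched with a nested comprehension. This is a minimal pure-Python illustration assuming Euclidean distance and made-up 2-D vectors; the helper `cross_distances` is hypothetical. With SciPy available, `scipy.spatial.distance.cdist(l1, l2)` should produce the same shape.

```python
import math

def cross_distances(l1, l2):
    """Matrix whose entry [i][j] is the Euclidean distance d(l1[i], l2[j]).

    Hypothetical helper; math.dist requires Python 3.8 or newer.
    """
    return [[math.dist(u, v) for v in l2] for u in l1]

# Example vectors (assumed for illustration, not from the question).
l1 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
l2 = [(3.0, 4.0), (1.0, 0.0), (0.0, 0.0)]

result = cross_distances(l1, l2)
for row in result:
    print(row)
```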
1
vote
0 answers

Finding shortest distance between two points around waterways in R

Looking for some advice on the best approach to this problem. I need to make a matrix of pairwise OVER LAND shortest distances between each pair of points for 309 points (see figure below). An approach I was thinking of following was creating a…
1
vote
1 answer

Pairwise differences between observations in two groups

I have two treatment groups in my data set and I am looking for a fast method for calculating the pairwise differences between observations in the first group and second group. How can I quickly create all the combinations of observations and take…
Emma Jean
1
vote
1 answer

Python code in docker NOT using all available CPU cores (uses only one)

I am using AWS Batch to run a Python script with a few modules that run in parallel (in a Docker container on AWS ECR). When I manually invoke the script on a Linux 16 core machine, I see 16 python processes executing the code in parallel. In hopes of…
1
vote
1 answer

Pairwise raster comparison in R: alternative to for-loop?

How to efficiently compare pairs of distribution rasters (raster layers containing only 0 and 1)? I need to get a measure of the similarity among ~6500 individual global rasters. Istat from SDMTools should do the job. Here is my code:…
SophiaL
1
vote
4 answers

Speeding up the calculation of the sum of the point-wise differences in R

Suppose I have two datasets. The first one is: t1<-sample(1:10,10,replace = T) t2<-sample(1:10,10,replace = T) t3<-sample(1:10,10,replace = T) t4<-sample(11:20,10,replace = T) t5<-sample(11:20,10,replace = T) xtrain<-rbind(t1,t2,t3,t4,t5) xtrain …