0

I came across this SQL problem and wonder if I could solve it in python pandas.

The SQL problem: Table point holds the x coordinate of some points on x-axis in a plane, which are all integers.

Write a query to find the shortest distance between two points in these points.

| x   |
|-----|
| -1  |
| 0   |
| 2   |

I have initiate the pandas dataframe as below.

import pandas as pd
point = pd.DataFrame(np.array([-1,0,2]), columns=['x'])

I was trying to do like

p1 = point
p2 = point.copy()
p1.merge(p2, on='x', condition = ('!='))

But I know there is no such conditions.

Can anyone provide a solution to this problem in pandas?

Note: My problem is different from join two dt when columns are not equal, so it is not a duplicate. My problem is about appending all values to each of the values in the original set. The problem in the link is about finding out the difference.

A story-teller
  • 113
  • 1
  • 9

2 Answers2

3
point = [-1, 0, 1]
ix = pd.MultiIndex.from_product([point, point], names=['a', 'b'])
df = pd.DataFrame(index=ix).reset_index()
df = df[df.a != df.b]

will (rather) efficiently get you a dataframe with the unique combinations of two coordinates, like an outer join would in SQL:

   a  b
1 -1  0
2 -1  1
3  0 -1
5  0  1
6  1 -1
7  1  0

After that, you can get the distance with

>>> (df.a - df.b).abs().min()
1
AKX
  • 152,115
  • 15
  • 115
  • 172
1

This is filter after cross join

p1.assign(key=1).merge(p2.assign(key=1),on='key').loc[lambda x : x['x_x']!=x['x_y']]
Out[29]: 
   x_x  key  x_y
1   -1    1    0
2   -1    1    2
3    0    1   -1
5    0    1    2
6    2    1   -1
7    2    1    0
BENY
  • 317,841
  • 20
  • 164
  • 234