I have the following coordinates separated in 2 lists:

x = [1, 2, 3, 4, 5, 5]
y = [1, 2, 3, 4, 4, 5]

and I want to make a function that returns:

1 2 3 4 5
1 2 3 4 5

Every piece of code I try fails to skip the point x = 5, y = 4. Help.

Sumito
  • Does this answer your question? [How do you remove duplicates from a list whilst preserving order?](https://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-whilst-preserving-order) – Ironkey Oct 19 '20 at 00:20

3 Answers


You can use this function to remove duplicates while preserving order:

x = [1, 2, 3, 4, 5, 5]
y = [1, 2, 3, 4, 4, 5]

# dict.fromkeys keeps the first occurrence of each value, in insertion order
deDupe = lambda x: list(dict.fromkeys(x))

print(deDupe(x))
print(deDupe(y))
# [1, 2, 3, 4, 5]
# [1, 2, 3, 4, 5]

Based on what I believe you asked and what @Mark Meyer suggested, here is a way to keep only the coordinate pairs whose x and y values are equal:

[(a, b) for a, b in zip(x, y) if a == b]
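
Since the question asks for a function that returns the two lists, the filter above can be wrapped as in this minimal sketch. The function name `keep_equal_pairs` is made up for illustration, and it assumes that "on the line" means `x == y`, which the question does not confirm.

```python
def keep_equal_pairs(x, y):
    """Return the x and y lists restricted to positions where x[i] == y[i]."""
    pairs = [(a, b) for a, b in zip(x, y) if a == b]
    if not pairs:
        # zip(*[]) cannot be unpacked into two names, so handle this case explicitly
        return [], []
    xs, ys = zip(*pairs)  # unzip the surviving pairs back into two sequences
    return list(xs), list(ys)


x = [1, 2, 3, 4, 5, 5]
y = [1, 2, 3, 4, 4, 5]

res_x, res_y = keep_equal_pairs(x, y)
print(res_x)  # [1, 2, 3, 4, 5]
print(res_y)  # [1, 2, 3, 4, 5]
```

The pair (5, 4) is dropped because its coordinates differ, while the later (5, 5) is kept.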
Ironkey
  • I guess it might work well in some scenarios but I need it to return that 5, the one in x[5] – Sumito Oct 19 '20 at 00:20
  • so you want to return the number that occurs twice in a list? – Ironkey Oct 19 '20 at 00:20
  • It is kind of complex, but I only want the points that would form a straight line on a graph. If I add, for example, a 6 to x and a 7 to y, it shouldn't return them, because they wouldn't lie on the straight line in a Cartesian graph. – Sumito Oct 19 '20 at 00:26
  • @Sumito do you just want the values where `x == y`? Saying you just want the ones on a line is under-specified: there is a straight line between `(1, 1)` and `(5, 4)`, so how do you know not to remove all the others instead of those two? You could easily have other scenarios where more than one straight line could be made. – Mark Oct 19 '20 at 00:38

It sounds like the scenario in this question is that we have a set of points in which an unknown subset is collinear, and we want to identify that collinear subset.

An excellent algorithm for this problem is random sample consensus or RANSAC. For line fitting, RANSAC is like linear regression but robust to outliers.

Line fitting with RANSAC:

  1. Randomly select two points from the original data.
  2. Fit a line through them.
  3. Then for all other data points, compare how close they are to the line. If they fit well, consider them part of the "consensus set".
  4. Repeat steps 1-3 several times, and accept the line for which the consensus set contains the most points.
  5. (Optional) Re-fit the line by linear regression to all the points in the consensus set.

The scikit-learn Python library has an implementation of RANSAC (`RANSACRegressor`); see "Robust linear model estimation using RANSAC".
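
Steps 1 to 4 above can be sketched in plain Python, without scikit-learn. This is a minimal illustration, not a tuned implementation: the function name `ransac_line`, the iteration count, the tolerance, and the fixed seed are all assumptions chosen for this example, and the optional re-fit of step 5 is left out. The line is stored in the form a*x + b*y + c = 0 so that a vertical sample such as (5, 4) and (5, 5) does not cause a division by zero.

```python
import math
import random


def ransac_line(points, n_iterations=100, tolerance=1e-9, seed=0):
    """Return the largest set of points found to lie on a common line."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best = []
    for _ in range(n_iterations):
        # Step 1: randomly select two points
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        # Step 2: line through them, written as a*x + b*y + c = 0
        a, b = y2 - y1, x1 - x2
        c = -(a * x1 + b * y1)
        norm = math.hypot(a, b)
        if norm == 0:
            continue  # the two samples coincide, so no line is defined
        # Step 3: consensus set = points within `tolerance` of the line
        consensus = [
            (px, py) for px, py in points
            if abs(a * px + b * py + c) / norm <= tolerance
        ]
        # Step 4: keep the line with the largest consensus set
        if len(consensus) > len(best):
            best = consensus
    return best


x = [1, 2, 3, 4, 5, 5]
y = [1, 2, 3, 4, 4, 5]

inliers = ransac_line(list(zip(x, y)))
print([p[0] for p in inliers])  # [1, 2, 3, 4, 5]
print([p[1] for p in inliers])  # [1, 2, 3, 4, 5]
```

For noisy real-world data the tolerance would need to be much looser than the near-exact value used here, which only suits points that sit exactly on the line.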

Pascal Getreuer

I haven't published it to PyPI. If you want to do it my way, follow this procedure:

  1. Install git.
  2. Open a terminal and change to the folder where you want to keep your files.
  3. Run the following in the terminal:

git clone https://github.com/Weilory/python-regression

  1. Open the python-regression folder, then copy the regression folder into your base-level directory.

  2. In the base-level directory, which now contains the regression folder, create a test.py.

  3. Paste in the following code:

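import math

from regression.regress import linear_regression

x = [1, 2, 3, 4, 5, 5]
y = [1, 2, 3, 4, 4, 5]

# Fit a line to all the points
expression = linear_regression(x=x, y=y)
print(expression.write)
# y = 1.0 * x + 0.0

my_formula = expression.formula

res_x = []
res_y = []

# Keep only the points that lie on the fitted line
for xi, yi in zip(x, y):
    # isclose avoids false mismatches caused by floating-point rounding
    if math.isclose(my_formula(xi), yi):
        res_x.append(xi)
        res_y.append(yi)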

print(res_x)
print(res_y)
# [1, 2, 3, 4, 5]
# [1, 2, 3, 4, 5]
  1. Run test.py from the terminal: `python test.py`

Make sure numpy is installed globally on your machine.

Weilory