14

This is an interview question: "Find all collinear points in a given set".

As I understand, they ask to print out the points, which lie in the same line (and every two points are always collinear). I would suggest the following.

  1. Let's introduce two types Line (pair of doubles) and Point (pair of integers).
  2. Create a multimap : HashMap<Line, List<Point>)
  3. Loop over all pairs of points and for each pair: calculate the Line connecting the points and add the line with those points to the multimap.

Finally, the multimap contains the lines as the keys and a list collinear points for each line as its value.

The complexity is O(N^2). Does it make sense ? Are there better solutions ?

Gareth Rees
  • 64,967
  • 9
  • 133
  • 163
Michael
  • 41,026
  • 70
  • 193
  • 341
  • 1
    I'm getting a feeling that this is not complete problem description. Some part was lost. – Nikita Rybak Dec 29 '10 at 21:08
  • 2
    @Michael Exactly as you said, n^2 solution is trivial and there's no better: result size is proportional to n^2. – Nikita Rybak Dec 29 '10 at 21:14
  • BTW, you can't represent any line with a pair of integers, you'll need a pair of real numbers. – Nikita Rybak Dec 29 '10 at 21:15
  • @Nikita. Yes, you are right: the line type is a pair of real numbers. I will fix the post. – Michael Dec 29 '10 at 21:22
  • @Nikita. The problem is probably not very difficult but it is just a phone screening question. The phone screen should be easy anyway. – Michael Dec 29 '10 at 21:25
  • 2
    @Michael Yeah, but this is the problem not requiring any thinking or algorithm. It's like asking to sum up squares of values in the NxN matrix. How do you solve it? You go through all values and add each squared to the result. – Nikita Rybak Dec 29 '10 at 21:32
  • Ok, Nikita. You convinced me :) It is interesting that another question of the same phone screen (count inversions in a given array) is more difficult. They probably start their screen from trivial questions (like this) and raise the bar. – Michael Dec 29 '10 at 21:47
  • Probably you should refer to this [post][1]. [1]: http://stackoverflow.com/questions/2734301/given-a-set-of-points-find-if-any-of-the-three-points-are-collinear – Ivan Xiao Nov 07 '11 at 01:47
  • Refer to http://stackoverflow.com/questions/4179581 for solution with O(n^2) complexity. – Jatin Sanghvi Feb 28 '12 at 17:04
  • 1
    Using a pair of doubles won't hash very well: the same line, but with a difference in the 12th decimal place of one of the doubles will hash somewhere else. – Evgeni Sergeev May 15 '15 at 09:04
  • I was asked this question too in ** interview – aerin Oct 12 '17 at 18:54
  • To find out all lines itself requires a minimum of O(n^2); n*(n-1)/2 combinations and possible lines are available. This is the best complexity possible for this problem timewise. – Karthik Kumar Viswanathan Oct 16 '17 at 03:33
  • Actually it is possible to represent the line with three integers. Two of them represent normalized (divided by GCD) vector between the points. The last one is cross product of the vector and one of the points. – fdermishin Jul 15 '18 at 19:03
  • Take a look at the two methods described in http://www.cs.princeton.edu/courses/archive/spring03/cs226/assignments/lines.html . To find 4 or more collinear points given N points, the brute-force method has a time complexity of O(N^4), but a better method does it in O(N^2 log N). – wsw Jan 06 '14 at 15:12

3 Answers3

7

Collinear here doesn't really make sense unless you fix on 2 points to begin with. So to say, "find all collinear points in a given set" doesn't make much sense in my opinion, unless you fix on 2 points and test the others to see if they're collinear.

Maybe a better question is, what is the maximum number of collinear points in the given set? In that case, you could fix on 2 points (just use 2 loops), then loop over the other points and check that the slope matches between the fixed points and the other points. You could use something like this for that check, assuming the coordinates are integer (you could change parameter types to double otherwise).

// ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
// Returns whether 3 points are collinear
// ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
bool collinear(int x1, int y1, int x2, int y2, int x3, int y3) {
  return (y1 - y2) * (x1 - x3) == (y1 - y3) * (x1 - x2);
}

So the logic becomes:

int best = 2;
for (int i = 0; i < number of points; ++i) {
  for (int j = i+1, j < number of points; j++) {
    int count = 2;
    for (int k = 0; i < number of points; ++k) {
      if (k==i || k==j)
        continue;

      check that points i, j and k are collinear (use function above), if so, increment count.
    }
    best = max(best,count); 
  }
}
dcp
  • 54,410
  • 22
  • 144
  • 164
  • Well, for this problem you could do better, in O(n^2*logn), but I'm not sure if I should post an answer to the modified question :) but +1 anyway. – Nikita Rybak Dec 29 '10 at 21:16
  • @Nikita - It was just the first thing I thought of. The other idea I had was using a bitmask and testing all possible sets of points from original sets, but I wasn't sure how big his original set of points was. But hey, you're the red TC algo guy, so your word has more weight than mine, I'd say post your answer if you have better ideas :). – dcp Dec 29 '10 at 21:20
  • Let's ask author if he approves changing the subject :) Or I could email you, the idea is real simple. – Nikita Rybak Dec 29 '10 at 21:35
  • Nikita, can you to describe your idea? – Guy Fawkes Sep 19 '12 at 13:15
  • 3
    Wouldn't this algo run in O(n^3)? – Divick Feb 14 '13 at 12:01
  • 2
    Why is calculating slope between i and j important if it never userd later on? – Maggie Nov 23 '13 at 20:13
  • @Maggie - Thanks, you are correct that is isn't needed. I removed that line. Sorry for the mistake. – dcp Nov 30 '13 at 13:48
5

solve the problem in dual plane, convert all the points in to line segments. And then run a plane sweep to report all the intersections. The lines in the dual plane will pass through same point represents collinear points. And the plane sweep can be done in O(nlogn) time.

sandeep
  • 51
  • 1
  • 1
  • 1
    One might want to add, that constructing the dual line arrangement takes O(n^2). – user695652 Feb 03 '14 at 19:11
  • @user695652 Constructing the dual lines takes O(n), as there is one line per point, and the parameters of the line are functions of the parameters of the point. However, how many intersections are there between n lines in the worst case? Consider a grid, (n/2)*(n/2), so it's BigTheta(n^2), so the plane sweep cannot be done in O(n lg n) time. – Evgeni Sergeev May 15 '15 at 09:02
  • @Evgeni Sergeev Constructing the line arrangement! takes time O(n^2), since as you pointed out the combinatorial complexity is already \Omega(n^2). Also since the original problem is 3SUM hard, its unlikely that a sub quadratic algorithm exists. – user695652 May 15 '15 at 14:04
  • @user695652 Yes, the word "arrangement" was the point of confusion. – Evgeni Sergeev May 15 '15 at 14:08
  • To be clear, 3+ lines passing through the same point would represent collinear points? – micycle Jun 16 '22 at 11:01
0

Are you sure your analysis of the runtime is correct? You say to loop over all the pairs of points, of which there are n*(n-1)/2, i.e. O(n^2). Then you add the line and the pair of points to the map. However, I don't think the time to add such a line + point combination is constant. This means overall your time is O(n^2 log n) not a constant times n^2, which is what O(n^2) means.

So the real question would be, given that it can be done in time O(n^2 log n) can it be done in time O(n^2). Clearly O(n^2) gives a lower bound as you must at the very least print out every pair of points, of which there are O(n^2). My feeling is this is a completely general sorting problem and that one cannot expect better than O(n^2 log n) in general. However a full proof of that fact may be non-trivial.

Another thing to beware of is that the set may contain zero or one points, and so you will need to check that your algorithm handles these cases correctly, otherwise write special cases for those.

Bill
  • 286
  • 1
  • 3
  • 10
  • The lower bound of $O(n^2)$ seems tight according to Chapter 2 of this [Ph.D. thesis](http://web.engr.illinois.edu/~jeffe/pubs/pdf/Thesis.pdf). One $O(n^2)$ algorithm is given in this paper: H. Edelsbrunner, J. O'Rourke, and R. Seidel. Constructing arrangements of lines and hyperplanes with applications. SIAM J. Comput., 15:341. – Meng Lu Jul 26 '14 at 06:21