0

I'm working on a problem out of Cracking The Coding Interview that asks: Given a 2-D graph with points on it, find a line which passes the most number of points.

The solution is to: Draw an infinite line between every two points and, using a hash table, track which line is most common. To find the most common line, we iterate through all line segments using a hash table to count the number of times we've seen each line.

The author goes on to say there's a complication: "we're definining two lines to be equal if the lines have the same slope and y-intercept. We are then, furthermore, hashing the lines based on these values (specifically based on the slope). The problem with floating point numbers cannot always be represented accurately in binary. We resolve this by checking if two floating point numbers are within an epsilon value of each other."

Here's where I'm confused. Even if the slope is a floating point, we can't use that as a hash key? If so, why not just hash the slope as a string instead? Why do we need to introduce into our code hashing based upon keys that are within epsilon of each other?

segue_segway
  • 1,490
  • 1
  • 22
  • 40
  • Because floating point numbers in computers are (in general) an approximation so depending on the method that you use to arrive at a number there might be minuscule differences between numbers which mathematically speaking should be the same. Have a look at https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html – london-deveoper Jan 31 '17 at 04:55

2 Answers2

-1

Have a look at the following example written in c++.

#include <stdio.h>
#include <stdlib.h>

int main() {
  double a=10.0;
  double b=a/3;
  double c=(b-3)*3;

  printf("a: %20.50lf\n", a);
  printf("b: %20.50lf\n", b);
  printf("c: %20.50lf\n", c);
  return 0;
}

'c' should be equal to 1 but due to floating point rounding the above code produces the following.

a: 10.00000000000000000000000000000000000000000000000000
b: 3.33333333333333348136306995002087205648422241210938
c: 1.00000000000000044408920985006261616945266723632812
london-deveoper
  • 533
  • 3
  • 8
-1
  1. The algorithm you are describing does not need any hash table.

    Use histogram instead. This Answer is exact example of this task in C++

  2. If you still want to use floats as keys

    Then you need to truncate them so they can be compared as binary. For example let assume you got (assuming C++ syntax):

    const float da=1.5*M_PI/180.0; // [rad] comparison precision
    float a;
    a=atan2(dy,dx); // [rad] your line angle from deltas
    a=floor(a/da);  // [da] truncated angle
    

    Where dx,dy is your line delta and da is your comparison precision angle. Now to access float a as binary for hashing purposes you can simply do this:

    union { float f32; DWORD u32; } b;
    b.f32=a;
    // here b.u32 is your hash key
    
Community
  • 1
  • 1
Spektre
  • 49,595
  • 11
  • 110
  • 380