How to find the closest point to the first point on the data file? C++

Question

My data file looks like below:

x              y               z
0.068472     -0.024941       0.028884
....         ....            ....
continued, there are more than 100 points.

I want to find the closest point among all the data points to point 1 (in this case (0.068472, -0.024941, 0.028884)). My code to read the file is below. What function should I add to find the closest point to point 1? Should I use the minimum function to find the minimum distance between point 1 and the others? I am not sure how to write this in code.

// Program to read an input file 

#include <iostream>
#include <fstream>
#include <string>
#include <algorithm>
using namespace std;

int main() {
    const int MAXI = 1000;
    double x, y, z, xcoordinates[MAXI], ycoordinates[MAXI], zcoordinates[MAXI];
    int i, count;

    ifstream inFile("input-week6-ad-q4-2.txt"); // ifstream function to read the file
    string line, c; // To read the characters

    if (inFile.is_open()) {
        getline(inFile, line); // To read the header of the input file then     discard it
        getline(inFile, line);

        i = 0;
        count = 0;
        while (inFile >> x >> y >> z) {
            xcoordinates[count] = x;
            ycoordinates[count] = y;
            zcoordinates[count] = z;
            count = count + 1;
        }

        for (i = 0; i < count; i++) {
            cout << xcoordinates[i] << " " << ycoordinates[i] << " " << zcoordinates[i] << "\n";
        }

        inFile.close();
    } else {
        cout << "The file could not be opened." << "\n"; // To check for any error
    }

    system("pause");
    return 0;
}

I'd start with defining a `Point` structure, instead of storing the coordinates in 3 separate arrays. Next step is to implement a function that calculates the distance between 2 `Point` variables. — πάντα ῥεῖ, Mar 16 '19 at 01:36
I'm not sure what you mean by point structure. I'm a beginner, could you give me an example? — JJL, Mar 16 '19 at 01:44
`struct Point { int x; int y; int z; };` and have an array `Point points[MAXI];` — πάντα ῥεῖ, Mar 16 '19 at 01:47
If you want to find 1's closest neighbour, there is no point in storing all the points in an array / arrays. That will also allow you to process files of any lengths, while your current approach is bound to consume unused memory for files of less than 1000 lines and -- worse-- fail to those that have more. — dedObed, Mar 16 '19 at 09:45

David C. Rankin · Answer 1 · 2019-03-16T10:20:44.080

The comments offer the right direction. If you are going to write your minimum distance finder in C++, you should start with a simple 2d point class and then derive a class to handle 3d points from that class by adding a 3rd coordinate. If you are simply going to use separate x, y, z coordinates and three separate arrays of double, you might as well write the program in C.

Writing a base class for a 2d point isn't difficult at all. The only thing you need to be mindful of in order to then derive a 3d class from it is to declare your coordinate members as protected, so that all protected members of the 2d point class are available as protected members in the 3d class. (Class members are private by default, and private members of the base are not accessible in the derived class unless it is made a friend.)
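To make the access rule concrete, here is a minimal sketch. The names `base_t`, `derived_t`, `hidden` and `shared` are made up for illustration and are not part of the point classes in this answer.

```cpp
// Hypothetical classes, for illustration only.
class base_t {
  private:
    int hidden = 1;     /* NOT accessible inside derived classes */
  protected:
    int shared = 2;     /* accessible inside derived classes */
  public:
    int get_hidden () const { return hidden; }
};

class derived_t : public base_t {
  public:
    int sum () const {
        /* return hidden + shared;  <- would not compile, hidden is private */
        return get_hidden() + shared;   /* accessor + protected member */
    }
};
```

Calling `sum()` on a `derived_t` object reaches `shared` directly but has to go through the public accessor for `hidden`.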

So what would a minimum 2d point base class look like? Well you would need x, y coordinates, you would need a default constructor to set x and y to 0.0 when the class is constructed, a constructor to take `x` and `y` values, and then a couple of accessor functions to get the `x` and `y` values for use in your distance function.

A minimum 2d point class could be:

/* 2D Cartesian Coordinate Point */
class point2_t {
  protected:        /* allows derived class access to x, y when inherited */
    double x, y;    /* private members would not be accessible */
  public:
    point2_t () { x = 0.0, y = 0.0; }   /* constructors */
    point2_t (const double a, const double b) : x{a}, y{b} { }
    const double& getx () const { return x; }   /* access functions */
    const double& gety () const { return y; }
    double dist (const point2_t& p) {           /* distance function */
        return sqrt ((x-p.getx()) * (x-p.getx()) +
                     (y-p.gety()) * (y-p.gety()));
    }
};

That will allow you to initialize a 2d point with values, get the values currently set and then calculate the distance from some other 2d point. While that will work great, it would still require reading the x and y values from the file and then creating a point by passing the coordinates to the constructor. (You could also write a setx(double x) and corresponding sety() to allow you to change the x, y values.)

It would be really nice to be able to just `cin >> point;` and have it set the x, y values automatically, and to be able to `cout << point;` to output the coordinates. You can do so by overloading the << and >> operators. That makes it really convenient to read and output the coordinate data. To do so you can add the following friend functions inside the class definition (they are declared within the class but are not member functions):

    /* overload output and input operators */
    friend std::ostream& operator << (std::ostream& os, const point2_t& p) {
        os << "(" << p.x << ", " << p.y << ")";
        return os;
    }
    friend std::istream& operator >> (std::istream& is, point2_t& p) {
        is >> p.x >> p.y;
        return is;
    }
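As a rough illustration of what the overloads buy you, the sketch below reads a point from text and writes it back through the same two operators. The class `pt2` and the helper `roundtrip` are hypothetical stand-ins, trimmed down so the example is self-contained. A string stream is used in place of a file so the behavior is easy to check.

```cpp
#include <iostream>
#include <sstream>
#include <string>

/* Hypothetical, trimmed-down 2D point with the two stream overloads. */
class pt2 {
    double x = 0.0, y = 0.0;
  public:
    friend std::ostream& operator << (std::ostream& os, const pt2& p) {
        return os << "(" << p.x << ", " << p.y << ")";
    }
    friend std::istream& operator >> (std::istream& is, pt2& p) {
        return is >> p.x >> p.y;
    }
};

/* Parse one point from text and format it back.
 * Returns an empty string if the text does not hold two numbers. */
std::string roundtrip (const std::string& text) {
    std::istringstream in (text);
    std::ostringstream out;
    pt2 p;
    if (in >> p)        /* uses operator >> */
        out << p;       /* uses operator << */
    return out.str();
}
```

Because the overloads take `std::istream&` and `std::ostream&`, the same code works unchanged with `std::cin`, `std::cout` or a `std::ifstream`.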

Once you have your 2d point class written, all you need to do is derive a 3d point class from it and add a z coordinate and the corresponding functions to handle all three coordinates instead of two. The basic form to derive a class from a base class including the protected members of the base class is:

class derived : public base {
    /* additions */
};

A simple derivation from your 2d point class for a 3d point class (including the overloading << and >> operators) could be:

/* 3D Cartesian Coordinate Point derived from 2D point class */
class point_t: public point2_t {
  protected:
    double z;   /* add z coordinate */
  public:
    point_t () : point2_t (0.0, 0.0), z{0.0} { }    /* default construct */
    /* construct with initializer list */
    point_t (const double a, const double b, const double c) :
                point2_t (a, b), z{c} {}
    const double& getz () const { return z; }       /* add getz accessor */
    double dist (const point_t& p) {                /* extend distance */
        return sqrt ((x-p.getx()) * (x-p.getx()) +
                     (y-p.gety()) * (y-p.gety()) +
                     (z-p.getz()) * (z-p.getz()));
    }
    /* extend operators */
    friend std::ostream& operator << (std::ostream& os, const point_t& p) {
        os << "(" << p.x << ", " << p.y << ", " << p.z << ")";
        return os;
    }
    friend std::istream& operator >> (std::istream& is, point_t& p) {
        is >> p.x >> p.y >> p.z;
        return is;
    }
};

Now you have a 3d point class that can calculate the distance between points. All that remains is creating an instance of the class for your 1st point, and a second temporary instance to read additional points from your file allowing you to compute the distance between the two. (a 3rd instance is handy if you want to save the coordinates for the closest point)

The only caveat with your data file is you need to discard the first line containing the x y z heading. While you can read the line into a string with getline and simply ignore it, C++ also provides a stream function .ignore() which allows you to ignore up to the maximum number of readable characters until a delimiter is reached (the newline in this case). Simply include the limits header and you can then use:

    std::ifstream f (argv[1]);  /* open file stream */
    ...
    /* discard 1st line in file */
    f.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

(either way works)
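For comparison, here is a small sketch of both ways to skip the heading line. The function names are made up for this illustration, and each function takes a `std::istream&` so it can be tried on a string stream as well as on a file.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>

/* Skip the heading with getline, then read the first number. */
double first_after_getline (std::istream& in) {
    std::string header;
    std::getline (in, header);      /* heading is read, then simply unused */
    double v = 0.0;
    in >> v;
    return v;
}

/* Skip the heading with ignore, then read the first number. */
double first_after_ignore (std::istream& in) {
    in.ignore (std::numeric_limits<std::streamsize>::max(), '\n');
    double v = 0.0;
    in >> v;
    return v;
}
```

Both leave the stream positioned at the start of the first data line. The `ignore` version avoids allocating a string for text that is never used.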

There is no need to read all the points in the file into a container to process later just to find the minimum of the distance between the first point and the rest. All you need to do is store the first point (p1 below) and then compute the distance between it and the remaining points, saving the minimum distance (distmin below) found for each subsequent comparison. (you can also save the coordinate of the closest point if you like)

Putting that together in a short main() could look like:

int main (int argc, char **argv) {

    if (argc < 2) { /* validate argument available for filename */
        std::cerr << "error: insufficient input.\n";
        return 1;
    }

    std::ifstream f (argv[1]);  /* open file stream */
    point_t p1, min, tmp;       /* 1st, minimum & temporary points */
    /* initialize minimum distance to maximum allowable */
    double distmin = std::numeric_limits<double>::max();

    /* discard 1st line in file */
    f.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

    if (!(f >> p1)) {   /* read 1st point */
        std::cerr << "error: failed read of p1\n";
        return 1;
    }
    while (f >> tmp) {  /* read remaining points */
        double dist = tmp.dist (p1);    /* get distance from p1 */
        if (dist < distmin) {           /* check less than distmin? */
            distmin = dist;             /* set new distmin */
            min = tmp;                  /* set new closest point */
        }
    }
    /* output results */
    std::cout << "\nclosest point to " << p1 << "\n\n" << min <<
                "\n\ndistance: " << distmin << '\n';
}

The complete example would then be:

#include <iostream>
#include <iomanip>
#include <fstream>
#include <cmath>
#include <limits>

/* 2D Cartesian Coordinate Point */
class point2_t {
  protected:        /* allows derived class access to x, y when inherited */
    double x, y;    /* private members would not be accessible */
  public:
    point2_t () { x = 0.0, y = 0.0; }   /* constructors */
    point2_t (const double a, const double b) : x{a}, y{b} { }
    const double& getx () const { return x; }   /* access functions */
    const double& gety () const { return y; }
    double dist (const point2_t& p) {           /* distance function */
        return sqrt ((x-p.getx()) * (x-p.getx()) +
                     (y-p.gety()) * (y-p.gety()));
    }
    /* overload output and input operators */
    friend std::ostream& operator << (std::ostream& os, const point2_t& p) {
        os << "(" << p.x << ", " << p.y << ")";
        return os;
    }
    friend std::istream& operator >> (std::istream& is, point2_t& p) {
        is >> p.x >> p.y;
        return is;
    }
};

/* 3D Cartesian Coordinate Point derived from 2D point class */
class point_t: public point2_t {
  protected:
    double z;   /* add z coordinate */
  public:
    point_t () : point2_t (0.0, 0.0), z{0.0} { }    /* default construct */
    /* construct with initializer list */
    point_t (const double a, const double b, const double c) :
                point2_t (a, b), z{c} {}
    const double& getz () const { return z; }       /* add getz accessor */
    double dist (const point_t& p) {                /* extend distance */
        return sqrt ((x-p.getx()) * (x-p.getx()) +
                     (y-p.gety()) * (y-p.gety()) +
                     (z-p.getz()) * (z-p.getz()));
    }
    /* extend operators */
    friend std::ostream& operator << (std::ostream& os, const point_t& p) {
        os << "(" << p.x << ", " << p.y << ", " << p.z << ")";
        return os;
    }
    friend std::istream& operator >> (std::istream& is, point_t& p) {
        is >> p.x >> p.y >> p.z;
        return is;
    }
};

int main (int argc, char **argv) {

    if (argc < 2) { /* validate argument available for filename */
        std::cerr << "error: insufficient input.\n";
        return 1;
    }

    std::ifstream f (argv[1]);  /* open file stream */
    point_t p1, min, tmp;       /* 1st, minimum & temporary points */
    /* initialize minimum distance to maximum allowable */
    double distmin = std::numeric_limits<double>::max();

    /* discard 1st line in file */
    f.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

    if (!(f >> p1)) {   /* read 1st point */
        std::cerr << "error: failed read of p1\n";
        return 1;
    }
    while (f >> tmp) {  /* read remaining points */
        double dist = tmp.dist (p1);    /* get distance from p1 */
        if (dist < distmin) {           /* check less than distmin? */
            distmin = dist;             /* set new distmin */
            min = tmp;                  /* set new closest point */
        }
    }
    /* output results */
    std::cout << "\nclosest point to " << p1 << "\n\n" << min <<
                "\n\ndistance: " << distmin << '\n';
}

Example Input File

Generating a few additional random points in the same range as your values would give you a data file with 10 total points to use to validate the program, e.g.

$ cat dat/3dpoints-10.txt
x              y               z
0.068472     -0.024941       0.028884
-0.023238      0.028574      -0.021372
 0.015325     -0.086100       0.011980
-0.028137     -0.025350       0.021614
-0.013860      0.015710      -0.022659
 0.026026     -0.093600       0.019175
 0.010445     -0.098790       0.023332
-0.021594      0.017428      -0.025986
 0.021800     -0.027678       0.017078
-0.016704      0.017951       0.011059

Example Use/Output

Running the program will then locate the closest point to your first point (p1) providing the following answer:

$ ./bin/point_distmin dat/3dpoints-10.txt

closest point to (0.068472, -0.024941, 0.028884)

(0.0218, -0.027678, 0.017078)

distance: 0.0482198

Look things over and let me know if you have questions. cppreference.com is one of the best references (aside from the standard itself). Keep that bookmark handy and take some time to get to know the language and the site.

Deriving a 3-d coordinate from a 2-d coordinate is just wrong. 3d point is-NOT-a 2d point. Try to compute the distance of (0, 0, 100) and (0, 0, -100) through 2-d pointers to them. — dedObed, Mar 16 '19 at 12:48
Why would you try and compute a distance in 3D space using pointers to a 2D class? See [Derived classes](https://en.cppreference.com/w/cpp/language/derived_class) — David C. Rankin, Mar 16 '19 at 20:42
You have said that *3d point is-a 2d point* through the inheritance. So it should be perfectly legit to say `point_2t *p = new point_t();`, right? See the L in [SOLID](https://en.wikipedia.org/wiki/SOLID): "Objects in a program should be replaceable with instances of their subtypes without altering the correctness of that program.". An obvious smell that gives away the problem is that the name of the base `point_2t` is more specific than that of the derivative `point_t`, should be vice versa. And in implementation, the way `z` is treated differently than `x` and `y` is also a smell. — dedObed, Mar 16 '19 at 21:09
No, I've said 3d point class was derived from 2d point class with access to its public and protected members as public and protected members in the 3d point class. There was no intent to make the derived class usable as an alias of the base class. You are free to cut-and-paste my answer into one of your own with improvements to demonstrate a superior approach. — David C. Rankin, Mar 16 '19 at 21:36
Yes, that is what you've said. Then though, there is what the compiler thinks (3d is-a 2d point) and how people understand public inheritance (derivative is-a base, 3d point is-a 2d point). Note also how the `point_2t` is hardly used in the program: Actually, only its trivial constructor and `getx()` and `gety()` are ever called. But ok, if you insist, I'll make an answer on top of yours :-) — dedObed, Mar 16 '19 at 21:47
Sure, I just went and read the [Liskov Substitution Principle](https://web.archive.org/web/20150905081111/http://www.objectmentor.com/resources/articles/lsp.pdf) and do not disagree with what you are saying. I look forward to, and will happily upvote your answer. — David C. Rankin, Mar 16 '19 at 22:02
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/190181/discussion-between-dedobed-and-david-c-rankin). — dedObed, Mar 16 '19 at 23:51
Thank you for the reply but as I'm a beginner, I can't understand how the code works. Is there any other possible way to write the code in a simpler form? — JJL, Mar 17 '19 at 05:35
Yes, if you want to ignore the class derivation, then you can do it like @dedObed did in his answer. That is a much simpler approach. The key here is that if you are not using any of the C++ class features, there isn't much difference between using C++ or C. Take a look at the other answer. — David C. Rankin, Mar 17 '19 at 05:39
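The substitution question debated in the comments above can be reproduced with a short sketch. The classes `p2` and `p3` and the helper `dist_via_base` are hypothetical, trimmed-down stand-ins for the answer's `point2_t` and `point_t`, with `dist` left non-virtual as in the answer.

```cpp
#include <cmath>

/* Hypothetical, trimmed-down 2D point. */
class p2 {
  protected:
    double x, y;
  public:
    p2 (double a, double b) : x{a}, y{b} {}
    double dist (const p2& p) const {
        return std::sqrt ((x-p.x)*(x-p.x) + (y-p.y)*(y-p.y));
    }
};

/* Hypothetical 3D point derived from it; its dist hides p2::dist. */
class p3 : public p2 {
  protected:
    double z;
  public:
    p3 (double a, double b, double c) : p2 (a, b), z{c} {}
    double dist (const p3& p) const {
        return std::sqrt ((x-p.x)*(x-p.x) + (y-p.y)*(y-p.y) +
                          (z-p.z)*(z-p.z));
    }
};

/* Distance as computed through references to the 2D base class:
 * only x and y take part, z is silently ignored. */
double dist_via_base (const p2& a, const p2& b) {
    return a.dist (b);
}
```

For the points (0, 0, 100) and (0, 0, -100), calling `dist` on the `p3` objects gives 200, while `dist_via_base` gives 0 for the very same objects. Whether that counts as correct 2D behavior or as a broken substitution is exactly the point under discussion.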

score 1 · Answer 2 · answered Mar 16 '19 at 23:11

This answer heavily builds on David C. Rankin's. The main() is pretty much copy-pasted with two extra checks, explicit stream closing and some style changes. The chief difference is the way points are stored and thus treated. No inheritance here. And it's only a POD struct anyway.

Data, data, data. You think about the task in terms of points, so you should have a datatype to neatly hold coordinates together as one point:

struct Point3d {
    double x, y, z;
};

To cooperate fluently with C++ i/o streams, let's overload the >> and << operators:

std::ostream& operator << (std::ostream& os, const Point3d& p) {
    os << "(" << p.x << ", " << p.y << ", " << p.z << ")";
    return os;
}

std::istream& operator >> (std::istream& is, Point3d& p) {
    is >> p.x >> p.y >> p.z;
    return is;
}

Finally, we need to compute the distance between two points. Metrics are symmetric by logic and also by definition, so let us reflect it in code and define a simple function to compute Euclidean distance:

double distance(const Point3d &a, const Point3d &b) {
    auto squared = std::pow(a.x-b.x, 2) +
                   std::pow(a.y-b.y, 2) +
                   std::pow(a.z-b.z, 2);
    return sqrt(squared);
}

Then the whole program is:

#include <iostream>
#include <iomanip>
#include <fstream>
#include <cmath>
#include <limits>

struct Point3d {
    double x, y, z;
};

std::ostream& operator << (std::ostream& os, const Point3d& p) {
    os << "(" << p.x << ", " << p.y << ", " << p.z << ")";
    return os;
}

std::istream& operator >> (std::istream& is, Point3d& p) {
    is >> p.x >> p.y >> p.z;
    return is;
}

double distance(const Point3d &a, const Point3d &b) {
    auto squared = std::pow(a.x-b.x, 2) +
                   std::pow(a.y-b.y, 2) +
                   std::pow(a.z-b.z, 2);
    return sqrt(squared);
}

int main(int argc, char **argv) {
    if (argc != 2) {
        std::cerr << "Exactly one argument expected, got " << (argc - 1) << "\n";
        return 1;
    }

    std::ifstream f(argv[1]);
    if (!f.is_open()) {
        std::cerr << "error: failed to open '" << argv[1] << "'\n";
        return 1;
    }

    // discard the header line
    f.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

    Point3d first_pt;
    if (!(f >> first_pt)) {  // read the first point
        std::cerr << "error: failed read of the first point\n";
        return 1;
    }

    bool other_points = false;
    double dist_min = std::numeric_limits<double>::max();
    Point3d closest, current;    
    while (f >> current) {  // loop through the other points
        other_points = true;
        double dist = distance(first_pt, current);
        if (dist < dist_min) {
            dist_min = dist;
            closest = current;
        }
    }
    f.close();

    if (other_points) {
        std::cout << "closest point to " << first_pt <<
                     " is " << closest << " [distance: " << dist_min << "]\n";
    } else {
        std::cout << "There was only one point in the file\n";
    }
}
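A side note on the `distance` function above: since the square root is monotonically increasing, comparing squared distances picks the same closest point, so the `sqrt` call is only needed when the actual distance is reported. A minimal sketch of that variant follows. `distance_squared` and `closer` are names introduced here, and `Point3d` is repeated only so the snippet stands alone.

```cpp
struct Point3d {
    double x, y, z;
};

/* Squared Euclidean distance; no sqrt needed for comparisons. */
double distance_squared(const Point3d &a, const Point3d &b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    const double dz = a.z - b.z;
    return dx*dx + dy*dy + dz*dz;
}

/* True when a is strictly closer to ref than b is. */
bool closer(const Point3d &ref, const Point3d &a, const Point3d &b) {
    return distance_squared(ref, a) < distance_squared(ref, b);
}
```

For a file of around 100 points the saving is negligible, so this is a matter of taste rather than a needed optimization.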

I like your code and pursuant to my comment am glad to UV it. However, I'm still a bit confused about our LSP discussion. I have been running tests declaring 2d pointers to 3d points in my code (including your suggested `0,0,100; 0,0,-100` check) and I arrive at the correct answer each time. (e.g. a zero distance in 2D space and a 200 distance in 3D space) and referencing the base class member functions with a 2d pointer to 3d object work?? — David C. Rankin, Mar 16 '19 at 23:33
Thank you for the reply but as I'm a beginner, I can't understand how the code works. Is there any possible way to write the code in a simpler form? — JJL, Mar 17 '19 at 05:34
@JJL You're provoking me to be mean :-) What is it exactly that you don't understand? — dedObed, Mar 17 '19 at 09:21
I haven't learnt functions such as iomanip and cmath but I'm trying to search them up and understand them. But it seems to be a lot of things that I need to learn to understand this code, is it impossible to write the code only using iostream, fstream and possibly cmath or algorithm? With for and while loops with arrays — JJL, Mar 17 '19 at 15:18
@JJL If there is some specific step/function that keeps you puzzled even after googling ([cppreference](https://en.cppreference.com/w/) is really a good place), then SO will be a great place for a follow-up question. But overall, I'd recommend you to grab a [C++ textbook](https://stackoverflow.com/a/388282/9703830) (that answer is a true gem) and read it first. C++ is not really a language one could grasp from "zero to fluency" just by examples. — dedObed, Mar 17 '19 at 15:28
Okay thank you. I have a question: how do I set argv[1] as the filename? I think the code does not work as it cannot read the data file. — JJL, Mar 18 '19 at 01:41
@JJL That is very nicely covered in [this question](https://stackoverflow.com/questions/3024197/what-does-int-argc-char-argv-mean). Which -- at least for me -- came first when Googling "c++ argv". — dedObed, Mar 18 '19 at 01:46
Never mind, I didn't know where to edit the command line argument, now the code works — JJL, Mar 18 '19 at 17:28
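For the simpler form asked about in the comments above, here is a sketch that sticks to plain arrays, a `for` loop and `cmath`, in the style of the question's code. The function name `closest_to_first` is made up, and the three arrays are assumed to be filled the way `xcoordinates`, `ycoordinates` and `zcoordinates` are in the question.

```cpp
#include <cmath>

// Returns the index of the point closest to point 0,
// or -1 when there are fewer than two points.
int closest_to_first(const double xs[], const double ys[],
                     const double zs[], int count) {
    int best = -1;              // index of the closest point found so far
    double best_dist = 0.0;     // its distance to point 0
    for (int i = 1; i < count; i++) {
        double dx = xs[i] - xs[0];
        double dy = ys[i] - ys[0];
        double dz = zs[i] - zs[0];
        double dist = std::sqrt(dx*dx + dy*dy + dz*dz);
        if (best == -1 || dist < best_dist) {
            best = i;
            best_dist = dist;
        }
    }
    return best;
}
```

In the question's program this could be called after the read loop as `closest_to_first(xcoordinates, ycoordinates, zcoordinates, count)`. If the file has a single heading line directly followed by data, only one `getline` is needed before the loop, because a second one would discard the first data point.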

hdvt · Answer 3 · 2019-03-16T04:55:22.870

0

You can calculate the Euclidean distances of two points in 3 dimensions (point 1 vs the other points), then compare them to find the closest point. The formula could be found on Wiki: https://en.wikipedia.org/wiki/Euclidean_distance
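Written out for this problem, with point 1 as the reference and n points in the file (the symbols are chosen here for illustration), the distance to the i-th point and the index of the closest point are:

```latex
d(p_1, p_i) = \sqrt{(x_i - x_1)^2 + (y_i - y_1)^2 + (z_i - z_1)^2},
\qquad
i^{*} = \operatorname*{arg\,min}_{2 \le i \le n} \, d(p_1, p_i)
```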

edited Mar 16 '19 at 04:55

answered Mar 16 '19 at 04:49
