0

I am trying to read a file of the following format

id1 1 2 3
id2 2 4 6
id3 5 6 7
...

using this code

Dataset::Dataset(ifstream &file) {
    string token;
    int i = 0;
    while (!file.eof() && (file >> token)){
        // read line tokens one-by-one
        string ID = token;
        vector<int> coords;
        while ((file.peek()!='\n') && (!file.eof()) && (file >> token)) {
            coords.push_back(atoi(token.c_str()));
        }
        points.push_back(new Point(ID, coords));
        i++;
    }
    cout << "Loaded " << i << " points." << endl;
}

But it tells me I have read 0 points. What am I doing wrong?

Edit: I am openning this using input_stream.open(input_file) and file.good() returns true.

Edit #2: actually .good() returns true the first time and then false. What is that all about?

Edit #3: GUYS. IT'S FREAKING WINDOWS. When i put the path as Dataset/test.txt by cin it works and when I do it like Dataset\test.txt by the commandline it doesn't...

Now the problem is it seems not stop at new lines!

Edit #4: Freaking windows again! It was peeking '\r' instead of '\n'.

Michael
  • 325
  • 3
  • 14
  • 2
    It sounds like you may need to learn how to use a debugger to step through your code. With a good debugger, you can execute your program line by line and see where it is deviating from what you expect. This is an essential tool if you are going to do any programming. Further reading: [How to debug small programs](http://ericlippert.com/2014/03/05/how-to-debug-small-programs/) and [Debugging Guide](http://idownvotedbecau.se/nodebugging/) – NathanOliver Nov 04 '21 at 16:15
  • Maybe the file is not open. Maybe you put your data file in the wrong location or named it wrongly. Maybe the file does not have the contents you expect. Debugging should help you. Set a breakpoint in `Dataset::Dataset(ifstream &file)` and debug your code 1 line at a time after the breakpoint is hit. Look at the variables and flow after each statement is executed. – drescherjm Nov 04 '21 at 16:17
  • 1
    Why are you storing `Point*` instead of `Point` in the vector? – Ted Lyngmo Nov 04 '21 at 16:23
  • I want the points to be on the heap once and not get copied when passing them by value. – Michael Nov 04 '21 at 16:24
  • I can't use CLion's debugger right now because I am using the terminal with WSL (Ubuntu) and CLion won't detect MinGW or whatever – Michael Nov 04 '21 at 16:28
  • 1
    @Michael "_I want the points to be on the heap once and not get copied when passing them by value_" - Why would you need pointers in the vector to avoid copies? The only copies you'll get is if the vector resizes - but you'll hardly notice that. – Ted Lyngmo Nov 04 '21 at 16:30
  • how else would you do it without needing to create a temp object on the stack and then pass it by value in .push_back()? – Michael Nov 04 '21 at 16:32
  • 2
    There is vector::emplace_back – drescherjm Nov 04 '21 at 16:35
  • and there is also `std::move` - [example](https://godbolt.org/z/1PPsjhrsn) – Ted Lyngmo Nov 04 '21 at 16:36
  • @Michael What you are concerned with is absolutely premature optimization. Using pointers for this is _not_ what you want. – Ted Lyngmo Nov 04 '21 at 16:39
  • Is it because it won't be continuous in memory? I guess you re right! I am switching to emplace_back()! – Michael Nov 04 '21 at 16:42
  • Why the call to `peek()`? There are other ways to read your data. – Thomas Matthews Nov 04 '21 at 16:43
  • Regarding your edit: `Dataset\test.txt` - use `Dataset/test.txt`. That should work in most environments. Windows included. – Ted Lyngmo Nov 04 '21 at 16:43
  • Where's the definition of `Point`? Did you consider overloading `operator>>`? – Thomas Matthews Nov 04 '21 at 16:44
  • No. It only has a constructor. How do I stop the inside while-loop at '\n'? – Michael Nov 04 '21 at 16:51
  • @Michael Is the number of `int`s after the ID fixed? Is it always 3? – Ted Lyngmo Nov 04 '21 at 17:06
  • Yes, it's a matrix but for some stupid reason its dimensions are not to be read apriori. – Michael Nov 04 '21 at 17:09
  • But I mean, is it _always_ 3 - in all files you will ever read? If so, you can just make a fixed array of 3 or 3 `int`s. – Ted Lyngmo Nov 04 '21 at 17:10
  • No it could be different in different input files. That's why I went for a vector. It's the same for the same file. – Michael Nov 05 '21 at 14:54

4 Answers4

3

Here's an idea: overload operator>>:

struct Point
{
    int x, y, z;
    friend std::istream& operator>>(std::istream& input, Point& p);
};

std::istream& operator>>(std::istream& input, Point& p)
{
    input >> p.x;
    input >> p.y;
    input >> p.z;
    input.ignore(10000, '\n'); // eat chars until end of line.
    return input;
}

struct Point_With_ID 
  : public Point
{
    std::string id;
    friend std::istream& operator>>(std::istream& input, Point_With_ID& p);
};


std::istream& operator>>(std::istream& input, Point_With_ID& p)
{
    input >> p.id;
    input >> static_cast<Point&>(p);  // Read in the parent items.
    return input;
}

Your input could look like this:

std::vector<Point_With_ID> database;
Point_With_ID p;
while (file >> p)
{
    database.push_back(p);
}

I separated the Point class so that it can be used in other programs or assignments.

Ted Lyngmo
  • 93,841
  • 5
  • 60
  • 108
Thomas Matthews
  • 56,849
  • 17
  • 98
  • 154
0

You should not use eof() in a loop condition. See Why is iostream::eof inside a loop condition considered wrong? for details. You can instead use the following program to read into the vector of Point*.


#include <iostream>
#include <sstream>
#include <fstream>
#include <vector>
class Point 
{
    public:
        std::string ID = 0;
        std::vector<int> coords;
    Point(std::string id, std::vector<int> coord): ID(id), coords(coord)
    {
        
    }
};
int main()
{
    
    std::vector<Point*> points;
    std::ifstream file("input.txt");
    std::string line;
    int var = 0;
    while (std::getline(file, line, '\n'))//read line by line
    {
        int j = 0;
        std::istringstream ss(line);
        
        std::string ID;
        ss >> ID;
        
        std::vector<int> coords(3);//create vector of size 3 since we already know only 3 elements needed
        
        while (ss >> var) { 
            coords.at(j) = var;
            
            ++j;
        }
        
        points.push_back(new Point(ID, coords));

    }
    std::cout<<points.size()<<std::endl;

    //...also don't forget to free the memory using `delete` or use smart pointer instead
    return 0;
}

The output of the above program can be seen here.

Note that if you're using new then you must use delete to free the memory that you've allocated. This was not done in the above program that i have given since i only wanted to show how you can read the data in your desired manner.

Jason
  • 36,170
  • 5
  • 26
  • 60
0

I managed to make it work by accounting for both '\r' and '\n' endings and ignoring trailing whitespace like this:

Dataset::Dataset(ifstream &file) {
  string token;
  int i = 0;
  while (file >> token){
      // read line tokens one-by-one
      string ID = token;
      vector<int> coords;
      while ((file.peek()!='\n' && file.peek()!='\r') && (file >> token)) {   // '\r' for windows, '\n' for unix
        coords.push_back(atoi(token.c_str()));
        if (file.peek() == '\t' || file.peek() == ' ') {   // ignore these
            file.ignore(1);
        }
      }
      Point p(ID, coords);
      points.emplace_back(p);
      i++;
      // ignore anything until '\n'
      file.ignore(32, '\n');
   }
   cout << "Loaded " << i << " points." << endl;
}

Probably not the best of the solutions suggested but it's working!

Michael
  • 325
  • 3
  • 14
  • This looks like you didn't pay much attention to the advice and answers you've gotten. You are also not using `emplace_back` correctly. That usage _will_ copy the `Point`. You don't need a temporary `Point`. Just do: `points.emplace_back(std::move(ID), std::move(coords));` – Ted Lyngmo Nov 04 '21 at 17:19
  • That's fair. I am just used to using dynamically allocated objects to avoid copies. Uni was strict about C++98 so there probably is no move constructors there. – Michael Nov 05 '21 at 19:39
  • "_Uni was strict about C++98_" - Ok, that will leave a dent - and no, move semantics didn't exist in C++98. – Ted Lyngmo Nov 06 '21 at 06:26
0

You've baked everything up in a complex deserializing constructor. This makes the code hard to understand and maintain.

  • You have a coordinate, so make class for that, we can call it Coord, that is capable of doing its own deserializing.
  • You have a Point, which consists of an ID and a coordinate, so make a class for that, that is capable of doing its own deserializing.
  • The Dataset will then just use the deserializing functions of the Point.
  • Don't limit deserializing to ifstreams. Make it work with any istream.

Deserializing is often done by overloading operator>> and operator<< for the types involved. Here's one way of splitting the problem up in smaller parts that are easier to understand:

struct Coord {
    std::vector<int> data;

    // read one Coord
    friend std::istream& operator>>(std::istream& is, Coord& c) {        
        if(std::string line; std::getline(is, line)) { // read until end of line
            c.data.clear();
            std::istringstream iss(line); // put it in an istringstream
            // ... and extract the values:
            for(int tmp; iss >> tmp;) c.data.push_back(tmp);
        }
        return is;
    }
    // write one Coord
    friend std::ostream& operator<<(std::ostream& os, const Coord& c) {
        if(not c.data.empty()) {
            auto it = c.data.begin();
            os << *it;
            for(++it; it != c.data.end(); ++it) os << ' ' << *it;
        }
        return os;
    }
};
struct Point {
    std::string ID;
    Coord coord;

    // read one Point
    friend std::istream& operator>>(std::istream& is, Point& p) {
        return is >> p.ID >> p.coord;
    }
    // write one Point
    friend std::ostream& operator<<(std::ostream& os, const Point& p) {
        return os << p.ID << ' ' << p.coord;
    }
};
struct Dataset {
    std::vector<Point> points;

    // read one Dataset
    friend std::istream& operator>>(std::istream& is, Dataset& ds) {
        ds.points.clear();
        for(Point tmp; is >> tmp;) ds.points.push_back(std::move(tmp));

        if(!ds.points.empty()) is.clear();
        return is;
    }
    // write one Dataset
    friend std::ostream& operator<<(std::ostream& os, const Dataset& ds) {
        for(auto& p : ds.points) os << p << '\n';
        return os;
    }
};

If you really want a deserializing constructor in Dataset you just need to add these:

    Dataset() = default;
    Dataset(std::istream& is) { 
        if(!(is >> *this))
            throw std::runtime_error("Failed reading Dataset");
    }

You can then open your file and use operator>> to fill the Dataset and operator<< to print the Dataset on screen - or to another file if you wish.

int main() {
    if(std::ifstream file("datafile.dat"); file) {
        if(Dataset ds; file >> ds) {      // populate the Dataset
            std::cout << ds;              // print the result to screen
        }
    }
}

Demo

Ted Lyngmo
  • 93,841
  • 5
  • 60
  • 108