-1

I am looking for help creating a dynamically expanding array to import data from a .csv file. I do not want to check how large the file is and then edit a variable in the source code or prompt the user; I just want the data to be imported and then manipulated in various ways. First, my code as-is:

#include <cstdlib>   // atof
#include <fstream>
#include <sstream>
#include <iostream>

int main()
{

//declare variables and arrays
long rows = 170260;
int cols = 5;
double **rawData = new double*[rows]; //on heap because of size
for(long pi = 0; pi < rows; ++pi) //create an array of pointers
{
         rawData[pi] = new double[cols];
}
char buff[200];
double deltaT;
double carDeltaV;
double *carV = new double[rows]; //on heap because of size

//import raw data
std::cout << "Importing filedata.csv...";

std::ifstream rawInput("filedata.csv");

for(long r = 0; r < rows; ++r)
{
      rawInput.getline(buff, 200);
      std::stringstream ss(buff);

      for(int c = 0; c < cols; ++c) 
      {
            ss.getline(buff, 40, ',');
            rawData[r][c] = atof(buff);
      }
}

std::cout << "Done." << std::endl;

//create speed matrix
carV[0] = 0;

std::cout << std::endl << "Creating speed matrix...";

for (long i = 1; i < rows; ++i) 
{

    deltaT = rawData[i][0] - rawData[i-1][0];
    carDeltaV = rawData[i-1][3] * deltaT;
    carV[i] = carDeltaV + carV[i-1];
}

std::cout << "Done." << std::endl;

//write data to csv file
std::cout << std::endl << "Writing data to file...";

std::ofstream outputData;
outputData.open("outputdata.csv");

for(long r = 0; r < rows; ++r)
{
         outputData << rawData[r][0] << "," << rawData[r][3]/.00981 << ",";
         outputData << carV[r] << std::endl;
}

outputData.close();
std::cout << "Done." << std::endl;

//delete pointers
std::cout << std::endl << "Clearing memory...";

for(long pj = 0; pj < rows; ++pj)
{
         delete [] rawData[pj];
}
delete [] rawData;
delete [] carV;

std::cout << "Done." << std::endl;

std::cin.get();
return 0;

}

Note: The number of columns will always be 5. The number of rows is my unknown. An example of what I will be importing can be seen below:

0.001098633,0.011430004,0.002829004,-0.004371409,0.00162947
0.001220703,0.00606778,0.001273052,0.003497127,0.002359922
0.001342773,0.003104446,-0.000848701,0.012385657,-0.008119254

There is more to it, but this should be enough to understand what I am trying to accomplish. I have read up on vectors a bit, but the concept of a vector-of-vectors is a bit confusing to me, and I have tried to implement it with no success. Also, I read that a deque might be what I am looking for? I have no experience with those, and it seems to me that it may be overkill for my application since I am only appending in one direction to an array of data.

Disclaimer: I am pretty much a novice at C++, so if there are any concepts that you feel would be above my level of skill please let me know so I can read up on it.

Any advice?

Edit: By request, this is how I tried to do this with vectors.

std::vector<double> rawDataRow;
std::vector< std::vector<double> > rawDataMatrix;

//import raw data loop
std::ifstream rawInput("test.csv");

for(int i = 1; i > 0; ) {
          rawInput.getline(buff, 200);
          std::stringstream ss(buff);

          for(int c = 0; c < cols; ++c) {
                  ss.getline(buff, 40, ',');
                  value = atof(buff);
                  rawDataRow.push_back(value);

                  std::cout << rawDataRow[0] << std::endl;
          }
          timeDiff = timeAfter - timeBefore;
          timeBefore = timeAfter;
          timeAfter = rawDataRow[0];

          rawDataMatrix.push_back(rawDataRow);
}

where "i" would be set to 0 at eof.

trincot
  • Use a `std::vector<T>` with a suitable type `T` to hold the elements and use `vector.push_back(value)` to append each record. The class will take care of growing as needed. – Dietmar Kühl Dec 17 '14 at 22:21
  • Can you show your code with vectors? – Anton Poznyakovskiy Dec 17 '14 at 22:27
  • Are you only using the first and fourth column? If so, you can save effort by not converting string to double the other three values per row. – George Houpis Dec 17 '14 at 22:27
  • @DietmarKühl That is what I tried to implement, but was not able to do. How would this be handled in 2D? – snickodonnell Dec 17 '14 at 22:29
  • Actually `deque` sounds like it's perfect for this application precisely because you're only appending at the end. – John Dibling Dec 17 '14 at 22:29
  • `vector<vector<double> > my2dvector;` – John Dibling Dec 17 '14 at 22:30
  • @GeorgeHoupis I am using every column, but did not show what I am doing with that data for clarity. – snickodonnell Dec 17 '14 at 22:35
  • @AntonPoznyakovskiy I edited how I tried to do it with vectors. I am not sure what you mean by access elements of my table randomly. I will always need to refer back to the data in the vector, but I will always start from the beginning and run through until the end. – snickodonnell Dec 17 '14 at 22:36
  • In your vector implementation, `rawDataRow` keeps the data after each iteration of the for loop. Consider `rawDataRow.clear()` after pushing. Or define it inside the loop. – Anton Poznyakovskiy Dec 17 '14 at 22:37
  • @AntonPoznyakovskiy Can I create an array that (temporarily) stores the contents of the row, then store that array's contents to a vector? Or must I do this with a vector? (are vector-of-arrays possible? or can it only be a vector-of-vectors?) – snickodonnell Dec 17 '14 at 22:40

3 Answers

0

To sum up the questions that have arisen in the discussion:

You cannot have a vector of built-in (C-style) arrays, because such arrays are neither copyable nor assignable; see "Correct way to work with vector of arrays". You can have a vector of pointers to arrays, but at this point I wouldn't mess with all the memory handling.
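
As a side note on the vector-of-arrays question from the comments: `std::array` (C++11) is copyable and assignable, so it can be stored in a vector where a built-in array cannot. The following is only a sketch, assuming a C++11 compiler; `kCols`, `Row`, and `parse_row` are illustrative names that do not appear in the question.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

const std::size_t kCols = 5;             // the question states there are always 5 columns
typedef std::array<double, kCols> Row;   // fixed-size row that can live inside a vector

// Parse one comma-separated line into a Row.
// Returns false if the line has fewer than kCols cells or a cell is not numeric.
bool parse_row(const std::string& line, Row& row) {
    std::istringstream cells(line);
    std::string cell;
    for (std::size_t c = 0; c < kCols; ++c) {
        if (!std::getline(cells, cell, ',')) {
            return false;                // ran out of cells
        }
        std::istringstream value(cell);
        if (!(value >> row[c])) {
            return false;                // cell does not start with a number
        }
    }
    return true;
}
```

A caller would keep a `std::vector<Row> table;` and do `Row r; if (parse_row(line, r)) table.push_back(r);` for each line read from the file.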

The best option is to keep your vector-based code, but move the definition of rawDataRow inside the loop so that it starts out empty on every iteration.

std::vector< std::vector<double> > rawDataMatrix;

//import raw data loop
std::ifstream rawInput("test.csv");

for(int i = 1; i > 0; ) {
      std::vector<double> rawDataRow; // constructed empty on every iteration
      rawInput.getline(buff, 200);
      std::stringstream ss(buff);

      // do the rest
}
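
The elided part of the loop could be filled in roughly as follows. This is a sketch rather than the exact code from the question: it swaps the fixed-size `buff` for `std::string` with `std::getline`, which also gives a natural end-of-file test for the loop, and `read_matrix` is a hypothetical helper name.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read every line of `input` into its own row; the outer vector grows as needed.
std::vector< std::vector<double> > read_matrix(std::istream& input) {
    std::vector< std::vector<double> > matrix;
    std::string line;
    while (std::getline(input, line)) {              // false at end of file, so the loop stops
        std::vector<double> row;                     // starts empty on every iteration
        std::stringstream ss(line);
        std::string cell;
        while (std::getline(ss, cell, ',')) {
            row.push_back(std::atof(cell.c_str()));  // same conversion as the question's code
        }
        matrix.push_back(row);
    }
    return matrix;
}
```

It would be called as `std::ifstream rawInput("test.csv"); std::vector< std::vector<double> > rawDataMatrix = read_matrix(rawInput);`, after which `rawDataMatrix.size()` is the number of rows that were read.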
Community
Anton Poznyakovskiy
0

It seems you are making your life just too hard. The key realization is, however, that you always need to check input before using it in some form. Once you do that things fall into place easily.

To conveniently deal with input of a line, the first thing I'd define is this simple manipulator which would ignore a comma:

std::istream& comma(std::istream& in) {
    if (in.eof()) {
        return in; // the previous read consumed the rest of the line: nothing to skip
    }
    in >> std::ws;
    if (in.eof()) {
        return in; // only trailing whitespace was left
    }
    if (in.peek() == ',') {
        in.ignore(); // the happy case: just skip over the comma
    }
    else {
        in.setstate(std::ios_base::failbit); // unhappy: not the end and not a comma
    }
    return in;
}

With this in place, it is fairly easy to read lines and split them into cells:

std::vector<std::vector<double>> result;
for (std::string line; std::getline(in, line); ) {
    std::istringstream lin(line);
    std::vector<double> row;
    for (double d; lin >> d >> comma; ) {
        row.push_back(d);
    }
    if (!lin.eof()) {
        in.setstate(std::ios_base::failbit);
    }
    result.push_back(row);
}
if (!in.eof()) {
    std::cout << "there was an input error\n";
}
else {
    // result contains the result of reading...
}

I haven't tested the code and I'd guess there are typos somewhere but the general approach should just work...
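
The same "check before use" rule applies to the conversion of a single cell: `atof`, as used in the question, returns 0.0 for text it cannot parse and gives no way to tell that apart from a real zero. A minimal sketch of a checked conversion using the end pointer of `std::strtod`; `to_double` is a hypothetical helper name, and range errors are deliberately not handled here.

```cpp
#include <cstdlib>
#include <string>

// Convert `text` to a double. Returns false if the text is empty or if
// anything other than a number remains after the conversion
// (strtod itself skips leading whitespace).
bool to_double(const std::string& text, double& value) {
    const char* begin = text.c_str();
    char* end = 0;
    double parsed = std::strtod(begin, &end);
    if (end == begin || *end != '\0') {
        return false;      // nothing converted, or trailing junk after the number
    }
    value = parsed;
    return true;
}
```

A reader loop could call `to_double(cell, d)` for each cell and stop with an error message as soon as it returns false, instead of silently storing a zero.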

Dietmar Kühl
  • I was definitely making the problem more difficult than it needed to be. With the help of you and the other comments I was able to learn more and (I hope) become a better programmer. – snickodonnell Dec 18 '14 at 14:54
0

First, you should split your program into three parts:

  1. Reading data from an input file
  2. Processing the data
  3. Writing data to an output file

Your main program should basically look like this:

int main() {
  vector<InputRecord> data = read_from_csv("filedata.csv");
  vector<double> speeds = calculate_speeds(data);
  write_to_csv("result.csv", data, speeds);
  return 0;
}

Now you need to define what an InputRecord is. You said that it’s an array of 5 doubles, but that’s not the best description. It should be more like this:

struct InputRecord {
  double timestamp;
  double field2;
  double field3;
  double location;
  double field5;
};

Using this data structure, you can write data[0].timestamp instead of data[0][0], which means you don’t need the comments anymore.

Here is the complete code that I wrote for this task. It does a similar thing to yours and should be good as a starting point. Note that this code doesn’t do explicit memory management at all.

#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

using std::string;
using std::vector;

struct InputRecord {
  double timestamp;
  double field2;
  double field3;
  double location;
  double field5;
};

vector<InputRecord> read_from_csv(const char *filename) {
  std::ifstream in(filename);
  vector<InputRecord> data;

  if (!in.is_open()) {
    throw std::ios_base::failure(string()
        + "cannot open input file \"" + filename + "\".");
  }

  string line;
  while (std::getline(in, line)) {
    InputRecord rec;
    char end_of_line;
    if (std::sscanf(line.c_str(), "%lf,%lf,%lf,%lf,%lf%c",
        &rec.timestamp, &rec.field2, &rec.field3,
        &rec.location, &rec.field5, &end_of_line) != 5) {
      throw std::ios_base::failure(string()
          + "input file \"" + filename + "\" "
          + "contains invalid data: \"" + line + "\"");
    }
    data.push_back(rec);
  }
  if (in.bad()) {
    throw std::ios_base::failure(string() + "error while reading data");
  }
  return data;
}

vector<double> calculate_speeds(const vector<InputRecord> &data) {
  vector<double> speeds;

  speeds.push_back(0.0);
  for (std::size_t i = 1; i < data.size(); i++) {
    double delta_t = data[i].timestamp - data[i - 1].timestamp;
    double delta_s = data[i].location - data[i - 1].location;
    speeds.push_back(delta_s / delta_t);
  }
  return speeds;
}

void write_to_csv(const char *filename, const vector<InputRecord> &data,
    const vector<double> &speeds) {
  std::ofstream out(filename);

  if (!out.is_open()) {
    throw std::ios_base::failure(string()
        + "cannot open output file \"" + filename + "\".");
  }
  for (std::size_t i = 0; i < data.size(); i++) {
    out << data[i].timestamp << "," << speeds[i] << "\n";
  }
  if (out.bad()) {
    throw std::ios_base::failure(string() + "error while writing data");
  }
}

int main() {
  vector<InputRecord> data = read_from_csv("in.csv");
  vector<double> speeds = calculate_speeds(data);
  write_to_csv("out.csv", data, speeds);
  return 0;
}
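
The question's own loop does not differentiate a position; it accumulates `rawData[i-1][3] * deltaT`, which suggests that the fourth column is an acceleration. If that is the case, the speed step could be sketched as a running sum instead. The `Sample` struct and `integrate_speeds` are hypothetical names, and treating column 3 as an acceleration is an assumption taken from the question's code.

```cpp
#include <cstddef>
#include <vector>

struct Sample {            // hypothetical: only the two columns the question's loop uses
    double timestamp;      // column 0
    double acceleration;   // column 3, assumed to be an acceleration
};

// speed[i] = speed[i-1] + acceleration[i-1] * (t[i] - t[i-1]), as in the question's loop.
std::vector<double> integrate_speeds(const std::vector<Sample>& data) {
    std::vector<double> speeds;
    if (data.empty()) {
        return speeds;             // no samples, no speeds
    }
    speeds.reserve(data.size());   // optional: avoids repeated reallocation
    speeds.push_back(0.0);         // the question starts from carV[0] = 0
    for (std::size_t i = 1; i < data.size(); ++i) {
        double delta_t = data[i].timestamp - data[i - 1].timestamp;
        speeds.push_back(speeds[i - 1] + data[i - 1].acceleration * delta_t);
    }
    return speeds;
}
```

With `InputRecord`, the same loop would read `data[i - 1].location` (or a renamed field) in place of `acceleration`; only the body of the speed function changes, while reading and writing stay as they are.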
Roland Illig
  • Thank you for the detailed response. I will try this out today. – snickodonnell Dec 18 '14 at 14:01
  • Roland, I ran the code you provided (altering it to my needs) and it works like a charm. I am truly thankful for your help, and it was great to be able to learn new stuff like this. – snickodonnell Dec 18 '14 at 14:52