There is a more or less standard approach to solving a problem like this. First you analyze WHAT has to be done, then HOW it should be implemented, then you do the IMPLEMENTATION, then you test the UNIT, and at the end you test the full SYSTEM.
So, you have to read a CSV file with some dates and weather data in it. Then you want to operate on this data, do some calculations and so on. The selected language is C++.
Now, how could this be done? C++ is an object-oriented language. You can create objects consisting of data and member functions that operate on this data. We will define a class "WeatherMeasurement" and overload the inserter and extractor operators, because the class, and only the class, should know how its data is written and read. Having done that, input and output become easy.
Example:
WeatherMeasurement wm{};
// Do something
// . . .
//Output
std::cout << wm;
The extractor, which is the core of the question, is a little more tricky. How can this be done?
In the extractor we first read a complete line from a std::istream using the function std::getline. After reading the line, we have a std::string containing "data fields" delimited by commas. The std::string needs to be split up and the contents of the data fields stored.
The process of splitting up strings is also called tokenizing, and the content of a data field is also called a "token". C++ has a standard facility for this purpose: std::sregex_token_iterator.
And because we have something that has been designed for exactly this purpose, we should use it.
This thing is an iterator for iterating over a std::string, hence the "s" in sregex. The first two parameters (begin and end) define the range of input to operate on. Then comes a std::regex describing what should be matched in the input string. The last parameter selects which part is returned:
0 --> give me the text that matched the regex (a positive value n gives the n-th capture group) and
-1 --> give me the text that is NOT matched by the regex, that is, the parts between the matches.
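To make the two selector values concrete, here is a minimal sketch. The helper name `collect` is made up for this illustration and is not part of the standard library; it simply gathers whatever the iterator yields for a given selector.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect everything the token iterator yields.
//   submatch == -1 -> the text between the matches (the data fields)
//   submatch ==  0 -> the matches themselves (here: the separators)
std::vector<std::string> collect(const std::string& text, const std::regex& re, int submatch) {
    std::vector<std::string> result;
    const std::sregex_token_iterator end{};  // default-constructed = end of sequence
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, submatch); it != end; ++it) {
        result.push_back(it->str());
    }
    return result;
}
```

With the separator regex "," and the line "2019/12/01,01:23:34,inout1", the selector -1 should give the three fields, while 0 should give the two commas.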
We can use this iterator to store the tokens in a std::vector. The std::vector has a range constructor, which takes two iterators as parameters and copies the data between the first and the second iterator into the std::vector.
The statement
std::vector tokens(std::sregex_token_iterator(textLine.begin(), textLine.end(), csvSeparator, -1), {});
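defines a variable "tokens", splits up the std::string and puts the tokens into the std::vector. Note that the element type is deduced from the iterator, so "tokens" is a std::vector<std::ssub_match> rather than a std::vector<std::string>. Each element refers to a part of textLine and converts implicitly to std::string. Once the tokens are in the std::vector, we copy them to the data members of our class.
Very simple.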
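If you want the tokens as independent std::string copies, you can spell out the element type. The following is only a sketch: the function name `splitCsvLine` is an assumption for this illustration, and it does not handle quoted fields that contain commas.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split one csv line at the commas.
// Simplification: quoted fields containing commas are NOT handled.
std::vector<std::string> splitCsvLine(const std::string& textLine) {
    static const std::regex separator{ "," };
    // Range constructor: copies everything between the first iterator and
    // the default-constructed end iterator, converting each sub_match
    // into a std::string of its own.
    return std::vector<std::string>(
        std::sregex_token_iterator(textLine.begin(), textLine.end(), separator, -1),
        std::sregex_token_iterator{});
}
```

Because the result owns its strings, it stays valid after the original line has gone out of scope.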
Next step. We want to read from a file. The file also contains a sequence of similar data items: the rows.
And, as above, we can iterate over similar data, whether it comes from a file or from anywhere else. For this purpose C++ has the std::istream_iterator. This is a template: as a template parameter it gets the type of data that it should read and, as a constructor parameter, it gets a reference to an input stream. It doesn't matter whether the input stream is std::cin, a std::ifstream or a std::istringstream. The behaviour is identical for all kinds of streams.
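As a small illustration of that stream independence, here is a sketch with plain ints instead of weather data. The function name `readAll` is made up for this example; it accepts any std::istream.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical helper: read whitespace-separated ints until extraction fails.
// Works unchanged for std::cin, a std::ifstream or a std::istringstream.
std::vector<int> readAll(std::istream& is) {
    return std::vector<int>(std::istream_iterator<int>(is), std::istream_iterator<int>{});
}
```

The std::istream_iterator calls operator >> for its template parameter type, which is why an overloaded extractor for your own class plugs in here directly.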
And since we do not have files on SO, I use (in the example below) a std::istringstream to hold the content of the csv file. But of course you can open a file by defining a std::ifstream csvFile(filename). No problem.
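For a real file, the same idea could look like the sketch below. The name `loadRecords` is an assumption for this illustration, and returning an empty vector when the file cannot be opened is just one possible choice. You may prefer to report the error instead.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: read all records of type Record from a file.
// Record must provide operator >>.
template <typename Record>
std::vector<Record> loadRecords(const std::string& filename) {
    std::ifstream csvFile(filename);
    if (!csvFile) {
        return {};  // Assumption: an unreadable file simply yields no records
    }
    return std::vector<Record>(std::istream_iterator<Record>(csvFile),
                               std::istream_iterator<Record>{});
}
```

With the class from this answer, a call would be something like loadRecords<WeatherMeasurement>("weather.csv"), where the file name is only a placeholder.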
We can now read the complete csv file, split it into tokens and get all the data by simply defining a new variable and again using the range constructor.
std::vector weatherData(std::istream_iterator<WeatherMeasurement>(csvFile), {});
This very simple one-liner will read the complete csv-file and do all the expected work.
Please note: I am using C++17 and can define the std::vector without template arguments. The compiler deduces the argument from the given constructor arguments. This feature is called CTAD ("class template argument deduction").
Additionally, you can see that I do not use the "end()" iterator explicitly.
This iterator is constructed from the empty brace-enclosed initializer list with the correct type: the range constructor of std::vector requires both arguments to have the same type, so the type is taken from the first argument.
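Both points can be checked with a small sketch, again with plain doubles instead of weather data. The function name `parseNumbers` is made up for this example, and C++17 or later is assumed.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: parse whitespace-separated numbers from a string.
std::vector<double> parseNumbers(const std::string& text) {
    std::istringstream iss{ text };
    // CTAD: no template argument given. The element type is deduced from
    // the iterator's value_type. The {} becomes the end iterator.
    std::vector numbers(std::istream_iterator<double>(iss), {});
    // Compile-time check that the deduction gave what we expect
    static_assert(std::is_same_v<decltype(numbers), std::vector<double>>);
    return numbers;
}
```

If the static_assert compiles, the deduced type really is std::vector<double>.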
I added some code in main to show you how to operate on the data. All of this should also be put into a class, but I think I have already answered your basic question.
Please see:
#include <string>
#include <iostream>
#include <vector>
#include <iterator>
#include <regex>
#include <sstream>
#include <numeric>
#include <algorithm>
const std::regex csvSeparator{ "," };
struct WeatherMeasurement {
// Data
std::string date{};
std::string time{};
std::string inout{};
double temperature{};
double humidity{};
// Overload inserter operator
friend std::ostream& operator << (std::ostream& os, const WeatherMeasurement& wm) {
return os << wm.date << " " << wm.time << " " << wm.inout << " " << wm.temperature << " " << wm.humidity;
}
// Overload extractor operator to read csv data
friend std::istream& operator >> (std::istream& is, WeatherMeasurement& wm) {
// We will read one line from the stream
std::string textLine{};
// Read the line and check, if it worked
if (std::getline(is, textLine)) {
// Split the line into tokens
std::vector tokens(std::sregex_token_iterator(textLine.begin(), textLine.end(), csvSeparator, -1), {});
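// We expect 5 tokens for our data members
if (5 == tokens.size()) {
// Ok, data is available. Now put it into our data members
wm.date = tokens[0];
wm.time = tokens[1];
wm.inout = tokens[2];
// Note: std::stod throws if a field is not numeric
wm.temperature = std::stod(tokens[3]);
wm.humidity = std::stod(tokens[4]);
}
else {
// Malformed line: signal failure, so that no stale record is stored
is.setstate(std::ios_base::failbit);
}
}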
return is;
}
};
std::istringstream csvFile{ R"(2019/12/01,01:23:34,inout1,32,50
2019/12/01,01:23:35,inout2,33,51
2019/12/02,02:23:35,inout3,29,48
2019/12/03,03:23:35,inout4,28,47
2019/12/04,04:23:35,inout5,26,51
)" };
int main() {
// Read the complete csv file and get all weather measurements
std::vector weatherData(std::istream_iterator<WeatherMeasurement>(csvFile), {});
// Get the min and max temperature
const auto [min1, max1] = std::minmax_element(weatherData.begin(), weatherData.end(),
[](const WeatherMeasurement& wm1, const WeatherMeasurement& wm2) { return wm1.temperature < wm2.temperature; });
std::cout << "Min/Max Temperature: " << min1->temperature << " / " << max1->temperature << "\n";
// Get the min and max humidity
auto [min2, max2] = std::minmax_element(weatherData.begin(), weatherData.end(),
[](const WeatherMeasurement& wm1, const WeatherMeasurement& wm2) { return wm1.humidity < wm2.humidity; });
std::cout << "Min/Max Humidity: " << min2->humidity << " / " << max2->humidity << "\n";
// Average Temperature. Start with 0.0, so that the sum is accumulated as double
std::cout << "\nAverage Temperature : "
<< std::accumulate(weatherData.begin(), weatherData.end(), 0.0,
[](double init, const WeatherMeasurement& wm1) { return init + wm1.temperature; }) / weatherData.size() << "\n\n";
// Sort by temperature
std::sort(weatherData.begin(), weatherData.end(),
[](const WeatherMeasurement & wm1, const WeatherMeasurement & wm2) { return wm1.temperature < wm2.temperature; });
std::copy(weatherData.begin(), weatherData.end(), std::ostream_iterator<WeatherMeasurement>(std::cout, "\n"));
// Average Humidity
std::cout << "\n\nAverage Humidity : "
<< std::accumulate(weatherData.begin(), weatherData.end(), 0.0,
[](double init, const WeatherMeasurement& wm1) { return init + wm1.humidity; }) / weatherData.size() << "\n\n";
// Sort by humidity
std::sort(weatherData.begin(), weatherData.end(),
[](const WeatherMeasurement & wm1, const WeatherMeasurement & wm2) { return wm1.humidity < wm2.humidity; });
std::copy(weatherData.begin(), weatherData.end(), std::ostream_iterator<WeatherMeasurement>(std::cout, "\n"));
return 0;
}
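As a first step towards moving the calculations out of main: the two averaging statements differ only in the member they read, so they could be folded into one helper. This is only a sketch. The names `averageOf` and `Sample` are made up for this illustration, and `Sample` stands in for WeatherMeasurement to keep the example short.

```cpp
#include <numeric>
#include <vector>

// Hypothetical stand-in for WeatherMeasurement, reduced to the numeric members
struct Sample {
    double temperature{};
    double humidity{};
};

// Hypothetical helper: average of one double member over all records.
// Assumption: an empty input yields 0.0 instead of a division by zero.
template <typename Record>
double averageOf(const std::vector<Record>& data, double Record::* member) {
    if (data.empty()) {
        return 0.0;
    }
    // Initial value 0.0 (not 0), so that the sum is accumulated as double
    const double sum = std::accumulate(data.begin(), data.end(), 0.0,
        [member](double init, const Record& r) { return init + r.*member; });
    return sum / static_cast<double>(data.size());
}
```

With the class from this answer, the call would read averageOf(weatherData, &WeatherMeasurement::temperature).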