
I'm reading a CSV file that has 3 columns. For each column I need to compute the mean, variance, and standard deviation. I can get the output for the first column, but I don't know how to print it for all 3 columns. Thanks.

I tried passing ',' as the delimiter, i.e. while (getline(inNew, line, ',')), but that doesn't work for me.

int main()
{
    ifstream inNew("C:/Users/A.csv");
    accumulator_set<double, stats<tag::mean, tag::variance >> acc;
    if (inNew)
    {
        string line;
        while (getline(inNew, line))
        {
            acc(stod(line));
        }
        cout << "Expected return is: " << mean(acc) << std::endl;
        cout << "Variance: " << variance(acc) << std::endl;
        cout << "Std Dev: " << sqrt(variance(acc)) << std::endl;
    }

    inNew.close();

    system("pause");
    return 0;
}
user2942358

1 Answer

Since you're already using Boost, use boost::split to split each line into its columns, then accumulate each column separately. You'll need one accumulator_set per column.

Code might look something like this:

#include <cmath>    // sqrt
#include <cstdlib>  // system
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/statistics/stats.hpp>
#include <boost/accumulators/statistics/mean.hpp>
#include <boost/accumulators/statistics/variance.hpp>
#include <boost/algorithm/string.hpp>

int main()
{
    using namespace std;
    using namespace boost;
    using namespace boost::accumulators;

    ifstream inNew("C:/Users/A.csv");
    size_t columns = 3;
    vector<accumulator_set<double, stats<tag::mean, tag::variance>>> acc(columns);

    if (inNew)
    {
        string line;
        while (getline(inNew, line))
        {
            vector<string> strs;
            // token_compress_on merges adjacent delimiters, so "1, 2, 3" yields 3 fields
            split(strs, line, is_any_of("\t ,"), token_compress_on);
            if (strs.size() == columns)
            {
                for (size_t i = 0; i < columns; ++i)
                {
                    acc[i](stod(strs[i]));
                }
            }
        }

        for (size_t i = 0; i < columns; ++i)
        {
            cout << "Stats for column " << (i + 1) << endl;
            cout << "Expected return is: " << mean(acc[i]) << endl;
            cout << "Variance: " << variance(acc[i]) << endl;
            cout << "Std Dev: " << sqrt(variance(acc[i])) << endl;
        }
    }

    inNew.close();

    system("pause");
    return 0;
}

Of course, you could make this fancier and more robust by not hardcoding the number of columns.
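As a rough sketch of that idea, the helper below sizes the per-column statistics from the first parseable row instead of a fixed count. To keep it self-contained it does not use Boost: it swaps in the standard library plus Welford's running mean/variance, and it divides by n, which to my understanding matches what tag::variance reports. The names RunningStats and column_stats are hypothetical, and the code assumes plain comma-separated numbers with no quoted fields.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for one accumulator_set: running mean and variance
// of a single column, updated with Welford's online algorithm.
struct RunningStats {
    std::size_t n = 0;
    double mean = 0.0;
    double m2 = 0.0;  // sum of squared deviations from the current mean

    void add(double x) {
        ++n;
        const double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }
    // Population variance (divides by n, not n - 1).
    double variance() const { return n ? m2 / n : 0.0; }
    double stddev() const { return std::sqrt(variance()); }
};

// Hypothetical helper: reads comma-separated rows and takes the column
// count from the first row in which every field parses as a number.
std::vector<RunningStats> column_stats(std::istream& in) {
    std::vector<RunningStats> cols;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (line.empty()) continue;

        std::vector<double> row;
        std::istringstream fields(line);
        std::string field;
        bool ok = true;
        while (std::getline(fields, field, ',')) {
            try {
                row.push_back(std::stod(field));
            } catch (const std::exception&) {  // invalid_argument or out_of_range
                ok = false;
                break;
            }
        }
        if (!ok) continue;     // skip header lines and malformed rows
        assert(!row.empty());  // a non-empty line yields at least one field

        if (cols.empty()) cols.resize(row.size());  // first valid row fixes the width
        if (row.size() != cols.size()) continue;    // skip ragged rows
        for (std::size_t i = 0; i < row.size(); ++i) cols[i].add(row[i]);
    }
    return cols;
}
```

To use it, open the file as before, call column_stats(inNew), and loop over the returned vector printing mean, variance() and stddev() for each entry. Skipping rows that fail to parse is a design choice made here for brevity; reporting them may be preferable for real data.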

kersson
  • +1 You did what I'd have done, only with Spirit. Well, I forgive you :) – sehe Nov 18 '14 at 20:35
  • But here's a more dynamic version using Boost Spirit: **[Live On Coliru](http://coliru.stacked-crooked.com/a/f03e959c2aa1d5da)**. See [How to parse csv using boost::spirit](http://stackoverflow.com/questions/18365463/how-to-parse-csv-using-boostspirit/18366335#18366335) for more advanced CSV support – sehe Nov 18 '14 at 20:52
  • When trying IceSlicer code, Visual Studio gives an error. Why? Error 1 error C4996: 'std::_Copy_impl': Function call with parameters that may be unsafe - this call relies on the caller to check that the passed values are correct. To disable this warning, use -D_SCL_SECURE_NO_WARNINGS. See documentation on how to use Visual C++ 'Checked Iterators' c:\program files (x86)\microsoft visual studio 12.0\vc\include\xutility – user2942358 Nov 18 '14 at 21:28
  • @user2942358 It's a harmless warning that originates from within `boost::is_any_of`. See [here](http://stackoverflow.com/questions/1301277/c-boost-whats-the-cause-of-this-warning) and [here](http://stackoverflow.com/questions/14141476/warning-with-boostsplit-when-compiling). – kersson Nov 18 '14 at 22:09