This question concerns the parsing of values in a Boost::program_options configuration file.

I have a simple custom data structure:

struct Vector {
    double x, y, z;
};

I have an istream deserialiser for the format "(x, y, z)" that I borrowed from another SO post:

// https://codereview.stackexchange.com/a/93811/186081
struct Expect {
    char expected;
    Expect(char expected) : expected(expected) {}
    friend std::istream& operator>>(std::istream& is, Expect const& e) {
        char actual;
        if ((is >> actual) && (actual != e.expected)) {
            is.setstate(std::ios::failbit);
        }
        return is;
    }
};

template<typename CharT>
std::basic_istream<CharT> &
operator>>(std::basic_istream<CharT> &in, Vector &v) {
    in >> Expect('(') >> v.x
       >> Expect(',') >> v.y
       >> Expect(',') >> v.z
       >> Expect(')');
    return in;
}

I am using an instance of Vector as a value store for Boost::program_options:

Vector vector {0.0, 0.0, 0.0};
po::options_description opts("Usage");
opts.add_options()
      ("vector", po::value(&vector), "The vector");

po::variables_map vm;
po::store(po::parse_config_file("config.cfg", opts, true), vm);
po::notify(vm);

The problem is that the configuration file format does not work if the vector value representation contains spaces. For example, this config file parses correctly:

vector = (0.0,1.1,2.2)

However this, with spaces, does not parse:

vector = (0.0, 1.1, 2.2)

Instead, program_options throws:

the argument ('(0.0, 1.1, 2.2)') for option 'vector' is invalid

However, for options that are declared as std::string, spaces seem to be OK:

some_string = this is a string

I found a few posts that mentioned using quotes, however this doesn't seem to work (same error):

vector = "(0.0, 1.1, 2.2)"

A few other posts suggest custom parsers, however I'm not sure how I'd go about implementing this, and it seems like a lot of work just to handle a few spaces.

I assume this behaviour comes from the way command-line options are parsed, even though this is config-file parsing. On a command line, something like --vector (0.0, 1.1, 2.2) would not make much sense (ignoring, for now, that ( and ) are shell-reserved characters).

Is there a good way to handle this?

davidA

2 Answers

No, you can't.

Edit: On second thought, I think you can try modifying the delimiter, as described at https://en.cppreference.com/w/cpp/locale/ctype

program_options uses lexical_cast, which requires that the whole value be consumed by a single operator>>. By default, when the value contains a space, one >> cannot consume all of it, hence the error.

Hence you can do something like:

    struct Optional {
        char optional;
        Optional(char optional):optional(optional){}
        friend std::istream& operator>>(std::istream& is, Optional const& o) {
            char next;
            do{
                next = is.peek();
            }while(next == o.optional && is.get());
            return is;
        }
    };

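    // Note: this facet derives from std::ctype<wchar_t>, so imbuing it only
    // replaces the wide-character facet; a narrow std::istream keeps using
    // std::ctype<char>.
    struct vector_ctype : std::ctype<wchar_t> {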
        bool do_is(mask m, char_type c) const {   
            if ((m & space) && c == L' ') {
                return false; // space will NOT be classified as whitespace
            }
            return ctype::do_is(m, c); // leave the rest to the parent class
        } 
    };


    template<typename CharT>
    std::basic_istream<CharT> &
    operator>>(std::basic_istream<CharT> &in, Vector &v) {    
        std::locale default_locale = in.getloc();
        in.imbue(std::locale(default_locale, new vector_ctype()));
        in >> Expect('(') >> Optional(' ') >> v.x >> Optional(' ')
           >> Expect(',') >> Optional(' ') >> v.y >> Optional(' ')
           >> Expect(',') >> Optional(' ') >> v.z >> Optional(' ')
           >> Expect(')');
        in.imbue(default_locale);
        return in;
    }
    int main()
    {
        Vector v = boost::lexical_cast<Vector>("(1,  2,  3)");
        std::cout << v.x << "," << v.y << "," << v.z << std::endl;
    }

Output:

1,2,3

This should give you the correct output in program_options.
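
To see the effect in isolation, here is a minimal standard-library sketch. It assumes that the conversion runs on a stream with skipws turned off and then requires the whole input to be consumed, which is my understanding of what lexical_cast does for stream-based types. The names parses_fully, read_strict and read_lenient are made up for this illustration, and Optional is a simplified version of the helper above. No locale or custom facet is involved.

```cpp
#include <istream>
#include <sstream>
#include <string>

struct Vector { double x, y, z; };

// Same Expect helper as in the question.
struct Expect {
    char expected;
    Expect(char expected) : expected(expected) {}
    friend std::istream& operator>>(std::istream& is, Expect const& e) {
        char actual;
        if ((is >> actual) && (actual != e.expected)) {
            is.setstate(std::ios::failbit);
        }
        return is;
    }
};

// Consumes zero or more occurrences of one character (here: spaces).
struct Optional {
    char optional;
    Optional(char optional) : optional(optional) {}
    friend std::istream& operator>>(std::istream& is, Optional const& o) {
        while (is.peek() == std::char_traits<char>::to_int_type(o.optional)) {
            is.get();
        }
        return is;
    }
};

// The question's parser: relies on the stream skipping whitespace.
std::istream& read_strict(std::istream& in, Vector& v) {
    return in >> Expect('(') >> v.x >> Expect(',') >> v.y
              >> Expect(',') >> v.z >> Expect(')');
}

// Variant that skips spaces explicitly, so it does not depend on skipws.
std::istream& read_lenient(std::istream& in, Vector& v) {
    return in >> Expect('(') >> Optional(' ') >> v.x >> Optional(' ')
              >> Expect(',') >> Optional(' ') >> v.y >> Optional(' ')
              >> Expect(',') >> Optional(' ') >> v.z >> Optional(' ')
              >> Expect(')');
}

// Hypothetical stand-in for the check described above (an assumption about
// lexical_cast, not its actual code): no whitespace skipping, and nothing
// may be left in the stream after the read.
template <typename Reader>
bool parses_fully(const std::string& text, Reader read, Vector& out) {
    std::istringstream in(text);
    in.unsetf(std::ios::skipws);
    return read(in, out) && in.get() == std::char_traits<char>::eof();
}
```

Under that assumption, parses_fully("(0.0, 1.1, 2.2)", read_strict, v) fails at the first space, while the same text passed with read_lenient succeeds.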

nelson.l
  • Do you know why it works for `std::string` options though? If `vector` is a `std::string` then `vector = (1, 2, 3)` is converted to `"(1, 2, 3)"`, with spaces intact. I was thinking of using this fact to treat the option as a string and then parse it later. – davidA Nov 19 '19 at 21:39
  • @davidA Because internally it is overloaded. Copy for `std::string` (and arrays), and stream for generic type. But I think you can try modifying the delimiter for the istream: https://en.cppreference.com/w/cpp/locale/ctype – nelson.l Nov 20 '19 at 09:03
  • Revisiting this a few years later: this seems to work well, but the bit that puzzles me is that I don't seem to need the `in.imbue()` with locale + custom facet - just the use of the `Optional` struct seems to be sufficient to skip spaces. Is that intended @nelson.l? – davidA Aug 13 '21 at 00:12
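
The idea raised in the comments, storing the option as a string and parsing it afterwards, could look roughly like the sketch below. It assumes the option is declared with po::value<std::string>() and that the helper is called on the stored string after po::notify. The name parse_vector is hypothetical. Because a fresh std::istringstream skips whitespace by default, the question's Expect helper works unchanged.

```cpp
#include <istream>
#include <sstream>
#include <string>

struct Vector { double x, y, z; };

// Same Expect helper as in the question.
struct Expect {
    char expected;
    Expect(char expected) : expected(expected) {}
    friend std::istream& operator>>(std::istream& is, Expect const& e) {
        char actual;
        if ((is >> actual) && (actual != e.expected)) {
            is.setstate(std::ios::failbit);
        }
        return is;
    }
};

// Hypothetical helper: parses "(x, y, z)" from a string that was read as a
// plain std::string option. Returns false on malformed input or if anything
// other than whitespace follows the closing parenthesis.
bool parse_vector(const std::string& text, Vector& out) {
    std::istringstream in(text);  // skipws is on by default
    Vector v{0.0, 0.0, 0.0};
    in >> Expect('(') >> v.x
       >> Expect(',') >> v.y
       >> Expect(',') >> v.z
       >> Expect(')');
    if (!in) {
        return false;
    }
    char extra;
    if (in >> extra) {
        return false;  // trailing non-whitespace characters
    }
    out = v;
    return true;
}
```

The trade-off is that a malformed value is only detected when parse_vector is called, not while program_options parses the file.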

Instead of writing a custom parser, you can write a routine that filters the unwanted characters from the input file, and then pass it to boost::program_options for parsing.

Using a SO answer to Remove spaces from std::string in C++, here is a working example based on your code:

#include <algorithm>
#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
#include <boost/program_options/options_description.hpp>
#include <boost/program_options/parsers.hpp>
#include <boost/program_options/variables_map.hpp>

struct Vector {
  double x, y, z;
};

// https://codereview.stackexchange.com/a/93811/186081
struct Expect {
    char expected;
    Expect(char expected) : expected(expected) {}
    friend std::istream& operator>>(std::istream& is, Expect const& e) {
        char actual;
        if ((is >> actual) && (actual != e.expected)) {
            is.setstate(std::ios::failbit);
        }
        return is;
    }
};

template<typename CharT>
std::basic_istream<CharT> &
operator>>(std::basic_istream<CharT> &in, Vector &v) {
    in >> Expect('(') >> v.x
       >> Expect(',') >> v.y
       >> Expect(',') >> v.z
       >> Expect(')');
    return in;
}

std::stringstream filter(const std::string& filename)
{
  std::ifstream inputfile(filename);

  std::stringstream s;
  std::string line;
  while (std::getline(inputfile, line))
    {
      auto end_pos = std::remove(line.begin(), line.end(), ' ');
      line.erase(end_pos, line.end());
      s << line << '\n';
    }
  s.seekg(0, std::ios_base::beg);

  return s;
}

int main()
{
  namespace po = boost::program_options;

  po::options_description opts("Usage");
  opts.add_options()
    ("vector", po::value<Vector>(), "The vector");

  auto input = filter("config.cfg");
  po::variables_map vm;
  po::store(po::parse_config_file(input, opts, true), vm);
  po::notify(vm);

  auto read_vec = vm["vector"].as<Vector>();
  std::cout << "vector is : {" << read_vec.x << ", " << read_vec.y << ", " << read_vec.z << "}" << std::endl;

  return 0;
}

To test the program, you have to create a file config.cfg containing, for example:

vector = (1,  2, 3)

(The extra spaces are intentional, to test the routine.)

With this, the output of the program is:

vector is : {1, 2, 3}
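
One caveat: the filter above removes every space in the file, so a line such as some_string = this is a string would lose the spaces in its value as well. A possible refinement, shown here only as a sketch, is to drop spaces only between parentheses. The names strip_spaces_in_parens and filter_stream are made up for this example, and it assumes that parentheses appear only around vector values.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Removes spaces that occur between '(' and ')', leaving the rest of the
// line untouched.
std::string strip_spaces_in_parens(const std::string& line) {
    std::string out;
    int depth = 0;  // > 0 while inside parentheses
    for (char c : line) {
        if (c == '(') {
            ++depth;
        } else if (c == ')' && depth > 0) {
            --depth;
        }
        if (c == ' ' && depth > 0) {
            continue;  // skip spaces inside a parenthesised value
        }
        out += c;
    }
    return out;
}

// Applies the filter line by line; the result can be handed to
// po::parse_config_file in place of the unfiltered stream.
std::stringstream filter_stream(std::istream& input) {
    std::stringstream s;
    std::string line;
    while (std::getline(input, line)) {
        s << strip_spaces_in_parens(line) << '\n';
    }
    return s;
}
```

With this variant, vector = (1,  2, 3) becomes vector = (1,2,3), while some_string = this is a string is passed through unchanged.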
francesco