
The test program below uses the named-capture support in Boost.Regex to extract the year, month and day fields from a date (just to illustrate the use of named captures):

#include <boost/regex.hpp>
#include <boost/regex/icu.hpp>

#include <string>
#include <iostream>

int main(int argc, const char** argv)
{
   std::string   str = "2013-08-15";
   boost::regex  rex("(?<year>[0-9]{4}).*(?<month>[0-9]{2}).*(?<day>[0-9]{2})");
   boost::smatch res;

   std::string::const_iterator begin = str.begin();
   std::string::const_iterator end   = str.end();

   if (boost::regex_search(begin, end, res, rex))
   {
      std::cout << "Day:   " << res ["day"] << std::endl
                << "Month: " << res ["month"] << std::endl
                << "Year:  " << res ["year"] << std::endl;

   }
}

Compiled with

g++ regex.cpp -lboost_regex -lboost_locale -licuuc

As expected, this little program produces the following output:

$ ./a.out 
Day:   15
Month: 08
Year:  2013

Next I replace the ordinary regex parts with their u32regex counterparts:

#include <boost/regex.hpp>
#include <boost/regex/icu.hpp>

#include <string>
#include <iostream>

int main(int argc, const char** argv)
{
   std::string   str = "2013-08-15";
   boost::u32regex  rex = boost::make_u32regex("(?<year>[0-9]{4}).*(?<month>[0-9]{2}).*(?<day>[0-9]{2})", boost::regex_constants::perl);
   boost::smatch res;

   std::string::const_iterator begin = str.begin();
   std::string::const_iterator end   = str.end();

   if (boost::u32regex_search(begin, end, res, rex))
   {
      std::cout << "Day:   " << res ["day"] << std::endl
                << "Month: " << res ["month"] << std::endl
                << "Year:  " << res ["year"] << std::endl;

   }
}

Building and running the program now results in a run-time assertion failure that suggests an uninitialized shared_ptr:

$ ./a.out 
a.out: /usr/include/boost/smart_ptr/shared_ptr.hpp:648: typename boost::detail::sp_member_access<T>::type boost::shared_ptr<T>::operator->() const [with T = boost::re_detail::named_subexpressions; typename boost::detail::sp_member_access<T>::type = boost::re_detail::named_subexpressions*]: Assertion `px != 0' failed.

I'm not using shared pointers directly though.

This is with boost 1.58.1 and gcc 5.3.1.

How can I get the u32regex version of the program to run correctly as well?

Geert Janssens
  • Using u32 makes the engine convert all encoded strings to u32, so maybe smatch expects overloads m["..."] in u32, which is probably a bug. Assuming that is where the exception is, try using the regular group number as a test. Also, do it in debug mode, put a breakpoint at the cout << and using the stack trace, go to the code that generated the exception, put a breakpoint there, then rerun it. –  Feb 08 '16 at 18:36
  • Just for grins, make sure `BOOST_HAS_ICU` preprocessor is defined in your program. –  Feb 08 '16 at 18:47
  • @sin I added #if defined BOOST_HAS_ICU / cout << "got it"; / #endif to the top of the file. It doesn't print "got it", so I presume BOOST_HAS_ICU is not defined. I can define it manually while building (added -DBOOST_HAS_ICU). Then "got it" is printed. However the error message remains exactly the same. – Geert Janssens Feb 10 '16 at 13:50
  • @sin Also, replacing m["..."] with m[0] and friends makes the error go away. That's the workaround I'm currently using. However it makes my code more elaborate, because in my real code I work with several regexes and the positions of the matches I'm interested in do not always appear in the same order. I'll see if I can debug it as per your suggestions. – Geert Janssens Feb 10 '16 at 13:57

0 Answers