I'm trying to write regular expressions in C++ to validate XML files and extract the strings stored between tags.
This is one of the expressions I'm aiming for:
"<[^/]*?>"
This doesn't work, however. Neither does something simpler like this:
"<[a-z]*>"
However, this produces a match:
"<.*>"
It looks as though bracket expressions (character classes such as [a-z]) can't be matched.
Below is the relevant part of the code I'm using:
string testString = "<test>";
regex xmlRegOpenTag("<[^/]*?>", regex_constants::extended);
smatch smOpen;
cout << regex_match(testString, smOpen, xmlRegOpenTag) << endl;
string openCap = smOpen[0];
cout << "openCap: " << openCap << endl;
I've tried other flags, such as regex_constants::basic, but nothing seems to work. I'm compiling with gcc version 4.7.3.
To those mentioning that I shouldn't be parsing XML using regex: I only need to parse XML files that I've created myself, so it isn't a problem.
I'm using the C++11 standard. In my header file, I include <regex> like this:
#include <regex>
using namespace std;
When using the first expression ("<[^/]*?>"), I get:
terminate called after throwing an instance of 'std::regex_error'
what(): regex_error
Abort
When using the second expression ("<[a-z]*>"), I get:
0
openCap:
When using the third expression ("<.*>"), I get:
1
openCap: <test>
This is the information I can provide about the compiler I'm using:
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.7/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.7.3-1ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs --enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.7 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --with-system-zlib --enable-objc-gc --with-cloog --enable-cloog-backend=ppl --disable-cloog-version-check --disable-ppl-version-check --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.7.3 (Ubuntu/Linaro 4.7.3-1ubuntu1)