From the following text I want to extract the number and the unit of measurement.
I have 2 possible cases:
This is some text 14.56 kg and some other text
or
This is some text kg 14.56 and some other text
I used | (alternation) to match both cases.
My problem is that it produces empty submatches, which gives me an incorrect number of matches.
This is my code:
std::smatch m;
std::string myString = "This is some text kg 14.56 and some other text";
const std::regex myRegex(
    R"(([\d]{0,4}[\.,]*[\d]{1,6})\s+(kilograms?|kg|kilos?)|\s+(kilograms?|kg|kilos?)(\s+[\d]{0,4}[\.,]*[\d]{1,6}))",
    std::regex_constants::icase
);
if (std::regex_search(myString, m, myRegex)) {
    std::cout << "Size: " << m.size() << std::endl;
    // m[0] is the whole match; m[1]..m[4] are the four capture groups
    for (std::size_t i = 0; i < m.size(); i++)
        std::cout << m[i].str() << std::endl;
}
else
    std::cout << "Not found!\n";
OUTPUT:
Size: 5
kg 14.56


kg
14.56
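For context, here is a minimal sketch of why the size stays at 5. As far as I understand, std::match_results always holds one entry for the whole match plus one per capture group in the pattern, and groups from the alternative that did not match are reported with matched == false and an empty str(). The helper name participating_groups is hypothetical, just for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: for each capture group 1..N, report whether it
// took part in the match. Returns an empty vector if nothing matched.
std::vector<bool> participating_groups(const std::string& text, const std::regex& re) {
    std::vector<bool> flags;
    std::smatch m;
    if (std::regex_search(text, m, re)) {
        // One entry for the whole match plus one per capture group,
        // no matter which side of the | actually matched.
        assert(m.size() == re.mark_count() + 1);
        for (std::size_t i = 1; i < m.size(); ++i)
            flags.push_back(m[i].matched);
    }
    return flags;
}
```

With a simplified pattern such as (\d+)\s+(kg)|(kg)\s+(\d+), the input "x kg 14" should report groups 1 and 2 as not matched and groups 3 and 4 as matched.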
I want an easy way to extract those 2 values, so my guess is that I want the following output:
WANTED OUTPUT:
Size: 3
kg 14.56
kg
14.56
This way I can always directly extract the 2nd and 3rd submatches, although in this case I would also need to check which one is the number. I know how to do it with 2 separate searches, but I want to do it the right way: with a single search, and without using C++ code to check whether a submatch is an empty string.
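One possible direction, sketched under assumptions rather than offered as the definitive way: since the groups of the alternative that did not match come back as empty strings, concatenating the two "number" groups and the two "unit" groups yields both values from a single search with no explicit emptiness check. The helper name extract_weight is hypothetical, and the pattern is a slightly adjusted version of mine (whitespace moved outside the groups so the number has no leading space, and [\d] and [\.,] written as \d and [.,]).

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper: returns {number, unit}, or two empty strings
// when the text contains neither "number unit" nor "unit number".
std::pair<std::string, std::string> extract_weight(const std::string& text) {
    static const std::regex re(
        R"((\d{0,4}[.,]*\d{1,6})\s+(kilograms?|kg|kilos?)|(kilograms?|kg|kilos?)\s+(\d{0,4}[.,]*\d{1,6}))",
        std::regex_constants::icase);
    std::smatch m;
    if (!std::regex_search(text, m, re))
        return {};
    assert(m.size() == 5);  // whole match + 4 capture groups
    // Groups 1 and 2 belong to "number unit", groups 3 and 4 to "unit number".
    // The pair that did not take part is empty, so concatenation picks
    // whichever side actually matched.
    return { m[1].str() + m[4].str(), m[2].str() + m[3].str() };
}
```

Called on either of the two example sentences above, this should give the number as first and the unit as second. The trade-off is that the size is still 5; the concatenation only hides that from the calling code.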