
I'm trying to edit an open source C++ program to make a simple adjustment so that one of its inputs accepts a regular expression instead of a plain string. I'm a complete C++ noob (never written anything), so I'm hoping someone can point me to a function that will work. Take the following code:

#include <iostream>
#include <string>

int main() {
    std::string str1("ABCDEABCABD");
    std::string pattern("A");

    int count1 = 0;

    size_t p1 = str1.find(pattern, 0);
    while(p1 != std::string::npos)
    {
        p1 = str1.find(pattern,p1+pattern.size());
        count1 += 1;
    }

    std::cout << count1 << std::endl;
}

I would like 'pattern' to accept a regular expression made of several patterns separated by the pipe sign, e.g. 'A|D' (which would output 5 in this case).

From what I gather from this C++ reference page, you cannot supply a regular expression like this to the string::find function. What function can I put here instead?

Zoe
Floris
  • Use `std::regex_search`. I am linking to the post showing the code returning multiple matches found in the input string. If you have any trouble implementing that, feel free to drop a comment. – Wiktor Stribiżew Mar 30 '16 at 21:05
  • Sorry I've never used C++ before so I'm having a really hard time here. I see that this function takes an object with a type of std::regex? But the pattern object is supplied as type string like this 'std::string pattern' in the function definition. Can I also loop through it using the p1 != std::string::npos? P.s. I also don't see how my question is a duplicate because the other question isn't about string::find? – Floris Mar 30 '16 at 21:18
  • What is your goal? What exactly do you need? Right now, you ask for a piece of code that will find and count all substrings in a string that match a specific regex pattern. The post I linked to shows how to do it (`std::sregex_iterator` solution looks best IMHO). What else? **`string::find` does not support regex**, so the post I linked to is a valid dupe source. – Wiktor Stribiżew Mar 30 '16 at 21:23
  • Yes, that's exactly what I'm asking. I see how your post might answer my question but I have no clue how to implement your code there into my problem as I've never looked at C++ code until today. For example how do I go from std::string to std::regex? – Floris Mar 30 '16 at 21:31
  • Please provide an [MCVE (minimal, complete, verifiable example)](http://stackoverflow.com/help/mcve). At least some input and expected output. – Wiktor Stribiżew Mar 30 '16 at 21:33
  • I'm struggling with this because I cannot get any code to work on the site I saw you linked in your other post (https://ideone.com/GVPhl9). Everything I try results in a compilation error without a message. Input would be a string, e.g. "ABCDEABCABD", and a pattern string (it has to be a string because that's the argument from the program; it could be converted to whatever is needed), e.g. "A|D", and the output should be the count, in this case 5. – Floris Mar 30 '16 at 22:17
  • See [this demo](http://ideone.com/YzXgQN). – Wiktor Stribiżew Mar 30 '16 at 22:21
  • Thanks a lot, that's almost it. The only thing is that in the function I'm trying to change, "expression" is supplied as a std::string. How can I convert this to a std::regex? See here: http://ideone.com/JILmxf. If you want to make this an official answer I can accept it for you. Thanks. – Floris Mar 30 '16 at 22:27
  • I added an answer. Sorry, I ran out of votes for today, and I need to go to bed. I will upvote the question tomorrow. – Wiktor Stribiżew Mar 30 '16 at 22:35
  • Although yours is a valid question, take into account that [regular expressions are not a good solution most of the time](http://programmers.stackexchange.com/a/223640/197286). You should only use them if they're the right tool, and they aren't here. – 3442 Mar 30 '16 at 22:36
  • @WiktorStribiżew Thanks. I also updated the question to make it more clear (and managed to get working code on IDEONE) – Floris Mar 30 '16 at 22:42
  • @KemyLand what would be better? – Floris Mar 30 '16 at 22:42
  • @KemyLand if you have solution in mind that does not use regex I can ask another question for you to answer. Since I'm doing this several million (billion) times I'm noticing a clear performance decrease using regex vs while loop + string::find. – Floris Mar 31 '16 at 23:45
  • Multiple threads of this code have been running for 12+ hours and are still working, so I'm now looking for a faster solution. New question here: http://stackoverflow.com/questions/36360005/c-multiple-string-matching-without-using-regex – Floris Apr 01 '16 at 15:22

1 Answer


You may leverage the following C++ code:

#include <iostream>
#include <iterator>
#include <regex>
#include <string>

int main() {
    std::string pattern("A|D");         // Regex pattern, supplied as a plain string
    std::regex rx(pattern);             // Build the regex object from that string

    std::string s("ABCDEABCABD");       // The input string to search
    std::ptrdiff_t number_of_matches = std::distance(  // Count the matches the iterator yields
        std::sregex_iterator(s.begin(), s.end(), rx),  // Iterator over all matches in s
        std::sregex_iterator());                       // Default-constructed end-of-sequence iterator

    std::cout << number_of_matches << std::endl;  // Display the result
    return 0;
}

See IDEONE demo

Note that:

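  • If pattern can contain literal strings with regex special characters (such as . or |), those characters need to be escaped with a backslash; otherwise they are interpreted as regex syntax.
  • std::distance returns the number of steps from first to last. Here first iterates over the matches and last is a default-constructed std::sregex_iterator, which marks the end of the sequence, so the result is the number of matches.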
Wiktor Stribiżew
  • what if I want the matched string as well? – psykid Dec 10 '20 at 20:30
  • @psykid Then you should have a look at the right thread, like [How to match multiple results using std::regex](https://stackoverflow.com/questions/21667295/how-to-match-multiple-results-using-stdregex) – Wiktor Stribiżew Dec 10 '20 at 20:33