Given a QString, I want to extract a substring from the main string input.

For example, I have a QString that reads something like:

\\\\?\\Volume{db41aa6a-c0b8-11e9-bc8a-806e6f6e6963}\\

I need to extract the substring (if one exists) that matches the regex (\w){8}([-](\w){4}){3}[-](\w){12}, i.e. the format shown below:

xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

and it should return

db41aa6a-c0b8-11e9-bc8a-806e6f6e6963

if found, else an empty QString.

Currently, I can achieve this by doing something like:

string.replace("{", "").replace("}", "").replace("\\", "").replace("?", "").replace("Volume", "");

But this is tedious and inefficient, and tailored to a specific request.

Is there a generalized function that enables me to extract a substring using a regex format or other?

Update

To clarify after @Emma's answer: I want something like QString::extract("(\w){8}([-](\w){4}){3}[-](\w){12}"), which would return db41aa6a-c0b8-11e9-bc8a-806e6f6e6963.

CybeX

2 Answers

Here are several ways to extract part of a string like the one in the question. I don't know how much of the string format is fixed versus variable, so not all of these examples may be practical. Some of the examples below use the QStringRef class, which can be more efficient but requires the original string (the one being referenced) to remain available while any references are active (see the warning in the docs).

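  // Headers used below (Qt 5): <QString>, <QStringRef>, <QVector>, <QRegExp>,
  // <QRegularExpression>, <QRegularExpressionMatch>, <QDebug>
  const QString str("\\\\?\\Volume{db41aa6a-c0b8-11e9-bc8a-806e6f6e6963}\\");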

  // Treat str as a list delimited by "{" and "}" chars.

  const QString sectResult = str.section('{', 1, 1).section('}', 0, 0);  // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
  const QString sectRxResult = str.section(QRegExp("\\{|\\}"), 1, 1);    // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"

  // Example using QStringRef, though this could also be just QString::split() which returns QString copies.
  const QVector<QStringRef> splitRef = str.splitRef(QRegExp("\\{|\\}"));
  const QStringRef splitRefResult = splitRef.value(1);  // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"

  // Use regular expressions to find/extract matching string

  const QRegularExpression rx("\\w{8}(?:-\\w{4}){3}-\\w{12}");  // match a UUID string (no capture groups needed; group 0 is the whole match)
  const QRegularExpressionMatch match = rx.match(str);
  const QString rxResultStr = match.captured(0);        // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
  const QStringRef rxResultRef = match.capturedRef(0);  // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"

  const QRegularExpression rx2(".+\\{([^{\\}]+)\\}.+");  // capture anything inside { } brackets
  const QRegularExpressionMatch match2 = rx2.match(str);
  const QString rx2ResultStr = match2.captured(1);       // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
  // Make a copy for replace so that our references to the original string remain valid.
  const QString replaceResult = QString(str).replace(rx2, "\\1");   // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"

  qDebug() << sectResult << sectRxResult << splitRefResult << rxResultStr
           << rxResultRef << rx2ResultStr << replaceResult;

Maxim Paperno

Maybe,

Volume{(\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b)}

with the UUID captured in group 1, or just,

\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b

for a full match, might be a bit closer.
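
To make the "return the match, else an empty string" behaviour concrete, here is a minimal sketch of the second expression in use. It deliberately uses std::regex from the C++ standard library rather than Qt so that it compiles without Qt installed; `extractFirstMatch` and `uuidPattern` are hypothetical names introduced only for this illustration, not existing Qt or standard-library functions.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (an assumption for illustration): returns the first
// substring of `text` that matches `pattern`, or an empty string if
// nothing matches.
std::string extractFirstMatch(const std::string& text, const std::regex& pattern)
{
    std::smatch match;
    if (std::regex_search(text, match, pattern))
        return match.str(0);  // group 0 is the whole match
    return std::string();
}

// The lowercase-hex UUID expression from this answer, written as a raw
// string literal so the backslashes need no extra escaping.
const std::regex uuidPattern(
    R"(\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b)");
```

Calling `extractFirstMatch(input, uuidPattern)` on the volume path from the question should give back just the UUID, and an empty string for input that contains none. Note that this character class only accepts lowercase hex digits, unlike the `\w` version in the question. With Qt, the same shape would presumably be `QRegularExpression::match()` followed by `captured(0)`, as in the other answer.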


If you wish to simplify, update, or explore the expression, it is explained in the top-right panel of regex101.com. You can watch the matching steps or modify them in this debugger link, if you're interested. The debugger demonstrates how a regex engine might consume some sample input strings step by step and perform the matching process.


RegEx Circuit

jex.im visualizes regular expressions.

[Image: jex.im visualization of the expression]

Source

Searching for UUIDs in text with regex

Emma
    The regex is fine; I don't have a problem with the regex. I am looking for an efficient way of extracting a substring matching a regex from another string. – CybeX Nov 17 '19 at 19:27