I want to capture the string between the braces that follow a specific selector. For example, I have a string like:

<div id="text-module-container-eb7272147" class="text-module-container"><style>#123-module-container-eb7272147 p{text-color:#211E22;bgcolor:test;} #text-module-container-eb7272147 p{color:#211E1E;} #text-module-container-eb7272147 p{color:#123444;} </style>

Now, if I give the selector `#123-module-container-eb7272147 p`, it should return `text-color:#211E22;bgcolor:test;`.

I am able to get the data between the braces, but not for a specific selector. This is the code I tried: https://regex101.com/r/AESL8q/1

Rikesh
  • Try `/#123-module-container-eb7272147\s+p{([^}]+)}/i` – anubhava Mar 08 '22 at 07:22
  • "*I give selector #123-module-container-eb7272147 p it*" - "*it*" being? As tagged [tag:jquery] if you give that as a selector to jquery, it won't return any elements (given *only* the provided string) as there's no `p` element. – freedomn-m Mar 08 '22 at 07:25
  • Slightly simplified regex, but you will need to use groups, not matches: `#123-module-container-eb7272147 p{(.+?)}` https://regex101.com/r/obIoHN/1 – freedomn-m Mar 08 '22 at 07:29
  • @freedomn-m: It is almost the same regex that I posted in my comment. It is just that `[^}]+` is a lot more efficient than `.+?`. – anubhava Mar 08 '22 at 07:39
  • @anubhava sorry, no idea about efficiency (unlikely to matter unless 10mill+ searches), just *slightly* shorter and with a regex101 link – freedomn-m Mar 08 '22 at 07:52
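
The capture-group approach from the comments above can be sketched as follows. This is only an illustration: the variable names (`html`, `negated`, `lazy`, `viaNegated`, `viaLazy`) are made up for the example, and the input is the sample string from the question. It shows that the wanted text is in the first group rather than in the full match, and that both variants give the same result on this input.

```javascript
// Sample input taken from the question.
const html =
  '<div id="text-module-container-eb7272147" class="text-module-container"><style>' +
  '#123-module-container-eb7272147 p{text-color:#211E22;bgcolor:test;} ' +
  '#text-module-container-eb7272147 p{color:#211E1E;} ' +
  '#text-module-container-eb7272147 p{color:#123444;} </style>';

// Negated character class: the group cannot run past the first closing brace.
const negated = /#123-module-container-eb7272147\s+p{([^}]+)}/i;
// Lazy dot: the group grows one character at a time until a closing brace follows.
const lazy = /#123-module-container-eb7272147 p{(.+?)}/;

// Index 0 is the whole match (selector and braces included);
// index 1 is the first capture group, i.e. the text between the braces.
const viaNegated = html.match(negated);
const viaLazy = html.match(lazy);

console.log(viaNegated[1]); // text-color:#211E22;bgcolor:test;
console.log(viaLazy[1]);    // text-color:#211E22;bgcolor:test;
```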

1 Answer

You can use a positive lookbehind containing your selector and the opening brace, then match every character that is not a closing brace, and (optionally) add a positive lookahead for the closing brace:

/(?<=#123-module-container-eb7272147 p\{)[^}]+(?=\})/

  • The positive lookbehind is written `(?<= )`.
  • In the selector, you have to escape the characters that are special in a regex; for a class selector, for instance, the dot must be escaped. The opening brace is escaped as well.
  • The part you want between the braces is `[^}]+`, meaning any character except the closing brace, one or more times. Adding a question mark after the `+` would make it lazy, but that is not necessary here; it would only be needed if you used the dot to match any character.
  • The positive lookahead is written `(?= )`.
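
As a minimal sketch of the points above, assuming a JavaScript engine that supports lookbehind: the names `css`, `pattern`, `match` and `declarations` are only illustrative, and the input is a shortened version of the sample from the question.

```javascript
// Shortened sample based on the <style> content from the question.
const css =
  '#123-module-container-eb7272147 p{text-color:#211E22;bgcolor:test;} ' +
  '#text-module-container-eb7272147 p{color:#211E1E;}';

// (?<=...) asserts that the selector and "{" come right before the match,
// [^}]+    consumes everything up to (not including) the closing brace,
// (?=\})   asserts that "}" follows, without making it part of the match.
const pattern = /(?<=#123-module-container-eb7272147 p\{)[^}]+(?=\})/;

const match = css.match(pattern);
// The lookarounds consume nothing, so index 0 already holds only the declarations.
const declarations = match ? match[0] : null;

console.log(declarations); // text-color:#211E22;bgcolor:test;
```

Since nothing outside the braces is consumed, no capture group is needed and the full match can be used directly.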

You can test it here:

/**
 * Escape characters which have a meaning in a regular expression.
 * 
 * @param string The string you need to escape.
 * @returns The escaped string.
 */
function escapeRegExp(string) {
    return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

let button = document.querySelector('#extract');

button.addEventListener('click', function(event) {
  let html = document.querySelector('#html').value;
  let selector = document.querySelector('#selector').value;
  // Backslashes are doubled because the pattern is built from a string literal.
  let pattern = new RegExp('(?<=' + escapeRegExp(selector) + '\\s*\\{)[^}]+(?=\\})');
  let matches = pattern.exec(html);
  if (matches) {
    alert("The extracted CSS rules:\n\n" + matches[0]);
  }
  event.preventDefault();
});
html, body {
  font-family: Arial, sans-serif;
  font-size: 14px;
}

fieldset {
  min-width: 30em;
  padding: 0;
  margin: 1em 0;
  border: none;
  display: flex;
}

label {
  margin-right: 1em;
  width: 6em;
}

input[type="text"],
textarea {
  width: calc(100% - 7em);
  min-width: 20em;
  margin: 0;
  padding: .25em .5em;
}

input[type="submit"] {
  margin-left:  7.1em;
  padding: .2em 1em;
}
<form action="#">
  <fieldset>
    <label for="selector">Selector: </label>
    <input type="text" id="selector" name="selector"
           value="#123-module-container-eb7272147 p">
  </fieldset>
  <fieldset>
    <label for="html">HTML code:</label>
    <textarea id="html" name="html" cols="30" rows="10">&lt;div id=&quot;text-module-container-eb7272147&quot; class=&quot;text-module-container&quot;&gt;&lt;style&gt;#123-module-container-eb7272147 p{text-color:#211E22;bgcolor:test;} #text-module-container-eb7272147 p{color:#211E1E;} #text-module-container-eb7272147 p{color:#123444;} &lt;/style&gt;&lt;div style=&quot;background-color: rgb(168, 27, 219); color: rgb(33, 30, 30);&quot;&gt;&lt;span style=&quot;color:#3498db;&quot;&gt;Click the edit button to replace this conte&lt;/span&gt;nt with your own.&lt;/div&gt;&lt;/div&gt;</textarea>
  </fieldset>
  <fieldset>
    <input type="submit" id="extract" value="Extract the CSS rules">
  </fieldset>
</form>

Or play with it here: https://regex101.com/r/N5cVKq/1

Patrick Janser
  • Thanks Patrick, that works like a charm. Is there any way I can dynamically replace the `#123-module-container-eb7272147 p` part in the pattern? – Rikesh Mar 08 '22 at 08:09
  • You can build your pattern with `let pattern = new RegExp(...)` and quickly create a function to escape your selector. See the solution here: https://stackoverflow.com/a/6969486/653182 – Patrick Janser Mar 08 '22 at 08:14
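
A DOM-free sketch of that suggestion could look like the following. `extractDeclarations`, `css` and `tricky` are hypothetical names introduced for this example; `escapeRegExp` follows the same approach as the helper in the answer. The second input is invented to show why the selector needs escaping: without it, the dot in `.a p` would also match the `x` in `xa p`.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: returns the declarations of the first rule whose
// selector text equals `selector`, or null if there is no such rule.
function extractDeclarations(css, selector) {
  // Backslashes are doubled because the pattern is built from string literals.
  const pattern = new RegExp(
    '(?<=' + escapeRegExp(selector) + '\\s*\\{)[^}]+(?=\\})'
  );
  const match = pattern.exec(css);
  return match ? match[0] : null;
}

// Shortened sample based on the question.
const css =
  '#123-module-container-eb7272147 p{text-color:#211E22;bgcolor:test;} ' +
  '#text-module-container-eb7272147 p{color:#211E1E;}';

console.log(extractDeclarations(css, '#123-module-container-eb7272147 p'));
// text-color:#211E22;bgcolor:test;
console.log(extractDeclarations(css, '#text-module-container-eb7272147 p'));
// color:#211E1E;
console.log(extractDeclarations(css, '#missing p'));
// null

// Invented input: the escaped dot keeps ".a p" from matching "xa p".
const tricky = 'xa p{color:red;} .a p{color:blue;}';
console.log(extractDeclarations(tricky, '.a p'));
// color:blue;
```

Note that this only does a textual match on the selector; it does not parse CSS, so nested braces or comments containing braces are outside what it handles.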