
I have a slightly unusual requirement for replacing parts of a string using regex, so bear with me.

I have an input string...

Input

<section className={some-class}>

Un-touched stuff here

</section>

<hr />

... and I'd like to replace parts of the string so that the output ends up like this...

Output

<!-- some-class -->

Un-touched stuff here

<hr />

some-class could be anything, so I need to match and replace the section markup on either side of the class name.

Also, the <hr /> represents any other HTML, which I also don't want to touch.

I have the regex below so far, but it's not quite right because it also matches the < and /> of the <hr />.

RegEx

\<section className\=|\{|\}|\<|\/|section|\>
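
To make the problem concrete, here is a minimal sketch of what that alternation does. It assumes the pattern is applied with the global flag and an empty replacement string, neither of which is stated above; the variable names are only for illustration.

```javascript
// The alternation from the question, applied globally (the g flag is an assumption).
const pattern = /\<section className\=|\{|\}|\<|\/|section|\>/g;

const input = "<section className={some-class}>\nUn-touched stuff here\n</section>\n<hr />";

// Each alternative is tried on its own, so the bare `<`, `/` and `>`
// branches also fire on the <hr /> tag and strip its delimiters.
const stripped = input.replace(pattern, "");

console.log(JSON.stringify(stripped));
```

The `<hr />` tag loses its `<`, `/` and `>` because those branches are not tied to the section tag in any way.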
Paul
  • This might be sufficient: [`<section[^>]*>([\s\S]*?)<\/section>`](https://regex101.com/r/hbkLp8/2) if the untouched stuff does not contain nested section tags. – wp78de Mar 11 '20 at 10:52
  • You might want to [read about](https://stackoverflow.com/a/1732454) the limitations of regex when applied to HTML/XML. – Scott Sauyet Mar 11 '20 at 13:37
  • Can `<section>`s be nested? If so, can they be deeply nested? Can they have additional attributes besides `className`? – Scott Sauyet Mar 11 '20 at 13:45

3 Answers

1

I hope the following helps you out:

Test Regex here.

  1. < matches <
  2. ([^ ]*) matches and captures everything before the next space (in this case it captures section)
  3. [^=]*?= matches everything up until and including the next =
  4. { matches {
  5. ([^}]*?) matches and captures everything up until the next }
  6. }> matches }>
  7. ([^<]*) matches everything up until the next <
  8. <\/ matches </
  9. \1 matches the text captured in step 2 (section)
  10. > matches >
  11. \s* matches all whitespace characters

let str = `<section className={CLASS_A}>

  Un-touched stuff here

</section>

<hr />

<section className={CLASS_B}>

  Un-touched stuff here

</section>`;
let reg = /<([^ ]*)[^=]*?={([^}]*?)}>([^<]*)<\/\1>\s*/g
console.log(str.replace(reg, "<!-- $2 -->$3"));
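
One thing worth checking with this pattern is step 7: `([^<]*)` stops at the first `<`, so the body of the section cannot contain any tags of its own. The sketch below illustrates that with two made-up inputs (`plain` and `withMarkup` are hypothetical, not taken from the question).

```javascript
// Same pattern as above.
const reg = /<([^ ]*)[^=]*?={([^}]*?)}>([^<]*)<\/\1>\s*/g;

// Body is plain text: the pattern matches and the wrapper is replaced.
const plain = "<section className={A}>text</section>";

// Body contains a <b> tag: [^<]* stops at it, the closing tag is never
// reached, and the string comes back unchanged.
const withMarkup = "<section className={A}>some <b>bold</b> text</section>";

const plainResult = plain.replace(reg, "<!-- $2 -->$3");
const markupResult = withMarkup.replace(reg, "<!-- $2 -->$3");

console.log(plainResult);
console.log(markupResult);
```

If the untouched content may contain other HTML, a lazy `([\s\S]*?)` in place of `([^<]*)` is one possible adjustment, with the usual caveat about nested section tags.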
  • It wasn't me who downvoted, but I might have, as link-only answers are very much frowned on here. https://stackoverflow.com/help/how-to-answer – Scott Sauyet Mar 11 '20 at 13:33
  • I put a bunch of code there too. Js is not my native tongue, but I think this should do. – Harm van der Wal Mar 11 '20 at 13:39
  • Code-only answers are also frowned upon, I'm afraid. A good answer will often have some useful code, but should always have some explanatory text demonstrating how and why your technique solves the problem. – Scott Sauyet Mar 11 '20 at 13:43
0

Regex: /<section\s+className={([^}]+)}\s*>([\s\S]*)<\/section>\s*/

See Regex Demo

  1. <section Match <section
  2. \s+ Match one or more whitespace characters
  3. className= Match className=
  4. { Match {
  5. ([^}]+) Capture Group 1: one or more non-} characters
  6. } Match }
  7. \s* Zero or more whitespace characters
  8. > Match >
  9. ([\s\S]*) Capture Group 2: zero or more characters of any type (whitespace or non-whitespace)
  10. <\/section> Match </section>
  11. \s* Match zero or more whitespace characters

let str = `<section className={xxxx}>

Un-touched stuff here

</section>

<hr />
`;

let regex = /<section\s+className={([^}]+)}\s*>([\s\S]*)<\/section>\s*/;
console.log(str.replace(regex, '<!-- $1 -->\n$2'));
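
Note that `[\s\S]*` is greedy, which matters once the input contains more than one section. The following is a rough comparison on a made-up two-section input; the lazy variant with the `g` flag is an adjustment for illustration, not part of the regex above.

```javascript
// Hypothetical input with two sections separated by other markup.
const input = [
  "<section className={A}>first</section>",
  "<hr />",
  "<section className={B}>second</section>",
].join("\n");

// Greedy body (as in the regex above): runs to the LAST </section>.
const greedy = /<section\s+className={([^}]+)}\s*>([\s\S]*)<\/section>\s*/;

// Lazy body plus the g flag: stops at the NEAREST </section>, once per section.
const lazy = /<section\s+className={([^}]+)}\s*>([\s\S]*?)<\/section>\s*/g;

const greedyResult = input.replace(greedy, "<!-- $1 -->\n$2");
const lazyResult = input.replace(lazy, "<!-- $1 -->\n$2");

console.log(greedyResult);
console.log(lazyResult);
```

With the greedy pattern the second opening tag ends up inside capture group 2 and is left in the output; with the lazy pattern each section is handled separately. The trailing `\s*` also consumes the newline after each closing tag.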
Booboo
0

Apologies, I missed two vital pieces of information about my problem.

  • The input will actually have multiple section tags
  • The className is an enum so it will be inside curly braces
enum CLASS_A = "class-a"
enum CLASS_B = "class-b"

Input

<section className={CLASS_A}>

  Un-touched stuff here

</section>

<hr />

<section className={CLASS_B}>

  Un-touched stuff here

</section>

I've referenced both of the solutions above, and thanks to @Booboo and @wp78de, with some small adjustments, the regex below seems to solve the problem.

https://regex101.com/r/hbkLp8/4

If there's a neater way please do share.

/<section\s+className\=\{([^>]*)\}>([\s\S]*?)<\/section>\s*/gm

Output

<!-- class-a -->


Un-touched stuff here

<hr />

<!-- class-b -->


Un-touched stuff here

<hr />
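
For completeness, here is a minimal sketch of applying that regex from JavaScript. The replacement itself only lives in the regex101 link, so the replacer function, the `classNames` lookup table and the `unwrapSections` name are assumptions made for illustration. The lookup resolves the captured enum name (for example CLASS_A) to its value.

```javascript
// Hypothetical lookup table mirroring the enums described above.
const classNames = new Map([
  ["CLASS_A", "class-a"],
  ["CLASS_B", "class-b"],
]);

// Regex from this answer; the m flag is dropped because the pattern
// has no ^ or $ anchors for it to affect.
const sectionPattern = /<section\s+className\=\{([^>]*)\}>([\s\S]*?)<\/section>\s*/g;

// Replace each section wrapper with a comment and keep its body as-is.
function unwrapSections(source) {
  return source.replace(sectionPattern, (match, name, body) => {
    // Fall back to the raw captured name when it is not in the table.
    const label = classNames.has(name) ? classNames.get(name) : name;
    return `<!-- ${label} -->${body}`;
  });
}

const sample =
  "<section className={CLASS_A}>\n\n  Un-touched stuff here\n\n</section>\n\n" +
  "<hr />\n\n" +
  "<section className={CLASS_B}>\n\n  Un-touched stuff here\n\n</section>";

console.log(unwrapSections(sample));
```

A replacer function is used instead of a `$1` replacement string only so the enum name can be mapped; if the raw name is acceptable in the comment, `"<!-- $1 -->$2"` would do.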

Paul