
I'm aware this is a weird question, but I am terrible at writing regexes.

The problem is fairly simple: I have a bunch of plain text coming in, and that text contains mentions of React components.

For example:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris fringilla maximus, sed < HelloThere /> velit porttitor sed. Fusce lacinia bibendum eros, a ultricies leo sodales eget.

I need to create a regex that lets me extract that unknown React component so I can then wrap it in some markup automatically.

So for the example above, the regex would return: "< HelloThere />"

The tricky part is that it can be any React component, and the component can also have props and children. Something like this could appear in the text as well: < Component>< Box>< Inline>< Text>Hello</ Text></ Inline></ Box></ Component>

My initial idea was to find the opening "<" and the closing "/>" and take everything in between, but I have no real clue how to actually go about doing that.
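As a rough illustration of that initial idea, here is a minimal sketch that only handles self-closing components. It assumes a component name starts with a capital letter and that the tag itself contains no nested "<" or ">" (so a prop holding an arrow function or inline JSX would break it). The names `SELF_CLOSING` and `findSelfClosing` are made up for this example.

```javascript
// Assumption: "<", optional whitespace, a capitalised name, then any
// characters except "<" or ">" up to the closing "/>".
const SELF_CLOSING = /<\s*[A-Z]\w*[^<>]*\/>/g;

// Returns every self-closing component tag found in the text.
function findSelfClosing(text) {
  return text.match(SELF_CLOSING) || [];
}

const sample = 'sed < HelloThere /> velit <Box size="2"/> sed';
console.log(findSelfClosing(sample)); // logs the two self-closing tags
```

Components with children, such as the nested example above, are not matched by this pattern at all.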

Any help is much appreciated!

PS: I added spaces after the opening angle brackets so SO doesn't try to mess with them.

Edit:

It's becoming clear to me that regex might be too limited for this. I might need to figure out a clever JavaScript way, or add some tag or symbol at the beginning and end of each component so I can look it up more easily.
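One possible shape for such a "clever JavaScript way" is sketched below: a regex is used only to find individual tags, and a depth counter in plain JavaScript tracks nesting so that a component is returned together with its children. This is a sketch under assumptions, not a JSX parser: component names are assumed to start with a capital letter, prop values are assumed not to contain "<" or ">", and closing tag names are not checked against their opening tags. `TAG` and `extractComponents` are hypothetical names.

```javascript
// One tag: optional "/" (closing tag), a capitalised name, anything up to
// ">", and an optional "/" right before ">" (self-closing tag).
const TAG = /<\s*(\/?)\s*([A-Z]\w*)[^<>]*?(\/?)>/g;

// Returns the source text of every top-level component, children included.
function extractComponents(text) {
  const results = [];
  let depth = 0;   // how many component tags are currently open
  let start = -1;  // index where the current top-level component began
  for (const m of text.matchAll(TAG)) {
    const closing = m[1];
    const selfClosing = m[3];
    if (closing) {
      if (depth === 0) continue; // stray closing tag, ignore it
      depth -= 1;
      if (depth === 0) results.push(text.slice(start, m.index + m[0].length));
    } else if (selfClosing) {
      if (depth === 0) results.push(m[0]);
    } else {
      if (depth === 0) start = m.index;
      depth += 1;
    }
  }
  return results;
}

console.log(extractComponents(
  "a < HelloThere /> b <Component><Box>Hello</Box></Component> c"
));
```

Lowercase tags such as `<p>` are not matched by `TAG`, so they simply stay part of the surrounding component's text.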

jansmolders86

2 Answers


This regex matches every valid component tag. If the component has children, it matches only the opening tag; the second alternative additionally picks up plain text that sits directly between two tags.

<[A-Z]\w*\b.*?>|(?<=>)(\w+)(?=<)

A valid component name starts with a capital letter and is followed by any number of word characters. The tag may also contain props, which run up to the end of the tag (the > sign).


JavaScript Example

let jsx = `<Component><Box><Inline><Text>Hello</Text></Inline></Box></Component>`;

console.log(jsx.match(/<[A-Z]\w*\b.*?>|(?<=>)(\w+)(?=<)/g))  // ["<Component>", "<Box>", "<Inline>", "<Text>", "Hello"]
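Since the goal in the question is to wrap each match in markup, the following sketch shows one way that could look, using only the first alternative of the regex above with `String.prototype.replace`. The `<code>` wrapper, the escaping step, and the names `COMPONENT_TAG`, `escapeHtml` and `wrapTags` are assumptions for illustration, not part of the original answer.

```javascript
// First alternative of the regex above: a tag whose name starts with a capital.
const COMPONENT_TAG = /<[A-Z]\w*\b.*?>/g;

// Escape the characters that would otherwise be read as markup.
function escapeHtml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Wrap every matched tag in <code>...</code> (hypothetical wrapper markup).
function wrapTags(text) {
  return text.replace(COMPONENT_TAG, (match) => `<code>${escapeHtml(match)}</code>`);
}

console.log(wrapTags("see <HelloThere /> now"));
// see <code>&lt;HelloThere /&gt;</code> now
```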
Artyom Vancyan

Thanks for all the effort and discussion! Really appreciate it!

After discussing this with a colleague, we settled on a compromise: we will keep a list of the components that could possibly be used in the text. With that compromise, things become a whole lot simpler, and we don't have to use regex at all (provided we do this client side).

Say we have this data:

const string = `Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer ut nulla et ligula consequat volutpat. In nec mauris nec dolor dapibus mollis in ut nibh.
    <AComponent prop="value">Hello<p> there</p></AComponent>
    Nam vulputate, sem vitae sollicitudin hendrerit, nibh mauris semper odio, ut gravida leo lacus sodales urna. Mauris nisl augue, elementum non ultricies et, semper quis diam. Integer vel fermentum ante.`

Then I can create HTML from that string on the fly:

const htmlObject = document.createElement('div');
htmlObject.innerHTML = string;

Then I simply use vanilla JS to look up the tags and see whether any are present (checking just one component here to keep it simple):

const find = htmlObject.getElementsByTagName("AComponent")

console.log('find', find) 
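Extending this to the whole list of possible components could look roughly like the sketch below. The list contents and the names `KNOWN_COMPONENTS` and `findKnownComponents` are hypothetical; the only DOM call used is `getElementsByTagName`, the same one as above.

```javascript
// Hypothetical allow-list of component names that may appear in the text.
const KNOWN_COMPONENTS = ["AComponent", "Box", "Text"];

// `container` is any object with getElementsByTagName, e.g. the div above.
function findKnownComponents(container, names = KNOWN_COMPONENTS) {
  const found = [];
  for (const name of names) {
    // The returned collection is live, so copy it before iterating.
    for (const element of Array.from(container.getElementsByTagName(name))) {
      found.push({ name, element });
    }
  }
  return found;
}

// Browser usage (not run here):
// const htmlObject = document.createElement("div");
// htmlObject.innerHTML = string;
// findKnownComponents(htmlObject).forEach(({ name, element }) =>
//   console.log(name, element.outerHTML));
```

Two things to keep in mind with this approach: the HTML parser lowercases tag names, so `outerHTML` will show `<acomponent ...>` rather than the original casing, and it does not treat `<HelloThere />` as self-closing for unknown elements, so the text that follows such a tag ends up as its child.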

jansmolders86