Requirement: I have the following data to match with a regex. I need to get Name 1, Name 2, Name 3 and Name 4.
Some conditions:
- $regex needs to take into account that Name will always come after <H2>Composition</H2>.
- There could be any number of Name patterns, i.e. after Composition there might be only one pattern (say Name 1) or two patterns (Name 1 and Name 2).
- At least one Name pattern will be present after Composition. So the regex should work like "if Composition is present, then Name 1 will surely be there".
Example:
<H2>Composition</H2>
<A href="/generics/levocetrizine-210129">Name 1</A>,
<A href="/generics/paracetamol-210459">Name 2(500 mg)</A>,
<A href="/generics/phenylephrine-hydrochloride-210494">Name 3</A>,
<A href="/generics/ambroxol-hydrochloride-211798">Name 4</A></DIV></DIV></DIV></DIV>
So far, I have only been able to get the first name, i.e. Name 1, with the following script. It simply ignores the rest: in the case above, Name 2, Name 3 and Name 4 are missing from my output.
[regex]$regex = @'
(?s).+?<H2>Composition</H2>.*?href="/generics/.*?">(.*?)</A>
'@
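
For reference, here is a minimal reproduction of the behaviour described above. The $html variable and the leading <DIV> are assumptions for illustration only (the <DIV> stands in for whatever precedes the heading on the real page); the pattern is the one from my script, unchanged.

```powershell
# Hypothetical sample input: $html and the leading <DIV> are placeholders,
# not taken from the real page.
$html = @'
<DIV><H2>Composition</H2>
<A href="/generics/levocetrizine-210129">Name 1</A>,
<A href="/generics/paracetamol-210459">Name 2(500 mg)</A>,
<A href="/generics/phenylephrine-hydrochloride-210494">Name 3</A>,
<A href="/generics/ambroxol-hydrochloride-211798">Name 4</A></DIV>
'@

# Same pattern as in the script above.
[regex]$regex = @'
(?s).+?<H2>Composition</H2>.*?href="/generics/.*?">(.*?)</A>
'@

# Collect capture group 1 from every match the pattern produces.
$found = $regex.Matches($html)
$names = @($found | ForEach-Object { $_.Groups[1].Value })

# With this sample input only one match comes back, holding "Name 1".
$names
```

As far as I can tell, every match has to begin with its own <H2>Composition</H2>, and the heading occurs only once, so Matches() finds nothing after the first link.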