
I just started learning about regexes and can't figure out how to extract Gizmo from this HTML tag:

<meta content="Gizmo" property="og:title" />

I'm stuck at the (?<Name>meta content=), which is basically nothing, but I don't know what to do from there.

exlo

1 Answer


It's well known that you shouldn't use regex to parse HTML (it's been said a million times); you should use an HTML parser instead.
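For comparison, here is a minimal sketch of the parser route in Python, using the standard library's `html.parser`. The class name `MetaContentParser` and its `wanted_property` argument are made up for illustration and are not part of the question.

```python
from html.parser import HTMLParser


class MetaContentParser(HTMLParser):
    """Collects the content attribute of <meta> tags with a given property.

    Hypothetical helper for illustration only.
    """

    def __init__(self, wanted_property):
        super().__init__()
        self.wanted_property = wanted_property
        self.values = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here, because the
        # default handle_startendtag() delegates to handle_starttag().
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if attr_map.get("property") == self.wanted_property:
            self.values.append(attr_map.get("content"))


html = '<meta content="Gizmo" property="og:title" />'
parser = MetaContentParser("og:title")
parser.feed(html)
print(parser.values)  # ['Gizmo']
```

Because the attributes arrive as a list of name/value pairs, this does not depend on the order in which `content` and `property` appear in the tag.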

That said, if you want to use a regex for this, you are pretty close. You need:

(?<Name>meta content=".*?")

By the way, if you want to grab just the word Gizmo, you also need a capturing group within your Name group:

(?<Name>meta content="(.*?)")

Working demo
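To make the group numbering concrete, here is a small sketch in Python. Note that Python spells named groups `(?P<Name>...)` rather than `(?<Name>...)`, so the pattern is adapted accordingly, and group numbering can differ in other regex engines.

```python
import re

html = '<meta content="Gizmo" property="og:title" />'

# Same idea as the nested pattern, written with Python's (?P<Name>...) syntax.
pattern = re.compile(r'(?P<Name>meta content="(.*?)")')
match = pattern.search(html)

print(match.group(0))       # meta content="Gizmo"  (the whole match)
print(match.group("Name"))  # meta content="Gizmo"  (the named group, number 1 here)
print(match.group(2))       # Gizmo                 (the inner, unnamed group)
```

The whole match always includes the literal `meta content="` and the closing quote; only the inner group isolates the value.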

On the other hand, if you don't care about capturing meta content and only want the value inside content, you can use:

content="(?<Name>.*?)"

Working demo 2
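A quick way to check this pattern is the Python sketch below, again using Python's `(?P<Name>...)` spelling for the named group. The literal text `content="` and the closing quote are still part of the overall match, but the Name group holds only the value.

```python
import re

html = '<meta content="Gizmo" property="og:title" />'

# The named group wraps only what sits between the quotes.
match = re.search(r'content="(?P<Name>.*?)"', html)

print(match.group(0))       # content="Gizmo"
print(match.group("Name"))  # Gizmo
```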

Federico Piazza
  • Thanks! But it looks like it also extracts `meta content="` and `"` along with Gizmo? – exlo May 10 '15 at 23:54
  • @exlo look at the demo link, you can find the group index 2 having gizmo – Federico Piazza May 10 '15 at 23:55
  • I'm sorry, I'm new so I'm a little confused. If `Name Gizmo` is what I want, let's say using a simple demo like http://rubular.com/, then the match is `meta content="Colony Cafe"`. Is it possible to skip `meta content="` and `"` altogether? And it's for an assignment, so I have to use regexes, unfortunately – exlo May 10 '15 at 23:59
  • Got it! Thank you. But if I'm looking at a source where there are multiple meta content tags, and the identifier is `"og:title"`, would I make it `content="(?<Name>.*?)" property="og:title"`? I tried it, and it's returning no matches – exlo May 11 '15 at 00:02
  • @exlo Yes, it will work only if `og:title` comes after content, otherwise it won't work. – Federico Piazza May 11 '15 at 00:04
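If the attribute order is not guaranteed, one possible workaround is a lookahead that checks for the `property` attribute anywhere inside the same tag before capturing `content`. The sketch below is in Python (named groups are spelled `(?P<Name>...)` there), and the sample strings other than the original Gizmo tag are invented for illustration. It also uses `[^"]*` instead of `.*?`, since a lazy `.*?` followed by ` property="og:title"` can stretch across several tags when more than one meta tag is present. It assumes double-quoted attributes and no `>` inside attribute values.

```python
import re

# (?=[^>]*\bproperty="og:title") looks ahead within the current tag only,
# so the property attribute may come before or after content.
pattern = re.compile(
    r'<meta\b(?=[^>]*\bproperty="og:title")[^>]*\bcontent="(?P<Name>[^"]*)"'
)

content_first = '<meta content="Gizmo" property="og:title" />'
property_first = '<meta property="og:title" content="Gizmo" />'
# Hypothetical page with more than one meta tag.
several_tags = (
    '<meta content="Some Site" property="og:site_name" />\n'
    '<meta content="Gizmo" property="og:title" />'
)

for sample in (content_first, property_first, several_tags):
    print(pattern.search(sample).group("Name"))  # Gizmo each time
```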