Extract image path from HTML in Objective C

Question

Help!!! I am fairly new to iPhone app development and I am stuck on parsing! I am trying to read the feeds from a URL which ends in .cms. I was able to get the text from the source and remove the HTML using the flattenHTML code, but I am having trouble extracting the path of the image. The image path appears in something like: ....(text+html)...><img src="http://www.... If anybody could please help and suggest how I can extract the image path... :((

Thanks in advance!

score 0 · Answer 1 · answered Jul 29 '10 at 10:24

You may apply a regular expression to your text to extract the path.

The pattern would be something like <img[^>]*\ssrc="(.*?)", where the first capture group holds the path.
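
A minimal sketch of how such a pattern could be applied with NSRegularExpression (available from iOS 4.0). The helper name FirstImageSource is hypothetical, and the pattern is an assumption that is slightly broader than the one above: it accepts single or double quotes and optional whitespace around the equals sign, but still ignores unquoted attributes, so treat it as a quick fix for one predictable feed rather than a general HTML parser.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of any library): returns the src value of the
// first <img> tag found in `html`, or nil when no match is found.
static NSString *FirstImageSource(NSString *html) {
    if (html == nil) {
        return nil;
    }

    // Capture group 1 is the text between the quotes that follow src=.
    // Limitation: unquoted attribute values are not matched.
    NSString *pattern = @"<img\\b[^>]*?\\ssrc\\s*=\\s*[\"']([^\"']*)[\"']";

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return nil;
    }

    NSTextCheckingResult *match =
        [regex firstMatchInString:html
                          options:0
                            range:NSMakeRange(0, html.length)];
    if (match == nil) {
        return nil;
    }

    // Range 0 is the whole match; range 1 is the captured path.
    return [html substringWithRange:[match rangeAtIndex:1]];
}
```

Calling FirstImageSource(description) on the string extracted from the feed should return the image URL, or nil if the description contains no quoted src attribute.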

Guillaume Lebourgeois

No. Don't. No regex for HTML. Use an XML parser. HTML is not a regular language and hence cannot be correctly parsed with regex; if the markup is well-formed (XHTML), an XML parser can handle it. – Jul 29 '10 at 10:27
What about single quotes? What about no quotes? What about whitespace? Newlines? – Jul 29 '10 at 10:29
I totally agree with you, except if he has a "one-shot problem" or is working with only one feed that is always generated the same way. If that's not the case, regexes are not an option. – Guillaume Lebourgeois Jul 29 '10 at 11:52
Well, if I am not wrong, an XML parser works fine as long as it is a .xml file. The URL that I'm working with has a .cms extension, which seems to be a little different from regular XML. When I extract the contents inside the tag pair, I get the description as well as some HTML with the image URL included in it. I am unable to extract it. – bangdel Jul 29 '10 at 13:32
