4

I have an HTML string and would like to halve every font-size value in it.

E.g.

<div style="font-family:'Arial';font-size:43px;color:#ffffff;">

Will be

<div style="font-family:'Arial';font-size:21.5px;color:#ffffff;">

and

<div style="font-size:12px;">

Will be

<div style="font-size:6px;">

How can I do it with NSRegularExpression?

Please note that 12, 6, 43, and 21.5 are only examples. I need a regex because the solution has to work for any font-size value.

Dejell

4 Answers

4

Use a real HTML parser to preserve your sanity. A regular expression for this is incredibly fragile: there are a dozen different, perfectly valid variants of the HTML syntax that will break NSAddict's expression.

I suggest reading the top-voted answer on this question, as it applies equally well to HTML, XHTML, and XML.

RegEx match open tags except XHTML self-contained tags

Note that the iOS / OS X system frameworks include HTML/XML parsing capabilities. Use those.
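As a rough illustration of the parser route, here is a minimal sketch that assumes macOS, where NSXMLDocument is available (it is not part of iOS). The helper name `RewriteStyleAttributes` and its block parameter are hypothetical. It only locates the style attributes; how each style string is rewritten is left to the caller.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name and signature are illustrative only).
// Parses loose HTML with NSXMLDocument (macOS only), passes the value of every
// style attribute through `transform`, and returns the re-serialized document.
static NSString *RewriteStyleAttributes(NSString *html,
                                        NSString *(^transform)(NSString *style)) {
    NSError *error = nil;
    // NSXMLDocumentTidyHTML asks the parser to tidy loose HTML into valid XML first.
    NSXMLDocument *document = [[NSXMLDocument alloc] initWithXMLString:html
                                                               options:NSXMLDocumentTidyHTML
                                                                 error:&error];
    if (document == nil) {
        NSLog(@"Could not parse HTML: %@", error);
        return nil;
    }
    // The XPath expression //@style selects the style attribute of every element.
    NSArray *styleAttributes = [document nodesForXPath:@"//@style" error:&error];
    for (NSXMLNode *attribute in styleAttributes) {
        attribute.stringValue = transform(attribute.stringValue);
    }
    return [document XMLString];
}
```

Because the input is tidied, the returned string is typically a complete document rather than the original fragment, so this fits best when the whole page is being processed anyway.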

bbum
  • You're right, an HTML parser would definitely be the better way to go. I just tried answering the question with `NSRegularExpression`, which is what she asked for. – IluTov Dec 24 '12 at 18:23
  • When at sea, it is best to teach a person how to fish in response to the question, "Where's the beef?" ;) – bbum Dec 24 '12 at 18:28
  • Can you give an example of an HTML parser that I can use? I tried to look but couldn't find one. – Dejell Dec 24 '12 at 20:24
  • NSXMLDocument can be configured to handle the looseness of HTML. Lower level, libxml2 can do the same. Search developer.apple.com for the former as there are some good, if odd (an HTML store for CoreData?!?), examples. – bbum Dec 24 '12 at 20:27
  • I read the discussion. But I still think that for my case parsing it with regex is quicker. – Dejell Dec 25 '12 at 08:33
3

You can do this with NSString itself; it's actually pretty easy.

[string stringByReplacingOccurrencesOfString:@"font-size:12px;" withString:@"font-size:6px;"];

Or, to set every existing pixel size to a given value, copy this method:

// Replaces every "font-size:<integer>px;" in htmlString with the given size.
- (NSString *)setFontSize:(int)fontSize inHTMLString:(NSString *)htmlString {
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"font-size:[0-9]+px;" options:NSRegularExpressionCaseInsensitive error:nil];
    // This options parameter takes NSMatchingOptions, not NSRegularExpressionOptions, so pass 0.
    NSString *newString = [regex stringByReplacingMatchesInString:htmlString options:0 range:NSMakeRange(0, htmlString.length) withTemplate:[NSString stringWithFormat:@"font-size:%dpx;", fontSize]];

    return newString;
}
IluTov
  • I want it to be a general solution: X/2, and not only 12 and 6. – Dejell Dec 24 '12 at 16:08
  • @Odelya Just seen it, I'm on it :) – IluTov Dec 24 '12 at 16:10
  • @Odelya Do you happen to know the regex to get it? – IluTov Dec 24 '12 at 16:16
  • No. This was part of my question – Dejell Dec 24 '12 at 16:20
  • The font size should be dynamic. I don't know what the font size is; the result should be half of whatever size is found. – Dejell Dec 24 '12 at 16:28
  • That'll also "helpfully" muck with "the syntax is 'font-size:12px'" and about a zillion other perfectly valid variants on the syntax will trip that regex either with a false positive or false negative. Use an XML/XHTML/HTML parser if you want a solution that is robust and maintainable. – bbum Dec 24 '12 at 18:20
3

I am a bit reluctant to give an answer using regular expressions, because it has been stated repeatedly that parsing HTML with regex is considered harmful, impossible, dangerous to your mind, etc. All of that is correct, and it is not my intention to claim anything different.

But even after all those warnings, the OP has explicitly asked for a regex solution, so I am going to share this code. It can at least be useful as an example of how to modify a string by looping over all matches of a regular expression.

NSString *htmlString =
    @"<div style=\"font-family:'Arial';font-size:43px;color:#ffffff;\">\n"
    @"<div style=\"font-size:12px;\">\n";

NSRegularExpression *regex;
regex = [NSRegularExpression regularExpressionWithPattern:@"font-size:([0-9]+)px;"
                                                  options:0
                                                    error:NULL];

NSMutableString *modifiedHtmlString = [htmlString mutableCopy];
// Signed, because a replacement can be shorter than the text it replaces:
__block NSInteger offset = 0;
[regex enumerateMatchesInString:htmlString
                        options:0
                          range:NSMakeRange(0, [htmlString length])
                     usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
                         // range = location of the regex capture group "([0-9]+)" in htmlString:
                         NSRange range = [result rangeAtIndex:1];
                         // Adjust location for modifiedHtmlString:
                         range.location += offset;
                         // Get old point size:
                         NSString *oldPointSize = [modifiedHtmlString substringWithRange:range];
                         // Compute new point size:
                         NSString *newPointSize = [NSString stringWithFormat:@"%.1f", [oldPointSize floatValue]/2];
                         // Replace point size in modifiedHtmlString:
                         [modifiedHtmlString replaceCharactersInRange:range withString:newPointSize];
                         // Update offset (cast first so the difference is computed as a signed value):
                         offset += (NSInteger)[newPointSize length] - (NSInteger)[oldPointSize length];
                     }];

NSLog(@"%@", modifiedHtmlString);

Output:

<div style="font-family:'Arial';font-size:21.5px;color:#ffffff;">
<div style="font-size:6.0px;">
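A variant that avoids the offset bookkeeping is to collect the matches first and replace them from last to first. The following is only a sketch under a few assumptions: the helper name `HalveFontSizes` is hypothetical, the pattern is extended to accept decimal sizes, sizes are always given in px, and `%g` is used so that whole numbers print without a trailing ".0".

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: halves every "font-size:<number>px" value in the string.
static NSString *HalveFontSizes(NSString *html) {
    // Assumption: sizes are integers or decimals and are always given in px.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"font-size:([0-9]+(?:\\.[0-9]+)?)px"
                                                  options:0
                                                    error:NULL];
    NSArray *matches = [regex matchesInString:html
                                      options:0
                                        range:NSMakeRange(0, html.length)];

    NSMutableString *result = [html mutableCopy];
    // Walk the matches from last to first: replacing a later range never
    // shifts the location of an earlier one, so no offset is needed.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSRange range = [match rangeAtIndex:1];
        double half = [[html substringWithRange:range] doubleValue] / 2.0;
        // %g drops trailing zeros, so 6 is written as "6" and 21.5 as "21.5".
        [result replaceCharactersInRange:range
                              withString:[NSString stringWithFormat:@"%g", half]];
    }
    return result;
}
```

Note that `%g` switches to exponent notation for very large values, which should not matter for realistic font sizes.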
Martin R
  • It doesn't look good to me. I think there is a way to use the first captured group, $1, instead of a range. I could accept this answer, but I only recently found out about $1 in the NSRegularExpression documentation, and since this is a public place it would be better if you edited your answer first. – Dejell Jan 03 '13 at 10:15
  • @Odelya: I am willing to improve my answer, but I do not see how `$1` can be used here. `$0`, `$1` are used by `stringByReplacingMatchesInString`, which is described in the documentation as "simple method for performing find-and-replace operations on a string". You can use them in the `withTemplate` argument of this method, but I do not see how to perform arithmetic with them. With `enumerateMatchesInString` you can perform arbitrary operations on all matches, but that function uses `NSTextCheckingResult`. `[result rangeAtIndex:1]` corresponds to `$1` (the result of the first capture group). – Martin R Jan 03 '13 at 10:33
  • @Odelya: I have read the NSRegularExpression documentation once more. Most methods return ranges or NSTextCheckingResult objects. Only the "find-and-replace" methods `stringByReplacingMatchesInString` and `replaceMatchesInString` work with templates, using `$0` etc. - I would like to give an answer that satisfies you, but I do not see how to solve the task using `$1` instead of ranges. – Martin R Jan 03 '13 at 18:51
1

I would use DTCoreText for that. It parses this HTML for you and constructs an attributed string. Then you can adjust the font to your liking. Finally you can either draw the attributed string with DTCoreText, or convert it back to HTML.

If you insist on HTML, then I can offer DTHTMLParser which is a SAX-based HTML parser based on libxml2. This can parse any HTML. Though you still would have to split apart the CSS which is not as straightforward as you might think, even with RegEx. I have a category on NSString which splits the parameters so that you can reconstitute the style with modified values.

Having said that, you are probably best served by my first recommendation.
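To give an idea of what splitting the CSS involves, here is a deliberately naive sketch. It is not the NSString category mentioned above; the helper name `HalveFontSizeInStyle` is hypothetical, and it assumes that no value contains a `;` and that sizes use a lowercase `px` unit.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: takes the contents of a style attribute, halves the
// font-size declaration if it is given in px, and rebuilds the style string.
static NSString *HalveFontSizeInStyle(NSString *style) {
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSMutableArray *declarations = [NSMutableArray array];

    // Naive split: breaks if a value itself contains ';' (e.g. inside url(...)).
    for (NSString *declaration in [style componentsSeparatedByString:@";"]) {
        NSRange colon = [declaration rangeOfString:@":"];
        if (colon.location == NSNotFound) {
            continue; // skips the empty piece after the final ';'
        }
        NSString *name = [[declaration substringToIndex:colon.location]
                             stringByTrimmingCharactersInSet:whitespace];
        NSString *value = [[declaration substringFromIndex:NSMaxRange(colon)]
                              stringByTrimmingCharactersInSet:whitespace];

        if ([name caseInsensitiveCompare:@"font-size"] == NSOrderedSame &&
            [value hasSuffix:@"px"]) {
            double size = [[value substringToIndex:value.length - 2] doubleValue];
            value = [NSString stringWithFormat:@"%gpx", size / 2.0];
        }
        [declarations addObject:[NSString stringWithFormat:@"%@:%@", name, value]];
    }
    if (declarations.count == 0) {
        return @"";
    }
    return [[declarations componentsJoinedByString:@";"] stringByAppendingString:@";"];
}
```

Other units (em, pt, %) and keywords such as `small` are passed through unchanged, which is one reason this is less straightforward than it looks.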

Cocoanetics
  • Where can I download DTHTMLParser? – Dejell Jan 07 '13 at 07:57
  • Also, can I create a new HTML record with the parser, or is it just for reading? – Dejell Jan 07 '13 at 08:05
  • You can get it as part of DTFoundation: https://github.com/cocoanetics/dtfoundation – Cocoanetics Jan 07 '13 at 13:46
  • It's only for reading, but you can easily create a new HTML file in the delegate methods that include your modifications. – Cocoanetics Jan 07 '13 at 13:46
  • How would I create the HTML file? Which tool do I need to use? – Dejell Jan 07 '13 at 15:40
  • It depends on what features you need. If you want to have it look like the original then you build it yourself in the DTHTMLParser delegate callbacks. – Cocoanetics Jan 08 '13 at 06:24
  • Build yourself - with NSMutableString? – Dejell Jan 08 '13 at 09:05
  • Exactly. When you get an opening tag you add a tag opening (and attributes) to the mutable string. When you get characters you append these. When you get a closing tag you add that, and so on and so forth. – Cocoanetics Jan 08 '13 at 15:08
  • I disagree with this method. I am originally a Java programmer. It doesn't look like a neat approach to me, and I don't see any reason to use this solution in this case rather than NSXMLParser. – Dejell Jan 08 '13 at 16:27