>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(2009&nbsp;RX7)</font></td>
>monospace" size="-1">214869&nbsp;(2007&nbsp;PAZ)</font></td>
>monospace" size="-1">&nbsp;&nbsp;4155&nbsp;Accord</font></td>

I wonder if someone could offer me a little help. I have a list of NSString items (see above) that I want to parse some data from. My problem is that there are no tags I can use within the strings, nor do the items I want have fixed positions. The data I want to extract is:

2009 RX7
2007 PAZ
4155 Accord

My thinking is that it's going to be easier to parse from the right-hand end: remove the </font></td> and then use ";" to separate the data items:

(2009&nbsp RX7)
(2007&nbsp PAZ)
4155&nbsp Accord

which can then be cleaned up to match the example given. Any pointers on doing this, or on working through from the right, would be very much appreciated.

fuzzygoat

4 Answers

Try this code:

NSString *str = @">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(2009&nbsp;RX7)</font></td>";
// rangeOfString: is case-sensitive by default, so the tag must be lowercase to match
NSRange fontRange = [str rangeOfString:@"</font>" options:NSBackwardsSearch];
NSRange lastSemi = [str rangeOfString:@";" options:NSBackwardsSearch range:NSMakeRange(0, fontRange.location)];
NSRange priorSemi = [str rangeOfString:@";" options:NSBackwardsSearch range:NSMakeRange(0, lastSemi.location)];
// NSMakeRange takes (location, length), so the length is the distance from start to the tag
NSUInteger start = priorSemi.location + 1;
NSString *yourString = [str substringWithRange:NSMakeRange(start, fontRange.location - start)];
// yourString is now "(2009&nbsp;RX7)"

The key element here is the NSBackwardsSearch search option.
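As a minimal sketch of the same right-to-left idea with the failure cases handled, the helper below checks each search for NSNotFound before doing any arithmetic on it, then tidies the result into the form asked for in the question. The name ItemFromRight and the choice to return nil on failure are assumptions for illustration, not part of the original answer.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the name and the nil-on-failure behavior are assumptions.
static NSString *ItemFromRight(NSString *str) {
    // Find the last closing font tag, searching from the end of the string.
    NSRange font = [str rangeOfString:@"</font>" options:NSBackwardsSearch];
    if (font.location == NSNotFound) return nil;

    // Last ';' before the closing tag, then the ';' before that one.
    NSRange last = [str rangeOfString:@";"
                              options:NSBackwardsSearch
                                range:NSMakeRange(0, font.location)];
    if (last.location == NSNotFound) return nil;
    NSRange prior = [str rangeOfString:@";"
                               options:NSBackwardsSearch
                                 range:NSMakeRange(0, last.location)];
    if (prior.location == NSNotFound) return nil;

    // Everything between the prior ';' and the closing tag.
    NSUInteger start = prior.location + 1;
    NSString *raw = [str substringWithRange:NSMakeRange(start, font.location - start)];

    // Turn the remaining entity into a space and trim parentheses and spaces.
    raw = [raw stringByReplacingOccurrencesOfString:@"&nbsp;" withString:@" "];
    NSCharacterSet *junk = [NSCharacterSet characterSetWithCharactersInString:@"() "];
    return [raw stringByTrimmingCharactersInSet:junk];
}
```

Calling ItemFromRight on each string in the list should give the cleaned item, or nil for a line that does not contain the closing tag and two semicolons before it.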

Sergey Kalinichenko

Personally I think you are better off with a regex. So my solution would be:

Regex of: ([0-9]+)[^;]+;([A-Za-z0-9]+)

For each of the example strings this yields three results: the full match plus two capture groups. For example, for:

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(2009&nbsp;RX7)</font></td>

0: 2009&nbsp;RX7)<

1: 2009

2: RX7

I haven't coded this up, but did test the Regex at www.regextester.com

Regexes are implemented via NSRegularExpression, which is available in iOS 4.0 and later.

Edit

Given that this appears to be a web scraping application, you never know when those pesky HTML code monkeys will change their output and break your carefully crafted matching methodology. As such I would change my regex to:

([0-9]+)([^;]+;)+([A-Za-z0-9]+)

Which adds an extra group, but allows for any number of &nbsp; elements between the number and the string.
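Since the expression was only tried in an online tester, here is a minimal sketch of how the first pattern might be wired up with NSRegularExpression, with each part of the expression explained in a comment. The helper name ExtractItem and its nil-on-failure behavior are assumptions for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the answer): returns "<number> <name>",
// or nil when the pattern does not match.
static NSString *ExtractItem(NSString *line) {
    // ([0-9]+)        group 1: one or more digits, e.g. 2009
    // [^;]+;          any run of non-';' characters, then the ';' itself
    //                 (this consumes the &nbsp; entity after the digits)
    // ([A-Za-z0-9]+)  group 2: the letters/digits right after that ';', e.g. RX7
    NSString *pattern = @"([0-9]+)[^;]+;([A-Za-z0-9]+)";

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) return nil;

    NSTextCheckingResult *match =
        [regex firstMatchInString:line
                          options:0
                            range:NSMakeRange(0, line.length)];
    if (match == nil) return nil;

    // rangeAtIndex:0 is the whole match; 1 and 2 are the capture groups.
    NSString *number = [line substringWithRange:[match rangeAtIndex:1]];
    NSString *name = [line substringWithRange:[match rangeAtIndex:2]];
    return [NSString stringWithFormat:@"%@ %@", number, name];
}
```

One caveat, as far as it can be traced by hand: on the two sample strings that contain size="-1", the broader pattern ([0-9]+)([^;]+;)+([A-Za-z0-9]+) can begin its match at that 1, so group 1 would come back as 1 rather than 2007 or 4155. The narrower pattern avoids this on these samples, because a ';' followed by '(' or '&' cannot complete a match.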

Peter M
  • Thank you Peter, that's a great help. Is there any chance you can break down what the bits of the expression do? I have not used regular expressions before and want to learn them now; a simple "this bit does this" would greatly help. – fuzzygoat Jan 26 '12 at 21:15

This should do the trick:

NSString *s = @">monospace\" size=\"-1\">&nbsp;&nbsp;4155&nbsp;Accord</font></td>";
NSArray *strArray = [s componentsSeparatedByString:@";"];
// you're interested in last two objects
NSArray *tmp = [strArray subarrayWithRange:NSMakeRange(strArray.count - 2, 2)];

In tmp you'll have something like:

"4155&nbsp",
"Accord</font></td>"

Strip the unneeded characters and you're all set.
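The stripping step could look something like the sketch below, which assumes each piece is one of the two strings left in tmp. CleanPiece is a hypothetical name, and the set of characters to trim is an assumption based on the three sample strings.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: CleanPiece is an illustrative name, not from the answer.
static NSString *CleanPiece(NSString *piece) {
    // Drop everything from the first '<' onward (the trailing </font></td>).
    NSRange tag = [piece rangeOfString:@"<"];
    if (tag.location != NSNotFound) {
        piece = [piece substringToIndex:tag.location];
    }
    // Remove the entity text left behind by splitting on ';'.
    piece = [piece stringByReplacingOccurrencesOfString:@"&nbsp" withString:@""];
    // Trim stray parentheses and spaces from both ends.
    NSCharacterSet *junk = [NSCharacterSet characterSetWithCharactersInString:@"() "];
    return [piece stringByTrimmingCharactersInSet:junk];
}
```

Applying CleanPiece to both elements of tmp and joining the results with a space should give the target form, for example 4155 Accord. Note that the subarrayWithRange: call assumes the split produced at least two pieces, so a line without any ";" would need a count check first.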

ksh

Using NSRegularExpression:

NSRegularExpression *regex;
NSTextCheckingResult *match;

NSString *pattern = @"([0-9]+)&nbsp;([A-Za-z0-9]+)[)]?</font></td>";
NSString *string = @">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(2009&nbsp;RX7)</font></td>";

regex = [NSRegularExpression
         regularExpressionWithPattern:pattern
         options:NSRegularExpressionCaseInsensitive
         error:nil];


match = [regex firstMatchInString:string options:0 range:NSMakeRange(0, [string length])];
NSLog(@"'%@'", [string substringWithRange:[match rangeAtIndex:1]]);
NSLog(@"'%@'", [string substringWithRange:[match rangeAtIndex:2]]);

NSLog output:

'2009'
'RX7'

zaph