I have an NSArray of NSStrings. Printing the array with NSLog gives the output shown after the code. Here is the code I have implemented:

NSMetadataQuery *query = [[NSMetadataQuery alloc] init];
.....
NSArray *queryResults = [[query results] copy];

for (NSMetadataItem *item in queryResults)
{
    id value = [item valueForAttribute: kMDItemAlbum];
    // Skip items without an album tag: adding nil to an array raises an exception.
    if (value != nil)
        [databaseArray addObject: value];
}

"The Chronicles Of Narnia: Prince Caspian",
"Taste the First Love",
"Once (Original Soundtrack)",
"430 West Presents Detroit Calling",
"O\U0308\U00d0\U00b9u\U0301\U00b0\U00aeA\U0300O\U0308A\U0300O\U0308I\U0301A\U030a-O\U0301a\U0300A\U0302\U00a1",
"\U7ea2\U96e8\U6d41\U884c\U7f51",
"I\U0300\U00ab\U00bc\U00abO\U0303A\U030aE\U0300y\U0301\U00b7a\U0301",
"A\U0303n\U0303\U00b8e\U0300\U00b2I\U0300C\U0327U\U0300",
"\U00bb\U00b3A\U0308i\U0302O\U0303\U00bdO\U0301N\U0303",
"American IV (The Man Comes Aro",
"All That We Needed",

Now how can I change the human-unreadable strings to human-readable strings? Thanks.

Li Fumin
  • How do you obtain these strings? What are their original byte representations and how do you convert them to `NSString` objects? –  Sep 16 '11 at 13:51
  • I have posted the code in the question. – Li Fumin Sep 16 '11 at 14:23

3 Answers

Looking past the escaping done by description (e.g., \U0308), the strings are wrong (e.g., “Öйú°®ÀÖÀÖÍÅ-Óà¡”) because the data you got was wrong.

That's probably not Spotlight's fault. (You could verify that by trying a different ID3-tag library.) Most probably, the files themselves contain poorly-encoded tags.

To fix this:

  1. Encode it in the 8-bit encoding that matches the characters. You can't just pick an encoding (like “ASCII”, which Cocoa mapped to ISO Latin 1 the last time I checked) at random; you need to use the encoding that contains all of the characters in the input and encodes them correctly for what you're going to do next. Try ISO Latin 1, ISO Latin 9, Windows codepage 1252, and MacRoman, in that order.
  2. Decode the encoded data as UTF-8. If this fails, go back to step 1 and try a different encoding.

If step 2 succeeds on any attempt, that is your valid data (unless you're very unlucky). If it fails on all attempts, the data is unrecoverable and you may want to warn the user that their input files contain bogus tags.
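
The two steps above can be sketched as a loop over candidate encodings. This is a minimal sketch, not a definitive implementation: the helper name `RecoverMisdecodedString` is hypothetical, it assumes ARC, and the precomposition call is an added precaution (the logged strings are in decomposed form, e.g. `O\U0308`) rather than part of the steps above.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (assumes ARC): returns the repaired string,
// or nil if none of the candidate encodings yields valid UTF-8.
static NSString *RecoverMisdecodedString(NSString *garbled)
{
    // Candidate 8-bit encodings, in the order suggested above.
    NSStringEncoding candidates[] = {
        NSISOLatin1StringEncoding,
        CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingISOLatin9),
        NSWindowsCP1252StringEncoding,
        NSMacOSRomanStringEncoding,
    };
    size_t count = sizeof(candidates) / sizeof(candidates[0]);

    // Added precaution: turn "O" + combining diaeresis into a single
    // character so that it maps to one byte in the 8-bit encoding.
    NSString *composed = [garbled precomposedStringWithCanonicalMapping];

    for (size_t i = 0; i < count; i++) {
        // Step 1: re-encode; a nil result means this encoding cannot
        // represent every character without loss.
        NSData *bytes = [composed dataUsingEncoding:candidates[i]
                               allowLossyConversion:NO];
        if (bytes == nil)
            continue;

        // Step 2: decode the bytes as UTF-8; nil means they are not valid UTF-8.
        NSString *decoded = [[NSString alloc] initWithData:bytes
                                                  encoding:NSUTF8StringEncoding];
        if (decoded != nil)
            return decoded;
    }
    return nil;
}
```

A `nil` result means no candidate worked, so the caller would keep the original string. That includes strings such as `\U7ea2\U96e8\U6d41\U884c\U7f51`, which appears to be intact Chinese text and has no representation in any of these 8-bit encodings. Plain ASCII strings pass through unchanged.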

Peter Hosey
  • About step 1, I still can't figure out how to code this. Could you provide some code snippets? Thanks a lot. – Li Fumin Sep 17 '11 at 00:11
  • @Li Fumin: http://developer.apple.com/library/mac/documentation/Cocoa/Reference/Foundation/Classes/NSString_Class/Reference/NSString.html#//apple_ref/occ/instm/NSString/dataUsingEncoding: – Peter Hosey Sep 17 '11 at 00:56
  • I have spent hours trying most of the encodings listed in 'CFStringEncodingExt.h', but still can't find the proper encoding to recover the string. `NSData *data = [item dataUsingEncoding: CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingISOLatin2)]; NSString *decodeString = [[[NSString alloc] initWithData: data encoding: NSUTF8StringEncoding] autorelease]; NSLog(@"%@ ==> %@",item, decodeString);` – Li Fumin Sep 17 '11 at 05:57
  • @Li Fumin: Don't forget to try the built-in `NSStringEncoding` values first. If none of those works, the string has probably been corrupted more than once, and finding the correct combination of encodings to recover the original would be next to impossible. – Peter Hosey Sep 17 '11 at 06:37
  • Just found something useful. It may be a bit outdated, but it is still helpful: [How to detect string encoding](http://www.macosxguru.net/article.php?story=20030808081801868) – Li Fumin Sep 17 '11 at 07:35

These strings are UTF-8 encoded. You can decode them with:

NSString *myDecoded = [NSString stringWithUTF8String:myEscapedString];

So to process your complete array 'completeArray' you can convert to a const char* first and then back into NSString:

NSMutableArray *processed = [NSMutableArray arrayWithCapacity:completeArray.count];
for (NSString* s in completeArray) {
    [processed addObject:[NSString stringWithUTF8String:[s cStringUsingEncoding:ASCIIEncoding]]];
}
cellcortex
  • It is not working. 'stringWithUTF8String:' should take a (const char*) argument, not an NSString, right? I think this may involve string encoding detection. – Li Fumin Sep 16 '11 at 11:52
  • “`ASCIIEncoding`” doesn't exist, and trying to encode UTF-8 as “ASCII” is likely not to work; you need to use an encoding that contains all of the characters in the string (specifically, the encoding it was originally encoded with when the input file was written). It's also more efficient to encode to and decode from NSData, rather than creating and then reading a C string. – Peter Hosey Sep 16 '11 at 19:39

Parsing these kinds of strings isn't particularly easy. See this SO post for background; it has links to other SO posts with specific ways of handling this problem.

Tim Dean