In my application, I'm receiving a CSV file that contains 30,000 objects and for each object there are always 24 values (a total of 720,000 values).

Format is something like this:

object1,value1,value2,...,value24
object2,value1,value2,...,value24
...
objectn,value1,value2,...,value24

When I parse this file, I convert each row into an NSArray of NSString objects. Then I do the following for each value in the array:

  1. convert from NSString to float using - (float)floatValue
  2. convert the float to an NSNumber
  3. store the NSNumber in an NSMutableArray

This process takes several seconds. According to the Instruments Time Profiler, I'm spending 3.5 s in steps 2 and 3 for the 720,000 values.

How can I avoid the NSNumber conversion? Can I use a C-style array, something like []? Or a CFMutableArrayRef? If it helps, I know there are always 24 values for each object.

Thanks for the help,

Sébastien.

  • Try the answer at http://stackoverflow.com/questions/1448804/how-to-convert-an-nsstring-into-an-nsnumber for direct conversion of NSString to NSNumber using NSNumberFormatter, and please share the profiling results with us. PS: you only need ONE instance of NSNumberFormatter ;) – Mirco Ellmann May 04 '13 at 21:16

2 Answers

1

Depending on how you plan to use these values later, there are several options.

  1. Store the entire float array as a single NSValue. Pros: construction is 24x faster. Cons: you must extract all items to access any one of them.
  2. Keep the values as strings. Pros: no time is spent converting up front. Cons: frequent accesses will cost time, because each one has to convert again.
  3. Design a class that holds a single record: one NSString and 24 float properties. Pros: a single record rules everything. Cons: a single record rules everything.

Update: If manually naming 24 fields value1 .. value24 in option 3 seems inconvenient, feel free to declare a public array in the interface section of your class. This combines a native record object with a C-style array. You may also add -[valueAtIndex:] and -[setValue:atIndex:] methods to that class and make the underlying array private.
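A minimal sketch of option 3 in plain C (which also compiles inside an Objective-C file), assuming a fixed-size name buffer. The names Record, record_init, record_value_at and record_set_value_at are hypothetical and only stand in for the valueAtIndex / setValue:atIndex accessors described above; the 32-byte name limit is an assumption, not something the question specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VALUES_PER_RECORD 24   /* the question states 24 values per object */

/* Hypothetical record type: one name plus a fixed-size float array. */
typedef struct {
    char  name[32];                     /* assumed maximum name length */
    float values[VALUES_PER_RECORD];
} Record;

/* Zero the record, then copy the name (truncated if it is too long). */
static void record_init(Record *r, const char *name) {
    memset(r, 0, sizeof *r);
    strncpy(r->name, name, sizeof r->name - 1);  /* memset left the terminator */
}

/* Bounds-checked read, mirroring a valueAtIndex accessor. */
static float record_value_at(const Record *r, size_t index) {
    assert(index < VALUES_PER_RECORD);
    return r->values[index];
}

/* Bounds-checked write, mirroring a setValue:atIndex accessor. */
static void record_set_value_at(Record *r, size_t index, float value) {
    assert(index < VALUES_PER_RECORD);
    r->values[index] = value;
}
```

Going through the accessors keeps the bounds checks in one place; code that needs raw speed in a tight loop could still read the values array directly.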

0

Personally I'd just use a C-style array. If you want to process the data row by row, you could have an object representing each row, something like this:

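@interface Row : NSObject {
@public
  // Instance variables are protected by default; @public lets callers
  // write row->values[i] directly.
  float values[24];
}
@end

@implementation Row
@end

Then you create a Row instance for each row, set the 24 values directly, and add the instance to an NSMutableArray. Note that an instance variable is accessed with ->, not with dot syntax, which only works for properties.

// Manual reference counting; under ARC, drop the autorelease call.
Row *row = [[[Row alloc] init] autorelease];
// here's where you read in the data for the row and save the 24 values
row->values[0] = ...
...
row->values[23] = ...
// and here you add the Row instance to an NSMutableArray
[rows addObject:row];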

Otherwise, if you know up front you're going to be expecting 30,000 rows then you could preallocate a 30,000 x 24 array of floats.

// calloc and free are declared in <stdlib.h>
float *rows = calloc(30000*24, sizeof(float));
for (int i = 0; i < 30000; i++) {
  // pointer to the first of the 24 values belonging to row i
  float *values = &rows[24*i];
  // here's where you read in the data for row i and save the 24 values
  values[0] = ...
  ...
  values[23] = ...
}

Just don't forget you'll need to free the memory from that calloc when you're done with it.
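As a rough end-to-end sketch of the flat-array approach, the following C code parses whole lines straight into the preallocated block with strtof, skipping the NSString and NSNumber steps entirely. It assumes each line is already available as a NUL-terminated string in the format shown in the question, and that values use '.' as the decimal separator (strtof is locale-dependent). The function names parse_row and parse_rows are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define VALUES_PER_ROW 24   /* the question states 24 values per object */

/* Parse "name,v1,...,v24" into out[0..23].
 * Returns 0 on success, -1 if the line is malformed. */
static int parse_row(const char *line, float *out) {
    assert(line != NULL && out != NULL);
    const char *p = strchr(line, ',');        /* skip the object name */
    if (p == NULL) return -1;
    for (int i = 0; i < VALUES_PER_ROW; i++) {
        char *end;
        p++;                                  /* step past the comma */
        out[i] = strtof(p, &end);
        if (end == p) return -1;              /* no number at this position */
        p = end;
        if (i < VALUES_PER_ROW - 1 && *p != ',') return -1;  /* too few values */
    }
    return 0;
}

/* Parse row_count lines into one newly allocated row_count x 24 block.
 * Returns NULL on allocation failure or on a malformed line.
 * The caller owns the result and must free() it. */
static float *parse_rows(const char *const *lines, size_t row_count) {
    float *rows = calloc(row_count * VALUES_PER_ROW, sizeof *rows);
    if (rows == NULL) return NULL;
    for (size_t i = 0; i < row_count; i++) {
        if (parse_row(lines[i], &rows[VALUES_PER_ROW * i]) != 0) {
            free(rows);
            return NULL;
        }
    }
    return rows;
}
```

Value j of row i then lives at rows[24*i + j], and a single free(rows) releases everything once the data is no longer needed.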

James Holderness
  • It's what I was looking for. I will test it today and will let you know the result. Thanks! – sebastien May 05 '13 at 08:36
  • I have updated part of my code to use a C-style array as proposed, and the results are quite good. In my test I have 38580 objects, each with 24 values. According to Time Profiler, splitting the string, parsing the floats, creating the NSNumbers and storing them in an NSArray took 3705ms before the optimization; after it, the same work takes 1728ms. The memory footprint has improved too, but I haven't measured it yet. Thanks! – sebastien May 05 '13 at 17:47
  • @user951368 You may also try keeping them as strings (i.e. no conversion after the split). If you just show the values in some sort of table, and they are not part of an expensive computation that accesses them repeatedly, you can convert them to floats (and format them) only in the cell rendering callback. The runtime penalty will be too small to notice, and you never need to convert the whole data set. –  May 06 '13 at 22:55
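The lazy conversion suggested in the last comment could look roughly like this in C: keep each raw line as text and convert a single value only when it is actually needed, for example when a table cell is drawn. This is a sketch under assumptions; value_from_line is a hypothetical helper, and it expects the line format shown in the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Convert value number `index` (0-based, not counting the object name)
 * of a raw CSV line on demand.
 * Returns 0 on success, -1 if the value is missing or not numeric. */
static int value_from_line(const char *line, int index, float *out) {
    assert(line != NULL && out != NULL && index >= 0);
    const char *p = line;
    for (int i = 0; i <= index; i++) {   /* skip the name and `index` values */
        p = strchr(p, ',');
        if (p == NULL) return -1;
        p++;                             /* step past the comma */
    }
    char *end;
    *out = strtof(p, &end);
    return end == p ? -1 : 0;
}
```

Each lookup rescans the line from the start, which is cheap for 24 short fields but would add up if the same values were read many times in a computation.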