
I've noticed that creating fairly large tables in Matlab (>10,000 rows) can be quite slow because of a single function called by the constructor, `checkDuplicateNames`. However, I usually already know that the names I'm passing to the table are unique.

The following example illustrates the problem. Generating 10,000 random values takes less than a millisecond, but building a table from those values with string row names takes about a second and a half, of which 1.4 seconds are spent checking for duplicate row names:

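profile on;                             % start the profiler
a = rand(10000,1);                      % 10,000 random values
strind = cellstr(num2str((1:10000)'));  % unique string row names
b = table(a, 'RowNames', strind);       % slow: checks row names for duplicates
profile viewer                          % inspect where the time went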
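
For a quick check without opening the profiler, the same comparison can be sketched with `tic`/`toc`. This is only a rough measurement; the variable names `n`, `tRand` and `tTable` are illustrative, and the timings will vary by machine and Matlab release.

```matlab
n = 10000;

% Time generation of the raw values alone.
tic;
a = rand(n, 1);
tRand = toc;

% Unique string row names, built the same way as above.
strind = cellstr(num2str((1:n)'));

% Time table construction, which includes the duplicate-name check.
tic;
b = table(a, 'RowNames', strind);
tTable = toc;

fprintf('rand: %.4f s, table with row names: %.4f s\n', tRand, tTable);
```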

I'm curious then, is there an alternate way to create tables in Matlab without calling the checkDuplicateNames function?

David Kelley
  • Since the `table.m` function is editable, why not copy its contents and create a `tablewithoutchecks.m` file by removing the slow lines? – Wouter Kuijsters Mar 11 '15 at 21:20
  • A bit of digging around shows that you could do this by altering the way `setRowNames` is called. There is an `allowDups` argument that, when set to `1`, bypasses the check that takes up so much time. Sadly, I couldn't find a way to set `allowDups = 1` when you call the `table` function, so you may still have to create some duplicate functions to get it to work. – Wouter Kuijsters Mar 11 '15 at 21:27
  • @Wouter: Because it's more than table.m (actually a whole folder of `@table` in the base Matlab installation) that I would have to copy, I'm doubtful that this approach would give me a `table` object at the end instead of a new type that doesn't work with functions written for tables. – David Kelley Mar 11 '15 at 21:28
  • Good point, that would probably give you some errors. A workaround could also be to alter `setRowNames.m` at the start of your script, putting a `%` in front of line 65 (where the check is done) and removing it again once you are finished. But it certainly would be a dirty workaround... – Wouter Kuijsters Mar 11 '15 at 21:34
  • Do you really need tables? Can you make do with a matrix or cell array? Perhaps that would be faster – Luis Mendo Mar 11 '15 at 21:53
  • @Luis: I need a way to associate the values from several series with labels, which is what tables seem to do to me. If I'd known about this a few months ago, I maybe could have worked around it but now I've got a whole project that's simply passing tables around everywhere that I'm trying to incrementally improve on. – David Kelley Mar 11 '15 at 21:59
  • @DavidKelley Can't you shadow the `checkDuplicateNames` function with a version of your own that does nothing? – Luis Mendo Mar 11 '15 at 22:14
  • @Luis: I like the idea but I can't figure out how to shadow that function. According to the [function precedence order](http://www.mathworks.com/help/matlab/matlab_prog/function-precedence-order.html), object functions are called before other functions on the path so I don't think I can without creating a new class folder like Wouter suggested. – David Kelley Mar 12 '15 at 15:11
  • @DavidKelley Oh, I see, it's an object function. I think I agree with you then (I haven't used object functions) – Luis Mendo Mar 12 '15 at 15:27

1 Answer


Based on this reply from a MathWorks employee, you can't do it without altering the core Matlab files.
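
If indexing by row name is not strictly required, one possible way to sidestep the check is to keep the labels as an ordinary table variable instead of as `RowNames`. The sketch below assumes that constructing a table without row names does not go through the row-name duplicate check; that has not been confirmed against the profiler here. The names `labels`, `Label` and `Value` are made up for illustration.

```matlab
n = 10000;
a = rand(n, 1);
labels = cellstr(num2str((1:n)'));   % labels already known to be unique

% Store the labels as a regular variable rather than as RowNames
% (assumption: this avoids the row-name uniqueness check).
t = table(labels, a, 'VariableNames', {'Label', 'Value'});

% Look up a row by label with logical indexing instead of row-name indexing.
row = t(strcmp(t.Label, labels{42}), :);
```

The trade-off is that lookups such as `t('42', :)` are no longer available and uniqueness is not enforced, so code that relies on row names would need to switch to the `strcmp`-style lookup.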

David Kelley