
One of my answers was recently downvoted for suggesting use of cd(path_to_toolbox) rather than one of the path tools, such as addpath or rmpath. Given the fervent criticism I received, I must imagine that there are very good reasons for using the path tools; presumably they are in some way more robust, especially when code is distributed to other systems.

Then I decided to clock the performance of cd versus addpath and was surprised by the following result. Prior to each trial I cleared the workspace and created a character matrix whose rows alternate between two paths:

clear
clc

p1 = 'c:\MATLAB7\toolbox\symbolic\@sym\';
p2 = matlabroot;

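% Preallocate a 100-by-100 char matrix; shorter paths are padded with blanks
newpath = repmat(' ',100,100);

% Odd rows hold p1, even rows hold p2
for ii=1:2:99
    newpath(ii,1:length(p1)) = p1;
    newpath(ii+1,1:length(p2)) = p2;
end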

Then I ran either addpath or cd as follows:

tic
for ii=1:100
    addpath(newpath(ii,:))
end
toc

Elapsed time is 13.437000 seconds.

tic
for ii=1:100
    cd(newpath(ii,:))
end
toc

Elapsed time is 1.078000 seconds.

Any comments on whether there are conditions under which use of cd might be justified, for instance to set the path to a function (toolbox or otherwise), are appreciated. While it may be considered sloppy, I have used cd for many years. The slowdown can be appreciable if it is called repeatedly, but as long as it stays out of highly iterated parts of a program I find the simplicity it brings to coding worth the cost. Admittedly, addpath is no more complicated to use, but now I seem to have a real reason to prefer cd: in this test it was actually faster.

Edit

As a postscript to this post I plead mea culpa to perverse use of cd (and in this example, addpath). There should, however, be room for such usage in a language that is frequently used for quick-and-dirty scripting. It should be kept in mind that there is a gradation of expertise among the users of MATLAB, and in some cases less "advanced" and seemingly sloppy programming techniques can in fact be construed as advantageous in the short term (if not in the long term, or where version and directory structure management might become problematic).

As an appendix I include some links to posts on SO and beyond that address built-in function overriding, shadowing, and the like, where addpath (and I would argue cd too) can be used:

How to unhide an overriden function?

How to get a handle to an overriden built-in function?

How to wrap an already existing function with a new function of the same name

http://www.mathworks.in/matlabcentral/newsreader/view_thread/264354

Buck Thorn
  • The addpath doesn't need you to change the current matlab directory, it adds it to matlab search path. Suppose you have files in several different folders, then you will have to change the current matlab directory every time you want to use a function or class that is in another path… – Werner Aug 17 '13 at 22:53
  • What happens if you append the paths rather than the default prepend? Your comparison is really fair though. The point is that the path can be set and saved, whereas `cd` would need to be called every time in a scripts (twice if you want to return the user to their last `pwd`). In some cases `cd`-ing could potentially break stuff and lead to unexpected errors, e.g., passing a function that uses content from the user's `pwd` into a main function that `cd`s back and forth. – horchler Aug 18 '13 at 01:38
  • @horchler: Are you missing a "NOT" in there (NOT fair comparison) :) Good point about appending rather than prepending. – Amro Aug 18 '13 at 05:51
  • There is also an issue with version management. If I copy an entire program with multiple functions to a new directory, then an initial cd will accomplish as much as addpath, and given the impermanence of the path under the circumstances, either choice seems ok - assuming all code is copied to the same dir level. – Buck Thorn Aug 18 '13 at 06:30
  • @Amro: Oops. Yes, "*not a*" is missing. I can't type. – horchler Aug 18 '13 at 15:58
  • @TryHard: I'd still avoid putting `cd`s in any regular code. There's `copyfile` if you need copy the contents of a directory to another place. Sadly there's *still* no real version management in Matlab. It would be nice if there was a builtin method for "installing" and "uninstalling" a toolbox or M-File. – horchler Aug 18 '13 at 16:05
  • You apparently still don't understand MATLAB, and what the search path means. YOU DON'T add new directories on the fly using addpath. The MATLAB toolbox directories should be on your search path already. That addpath was slow is because MATLAB had to completely rehash the entire search path. I would also add that I removed the downvote when you removed the suggestion to CD to a MATLAB toolbox directory. –  Aug 19 '13 at 02:55
  • @woodchips The basis for your confusion regarding my understanding of `addpath` is that I used it instead of `cd` in an unusual situation in which I needed some way of pointing directly at the toolbox. While inappropriate as general usage, it is similar to what has been used in other examples and answers presented on SO. I would go into further explanations but it is best left as is. Thanks for un-downvoting the answer. I'm glad to share this forum with such wise people. – Buck Thorn Aug 19 '13 at 07:35
  • @woodchips I should add that the timing experiment is somewhat absurd, but as I explain in the question I would never use `addpath` or `cd` in such a way in a real application, it was just a conceptual exercise to drive the question: when is use of `cd` appropriate? I might, for instance, want to open files from within a script and expect them at default locations relative to a main path. One way to place the script on the main path without need for string concatenation operations or passing of the main path name is to `cd` to the appropriate location. – Buck Thorn Aug 19 '13 at 08:24
  • There are other uses for `cd` within scripts which might not be considered appropriate in distributed software but are perfectly useful for quick-and-dirty applications. Well, I welcome you to post an answer. You might start for instance, by pointing to these useful links which Amro was kind enough to provide, on appropriate use of `path`: http://stackoverflow.com/questions/2129646/how-to-use-the-matlab-search-path/2130404#2130404 http://www.mathworks.se/help/matlab/matlab_env/files-and-folders-that-matlab-accesses.html – Buck Thorn Aug 19 '13 at 08:24
  • A practical note: If using `cd` were a faster option, it only allows you to add 1 path at a time, whilst `addpath` allows you to add all paths simultaneously. If you have to add many paths, `addpath` is likely to be faster whilst if you only add one per time speed is not likely a concern. – Dennis Jaheruddin Aug 19 '13 at 09:37
  • @DennisJaheruddin: good point, MATLAB also has the `genpath` function to generate the path string for a folder and all its subfolders, to be used with `addpath` – Amro Aug 19 '13 at 19:38

2 Answers


Obviously, as the path gets longer there are more locations MATLAB has to search for functions, scripts, classes, etc. So I imagine a really long path would have a negative impact on performance.

On the other hand, the current directory is just one location that has to be searched (respecting the order of precedence of course).

Plus it is not fair to compare the two, unless it is ok for you to put all your files in a single folder.
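
To see the order of precedence in practice, one option is to ask MATLAB which definitions of a name it can currently reach. This is only a sketch; `diff` is used because it comes up in the comments, and the list you get depends on your installed toolboxes and your current directory.

```matlab
% List every definition of diff that MATLAB can currently see.
% With '-all' and an output argument, which returns a cell array;
% the first entry is the definition that takes precedence.
defs = which('diff', '-all');
fprintf('%s\n', defs{:});
```

Running this before and after a `cd` or an `addpath` call shows whether the change actually affected which file comes first.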


Just a note about your coding style: you could use a cell array of strings rather than a char matrix to store newpath:

newpath = cell(100,1);
for i=1:100
    newpath{i} = '...';
end
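
With the folders in a cell array, they can also be handed to `addpath` in a single call instead of one call per folder. A minimal sketch, assuming two placeholder folders (`tempdir` and `pwd` stand in for real toolbox folders):

```matlab
% Placeholder folder list; replace with folders that exist on your system.
folders = {tempdir, pwd};

% Expand the cell array into separate arguments so the search path is
% updated in one call. addpath returns the path as it was before the call.
oldPath = addpath(folders{:});

% ... run the code that needs these folders ...

% Restore the search path to its previous state.
path(oldPath);
```

Whether this is faster than looping depends on the MATLAB version and the length of the path, so it is worth timing on your own setup.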
Amro
  • A note about performance: MATLAB usually caches information for directories on the path. Also, in order to detect changes in the file system and automatically use the newest version of an M-file (without explicitly rehashing), MATLAB registers handlers for change notification in the file system (see `help changeNotification`) – Amro Aug 17 '13 at 23:07
  • Yes, but the question asked whether there would be reasons to use `cd` rather than `path` on the basis of speed (or other). If I read your answer correctly there is (are?) and there was no ground for my being downvoted on an earlier question. – Buck Thorn Aug 18 '13 at 05:43
  • I should probably downvote you on the use of `i` in the loop: that is bad programming style :>) – Buck Thorn Aug 18 '13 at 06:03
  • no, my point was that your test is not a fair comparison. It is not practical nor portable to keep jumping between folders to use functions that reside in different places. The `path` was designed for this purpose. – Amro Aug 18 '13 at 06:06
  • haha, the `i` thing is a habit hard to break :) Plus I really dont mind it, as I always use `1i` whenever I mean to get the imaginary unit, no confusion there.. – Amro Aug 18 '13 at 06:09
  • I understand the point. But I had a situation where I was handling conflicting meanings of an overloaded function (diff) and it was necessary to point directly to the function. In either case, the example I based my answer on had to make *explicit* use of the path to a toolbox function, which *either* path or cd would have used. Thus their use in this *particular* case would be equivalent. I was not promulgating poor coding style, I was looking for a workaround for one somewhat unusual (but not actually) case. – Buck Thorn Aug 18 '13 at 06:11
  • hmm, couldnt you use `builtin` to call the original function from the overloaded one? – Amro Aug 18 '13 at 06:17
  • The script defaults to use of the builtin function. Hmmm, I think I might see your point though. – Buck Thorn Aug 18 '13 at 06:23
  • I'm not sure what the OP in that linked question was trying to do, but I don't see the source of confusion. If you want symbolic differentiation you should define `x` as a symbolic variable (`syms x`), if you want numeric diff, define `x` as such (`x = 1:100`). Unless I'm misunderstanding, there is no overlap and no need to mess with `path`. You see, MATLAB dispatches function calls based on the type of the first argument, so if `x` is symbolic, it will call `@sym/diff`, otherwise the regular one. Remember that `fun(obj)` is like `obj.fun()` for object method calls – Amro Aug 18 '13 at 06:42
  • See this for more info: http://www.mathworks.com/help/matlab/matlab_oop/ordinary-methods.html#brd2n2o-1 – Amro Aug 18 '13 at 06:47
  • You're right ... but what if copies of a function conflict. Of course that is bad programming style as well, but it can easily happen with version management if I make a copy of a program. – Buck Thorn Aug 18 '13 at 06:50
  • I should also note that in this particular case matlab's response to an overloaded function, that is picking the right version based on input, failed. – Buck Thorn Aug 18 '13 at 06:53
  • I'm not sure what case we are talking about.. To keep this discussion on point, could you edit your question with a specific example where invoking a function fails and calls the wrong overloaded version? – Amro Aug 18 '13 at 07:00

I think that if you use cd to add something to the path like this, most of the disadvantages should be avoided:

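function addpathwithcd(pathToAdd)
% Briefly make pathToAdd the current directory, then switch back
currentPath = pwd;    % remember where we started
cd(pathToAdd);
cd(currentPath);      % return to the original directory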

However, after doing a (very small) test, this does not seem to be faster for me than simply using addpath(pathToAdd).

Actually this is a bit of a surprise to me: you recorded a speed difference of about a factor of 13, and since I call cd twice per path, I expected a difference of roughly a factor of 6.
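
If `cd` is used anyway, one way to limit the side effects is to guarantee that the original directory is restored even when the code in between errors out. The following is a sketch only; `runInFolder` is a hypothetical helper name, not a built-in, and it relies on `onCleanup`, which is not available in very old MATLAB releases.

```matlab
function out = runInFolder(folder, fcn)
% runInFolder  Hypothetical helper: call fcn() with folder as the current
% directory and return its result.
startDir = pwd;
% The cleanup object runs its callback when this function exits,
% whether it returns normally or fcn throws an error.
restore = onCleanup(@() cd(startDir));
cd(folder);
out = fcn();
end
```

A call might look like `listing = runInFolder(tempdir, @() dir('*.m'));`, after which the caller is back in the directory it started from.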

Dennis Jaheruddin