1

I don't know data structure of array in MATLAB. Does it use a FiFo?

I try to delete the last element from a column and row vector. The time complexity depends on size N.

How to delete last element in O(1)?

Seminally pop().

user3666197
  • 1
  • 6
  • 50
  • 92
Slk
  • 13
  • 2
  • I recommend for you to check [that](https://stackoverflow.com/questions/36257873/time-complexity-of-element-accessing-in-matlab) – mooga Apr 02 '18 at 17:30
  • 1
    When appending elements, MATLAB allocates a larger memory block than needed, so repeatedly appending elements is amortized O(log(n)) rather than O(n) per element. I would assume that deleting the last element is O(1). – Cris Luengo Apr 02 '18 at 18:44

4 Answers4

3

Following the timing script from this question, which deals with the opposite problem -- growing a vector, I tried both approaches to remove the last element:

building_array = building_array(1:end-1);
building_array(end) = [];

As Tommaso noted, the former (blue) is faster than the latter (red):

enter image description here

My guess as to why these two forms have different timings is that MATLAB's JIT (Just in Time Compiler) is more optimized for the one syntax than the other. There is no technical reason for one being faster than the other.

I am really surprised that the cost is linear in the number of elements, very different from the behavior of adding an element at the time.

Test code (modified from Peter Barrett Bryan):

num_averages = 500;
num_sims = 10000;
time_store = nan(num_sims, num_averages);
for i = 1:num_averages
    building_array = rand(num_sims,1);
    for j = 1:num_sims
        tic;
        building_array = building_array(1:end-1);
        time_store(j, i) = toc;
    end
end
Cris Luengo
  • 55,762
  • 10
  • 62
  • 120
1

After a quick test, I found out that an in-place reassignment is much faster than an element deletion. But the performance of both operations still depends on the vector size... I simply think this cannot be achieved in O(1) as you may wish, due to how Matlab internally handles memory.

First approach:

A = rand(100,1);

tic();
A(end) = [];
toc(); % Average elapsed time: 0.000015 seconds

B = rand(10000,1);

tic();
B(end) = [];
toc(); % Average elapsed time: 0.000061 seconds

Second approach:

A = rand(100,1);

tic();
A = A(1:end-1);
toc(); % Average elapsed time: 0.000007 seconds

B = rand(10000,1);

tic();
B = B(1:end-1);
toc(); % Average elapsed time: 0.000017 seconds

My knowledge of Matlab is not deep enough to allow me to explain exactly what happens under the hood and why there is such a big difference between the two approaches. But I can try a guess.

In the first approach, Matlab has to:

  • evaluate end to a real vector offset;
  • find out how many elements have to be removed;
  • allocate a new array of size total_elements - removed_elements;
  • take the untouched portions of the array and buffer copy them into the new array;
  • replace the previous array reference with the new one;
  • deallocate the previous array.

In the second one, Matlab has to:

  • evaluate end to a real vector offset;
  • allocate a new array of size indexed_elements;
  • take the desired ranges of the array and buffer copy them into the new array;
  • replace the previous array reference with the new one;
  • deallocate the previous array.

And yet, we are far from O(1).

Tommaso Belluzzo
  • 23,232
  • 8
  • 74
  • 98
  • I don't understand the difference between your two suggested scenarios. `end` needs to be evaluated in both cases. I'm hoping MATLAB would not copy over the array to remove one element, but maybe it's doing just that. Weird. But my guess to any difference in execution time is that JIT has been optimized for one syntax but not (as much) for the other. – Cris Luengo Apr 03 '18 at 00:58
  • I've made a more careful timing experiment. This behavior is surprising! – Cris Luengo Apr 03 '18 at 05:26
1

DISCLAIMER: This is not a real suggestion, it is merely a stupid way to actually obtain an O(1) run-time

Right, with the disclaimer given, it is actually possible to make an O(1) pop command in "Matlab". The solution is to not do it in Matlab, but in Python. Confused?

Basically, you convert the vector to a Python list with py.list(), whereafter the O(1) pop command is possible to execute. Thus you can do something like:

a = randn(1,1e4);
li=py.list(a);
b = li.pop;

However, as you might have guessed typecasting and running python through Matlab is not exactly what I would call fast. So even though we can maintain a constant run-time that constant is simply too large for it to be of any use.

In the figure, blue is the Matlab/Python solution, whereas red(-ish) is the best solution, as given by Tommaso and Cris.

Timed runs

As it is clear, we maintain what looks like O(1), but at an cost.

Code for reference:

num_averages = 100;
num_sims = 10000;
time_store = nan(num_sims, num_averages);
for i = 1:num_averages
    building_array = rand(1,num_sims);
    li = py.list(building_array);
    for j = 1:num_sims
        tic;
        li.pop;
        time_store(j, i) = toc;
    end
end

EDIT: The size at which this approach is faster than the pure Matlab solution is actually within some reasonable limit ~150000.

Larger sized array

Note: The fluctuations in this figure are larger as my patience ran out and thus I reduced the number of averages to 5.

Make no mistake, this solution is still stupid and should not be used. The conversion is linear in size, converting back again destroys it even more. Thus the only case, where this solution is actually better is if you are solely interested in the popped element, though, in this case, a for-loop with indexing does the same thing a lot faster.

Nicky Mattsson
  • 3,052
  • 12
  • 28
0

Depends on array length, use tic <_code_under_test_> toc, to measure the time

A = ones(1,50);
B = ones(1,10);

tic
A(1) = [];
toc
tic
A(49) = [];
toc
tic
B(1) = [];
toc
tic
B(9) = [];
toc

Elapsed time is 0.000195 seconds.
Elapsed time is 0.000085 seconds.
Elapsed time is 0.000061 seconds.
Elapsed time is 0.000051 seconds.

You can see, it's faster to delete the last element of array than the first.

pfRodenas
  • 326
  • 1
  • 2
  • 12
  • I'm seeing roughly the same elapsed time for arrays of 100, 1000 and 5000 elements over 1000 runs. Perhaps you need more runs over larger arrays. Why did you include deletion of the first element? Do you expect that to have the same time complexity as deleting the last element? – beaker Apr 02 '18 at 18:12
  • You should use `timeit` to measure time cost. I'm not sure what you are measuring here, but it will differ if you put this code in a function or copy-paste it to the command prompt. – Cris Luengo Apr 02 '18 at 18:41
  • The function `tic toc` records the internal time at execution – pfRodenas Apr 02 '18 at 19:00
  • 1
    @pfRodenas See https://www.mathworks.com/help/matlab/matlab_prog/measure-performance-of-your-program.html#bt1ltzm – beaker Apr 02 '18 at 19:04
  • @pfRodenas: I know what `tic` and `toc` do. What I said is that I'm not sure about what you are measuring here. You're recording more than the cost of executing the one statement. It's way too cheap to be measured correctly. – Cris Luengo Apr 03 '18 at 00:53
  • @CrisLuengo yes you're right, I always have been using tic toc, to measure my code in regar to other codes. But in this question `timeit` – pfRodenas Apr 03 '18 at 09:40