I don't know data structure of array in MATLAB. Does it use a FiFo?
I try to delete the last element from a column and row vector. The time complexity depends on size N
.
How to delete last element in O(1)
?
Seminally pop()
.
I don't know data structure of array in MATLAB. Does it use a FiFo?
I try to delete the last element from a column and row vector. The time complexity depends on size N
.
How to delete last element in O(1)
?
Seminally pop()
.
Following the timing script from this question, which deals with the opposite problem -- growing a vector, I tried both approaches to remove the last element:
building_array = building_array(1:end-1);
building_array(end) = [];
As Tommaso noted, the former (blue) is faster than the latter (red):
My guess as to why these two forms have different timings is that MATLAB's JIT (Just in Time Compiler) is more optimized for the one syntax than the other. There is no technical reason for one being faster than the other.
I am really surprised that the cost is linear in the number of elements, very different from the behavior of adding an element at the time.
Test code (modified from Peter Barrett Bryan):
num_averages = 500;
num_sims = 10000;
time_store = nan(num_sims, num_averages);
for i = 1:num_averages
building_array = rand(num_sims,1);
for j = 1:num_sims
tic;
building_array = building_array(1:end-1);
time_store(j, i) = toc;
end
end
After a quick test, I found out that an in-place reassignment is much faster than an element deletion. But the performance of both operations still depends on the vector size... I simply think this cannot be achieved in O(1)
as you may wish, due to how Matlab internally handles memory.
First approach:
A = rand(100,1);
tic();
A(end) = [];
toc(); % Average elapsed time: 0.000015 seconds
B = rand(10000,1);
tic();
B(end) = [];
toc(); % Average elapsed time: 0.000061 seconds
Second approach:
A = rand(100,1);
tic();
A = A(1:end-1);
toc(); % Average elapsed time: 0.000007 seconds
B = rand(10000,1);
tic();
B = B(1:end-1);
toc(); % Average elapsed time: 0.000017 seconds
My knowledge of Matlab is not deep enough to allow me to explain exactly what happens under the hood and why there is such a big difference between the two approaches. But I can try a guess.
In the first approach, Matlab has to:
end
to a real vector offset;total_elements - removed_elements
;In the second one, Matlab has to:
end
to a real vector offset;indexed_elements
;And yet, we are far from O(1)
.
DISCLAIMER: This is not a real suggestion, it is merely a stupid way to actually obtain an O(1) run-time
Right, with the disclaimer given, it is actually possible to make an O(1) pop command in "Matlab". The solution is to not do it in Matlab, but in Python. Confused?
Basically, you convert the vector to a Python list with py.list()
, whereafter the O(1) pop
command is possible to execute. Thus you can do something like:
a = randn(1,1e4);
li=py.list(a);
b = li.pop;
However, as you might have guessed typecasting and running python through Matlab is not exactly what I would call fast. So even though we can maintain a constant run-time that constant is simply too large for it to be of any use.
In the figure, blue is the Matlab/Python solution, whereas red(-ish) is the best solution, as given by Tommaso and Cris.
As it is clear, we maintain what looks like O(1), but at an cost.
Code for reference:
num_averages = 100;
num_sims = 10000;
time_store = nan(num_sims, num_averages);
for i = 1:num_averages
building_array = rand(1,num_sims);
li = py.list(building_array);
for j = 1:num_sims
tic;
li.pop;
time_store(j, i) = toc;
end
end
EDIT: The size at which this approach is faster than the pure Matlab solution is actually within some reasonable limit ~150000.
Note: The fluctuations in this figure are larger as my patience ran out and thus I reduced the number of averages to 5.
Make no mistake, this solution is still stupid and should not be used. The conversion is linear in size, converting back again destroys it even more. Thus the only case, where this solution is actually better is if you are solely interested in the popped element, though, in this case, a for-loop with indexing does the same thing a lot faster.
Depends on array length, use tic <_code_under_test_> toc
, to measure the time
A = ones(1,50);
B = ones(1,10);
tic
A(1) = [];
toc
tic
A(49) = [];
toc
tic
B(1) = [];
toc
tic
B(9) = [];
toc
Elapsed time is 0.000195 seconds.
Elapsed time is 0.000085 seconds.
Elapsed time is 0.000061 seconds.
Elapsed time is 0.000051 seconds.
You can see, it's faster to delete the last element of array than the first.