
Array operators perform element-by-element operations.

Given an m-by-n matrix of discrete values drawn from the set {1, 2, 3, ..., 12}, is there an operator (preferably) or an elegant algorithm to ensure that any value above 4 is set to 4?

gatorback
  • You mean something like `A(A > 4) = 4`? – HansHirse Sep 05 '19 at 04:44
  • @HansHirse Exactly. Please consider posting this as an answer, with an explanation of how the mechanism works and a reference to the relevant MATLAB documentation. – gatorback Sep 05 '19 at 04:51
  • Or simply `A = min(A,4)`. – Cris Luengo Sep 05 '19 at 05:00
  • @CrisLuengo At least in Octave, your `min` approach seems to be faster. Any idea why? My guess would be that "replacing" `A` with the result of `min` is faster than accessing and manipulating the entries of `A`. – HansHirse Sep 05 '19 at 05:27
  • @HansHirse `A>4` creates an intermediate array of the same size as `A`, which you then use for indexing. You end up using double the memory, and you access elements of `A` twice. In principle, `A = min(A,4)` can work in-place (not sure if Octave can do this, MATLAB does). That means you access each memory element only once, and don't use any additional memory. -- Of course `A(A>4) = 4` could be optimized to not create the intermediate array and access memory only once, but I don't know if MATLAB's JIT does that optimization. – Cris Luengo Sep 05 '19 at 05:33
  • @CrisLuengo Would `tic` and `toc` be a good 'stopwatch' to measure the execution speeds? – gatorback Sep 05 '19 at 16:38
  • Only if the code takes a significant time to run. For very fast bits of code, put them inside a function and time the function using `timeit`. [This recent answer of mine](https://stackoverflow.com/a/57730219/7328782) shows how to use `timeit`. – Cris Luengo Sep 05 '19 at 16:55

2 Answers

In MATLAB, you can index matrices with logical values. Comparison operators such as `>` are applied to every element of a matrix, so the result is a logical matrix of the same size. That logical matrix can then be used to access and modify the corresponding elements of the original matrix.

Let's see this short code snippet:

% Set up random integer matrix with max value of 12
A = randi(12, 5, 4)

% Get all elements in A greater than 4
ind = (A > 4)

% Set all elements in A greater than 4 to 4
A(ind) = 4

And, the corresponding output:

A =
    8   12    5    3
    3    2    5    1
    9    9   10    9
    9    9   10    8
    8   10    3   10

ind =
  1  1  1  0
  0  0  1  0
  1  1  1  1
  1  1  1  1
  1  1  0  1

A =
   4   4   4   3
   3   2   4   1
   4   4   4   4
   4   4   4   4
   4   4   3   4

Leaving out the intermediate step, the one-liner would be:

A(A > 4) = 4
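As a small illustration of the mechanism, the sketch below inspects the mask before using it. The variable name `mask` and the 2-by-4 matrix (the first two rows of the output above) are chosen purely for illustration; `class` and `nnz` are standard MATLAB functions.

```matlab
% Small example matrix (first two rows of the output above)
A = [8 12 5 3; 3 2 5 1];

% The comparison returns a logical array of the same size as A
mask = A > 4;
disp(class(mask))   % class of the mask
disp(nnz(mask))     % number of elements that will be overwritten

% Logical indexing: only positions where mask is true are assigned
A(mask) = 4;
disp(A)
```

Elements where the mask is false are left untouched, which is why the values 3, 2, and 1 survive the assignment.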

Hope that helps!


EDIT

For this specific task, also consider the approach from Cris Luengo's answer, which is faster.

HansHirse

A simple approach I've used in many different programming languages is:

A = min(A,4);

When clamping to a lower and upper bound you get this awkward-looking bit, but it's so common that people recognize its meaning right away:

A = min(max(A,0),4);
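A minimal sketch of the two-sided clamp on a small matrix; the values and the names `lo`, `hi`, and `B` are made up for illustration:

```matlab
% Example data with values below, inside, and above the range [0, 4]
A = [-3 0 2; 4 7 12];

lo = 0;   % lower bound (illustrative)
hi = 4;   % upper bound (illustrative)

% max raises everything below lo to lo, then min lowers everything above hi to hi
B = min(max(A, lo), hi);
disp(B)
```

Here `B` comes out as `[0 0 2; 4 4 4]`: values already inside the range are unchanged.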

@HansHirse claims that `A = min(A,4)` is faster than `A(A > 4) = 4` in Octave. I see this is also the case in MATLAB R2017a (about 3.5x faster for one million elements). Here is my explanation for why (copied from a comment to preserve it):

`A>4` creates an intermediate array of the same size as `A`, which is then used for indexing. The statement `A(A > 4) = 4` ends up using double the memory, and accesses elements of `A` and the intermediate matrix twice (once for the comparison operation, then again for the indexed assignment). If `A` is sufficiently large that it doesn't fit in cache, accessing elements twice causes them to be read from main memory twice. Memory access is the bottleneck in such a simple operation.

In contrast, `A = min(A,4)` can work in-place. That means it doesn't use any additional memory. It also reads each element from main memory only once. That is, the operation can be implemented (and most certainly is implemented this way in MATLAB) by running through the array elements only once, comparing each to 4, and adjusting its value if larger.

Of course `A(A>4) = 4` could be optimized to not create the intermediate array and access memory only once, but it looks like MATLAB's JIT does not do this optimization. Maybe in MATLAB R2023 it will, who knows?
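To check the timing on your own machine, something like the following sketch could be used. It relies on MATLAB's `timeit`, as suggested in the comments; the helper `clampWithMask` is a hypothetical name introduced here, and defining a local function at the end of a script assumes MATLAB R2016b or later. Because `A` is captured by the function handles, neither variant can modify it in place here, so the numbers include allocating the output and may differ from timing the bare statements.

```matlab
% Test data: one million integers in the range 1..12
A = randi(12, 1000, 1000);

% Sanity check: both approaches give the same result
assert(isequal(clampWithMask(A), min(A, 4)));

% timeit runs each handle several times and returns a typical run time in seconds
tMask = timeit(@() clampWithMask(A));
tMin  = timeit(@() min(A, 4));

fprintf('A(A > 4) = 4 : %.3f ms\n', 1e3 * tMask);
fprintf('min(A, 4)    : %.3f ms\n', 1e3 * tMin);

% Hypothetical helper: wraps the indexed assignment so it can be timed
function A = clampWithMask(A)
    A(A > 4) = 4;
end
```

Results will vary with the MATLAB version, the array size, and the hardware, so treat any ratio as indicative rather than definitive.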

Cris Luengo