How can I dirty parts of an array efficiently?

Question

EDIT: apparently I asked this question wrong. Before voting to close, please allow me to know what the question is missing. I promise you, this is not an unanswerable question. You can always come back and vote to close it later.

I'm currently working in C++, but I think this question applies to most compiled languages.

In my project, I have an array of values which are calculated individually, one at a time, as late as possible based off a single variable. These values are not all calculated at once, they are calculated if and only if they need to be. As is normally the case when using "dirty", the objective is to label certain things as being in need of update, without updating it preemptively. These values are cycled through over and over, so I'd like to cache the computation if possible. Whenever the single variable changes, all the values should be marked dirty so the cycle knows to recalculate before storing and moving on.

I can think of a few ways of achieving this, but I'm not sure what is most efficient:

Have two arrays, one of booleans and one of values. Mark all booleans to false if dirty, and true when clean.
Have a clean start point. Consider everything dirty until passing that cycle point again. Has the drawback of not allowing skipping of cycle entries.
Brand new array. Just create a new array, if any of the items are unset, set them. This one seems like it would have tons of problems, but it's a thought.
Perhaps use some built in class meant for this stuff?

The above are just the first things that came to mind for me, but I'm kind of new to c++ and would like to have some idea of normal or special solutions to marking an array dirty.

How can I dirty an array efficiently?

In order to show an example of code, I will show js which I'm more used to:

const numbers = [];
const clean = [];
let length = 1000;

let variable;
const setVariable(num) => {
  variable = num;
  for (let i = 0; i < length; i++) { clean[i] = false; }
}
setVariable(42);

let pos = 0;
while (true) {
  if (clean[pos] == false) {
    clean[pos] = true;
    numbers[pos] = someIntensiveMath(pos, variable);
  }
  doSomethingWithNumbers(numbers[pos]);
  pos++;
  if (pos >= length) pos = 0;
  // wait a bit;
}

in js you could also do

const setVariable(num) => {
  variable = num;
  numbers = [];
}
const isDirt = numbers[pos] === undefined;

With js the latter would probably be faster due to the native implementation of the script, but I don't think that's the case with compiled languages. I think you guys do things differently.

It isn't clear what you want to achieve. If the entire array is dependent on a single var isn't a single dirty flag sufficient? If you are multi-threading just use a mutex block for reading/writing the var. — Srini, Dec 14 '18 at 18:43
I apologize for not being clear enough. The variable can change many times throughout the cycle process, so one flag would not be enough to tell where the array is clean vs dirty. The array length never changes. — Seph Reed, Dec 14 '18 at 18:48
This question is quite broad, and the "how can i ... efficiently" part is an indicator that this will rather elicit opinionated answers - why not start by giving your "simple" (unoptimized) solution and then asking for improvements? It's generally a good approach to start simple and improve from there... and the most important question of all: are you 100% sure that this dirty flagging is the biggest bottleneck in your application? if not (or you are not sure about that), then I'd recommend doing some measurements first! — codeling, Dec 14 '18 at 18:51
why don't you do this with structures? If a member variable is to be changed then make a new structure instance and add the changes... — Dr t, Dec 14 '18 at 18:54
The usual way would be a map. If a key exists, then the value is the cached entry. Clear the map when the variable changes. — stark, Dec 14 '18 at 18:57
Wouldn't a map be less efficient than an array is almost any case? Or are they optimized to be interchangeable? — Seph Reed, Dec 14 '18 at 19:03
@SephReed: The question says "I have an array of values which are calculated based off a single variable." which means they're either all dirty, or none are. But a comment says "one flag would not be enough to tell where the array is clean vs dirty." Can you clarify the question? Without that, the question remains unclear. — Mooing Duck, Dec 14 '18 at 19:15
I think maybe I should have said "diry parts of an array" instead. I thought that the "dirty" paradigm, specifically as applies to arrays, was perhaps a bit more common than it is. In any case, I tried to add enough information to make it really obvious why a single flag wouldn't work. — Seph Reed, Dec 14 '18 at 19:21

score 1 · Answer 1 · answered Dec 14 '18 at 19:12

I've found elsewhere that the typical way to label entries of an array "dirty" is by having a parallel array of booleans.

@stark mentioned in the comments the idea of using a map, and speed comparisons of the two appear to be pretty decent, but it was advised in the following answer to use an array for indexed items.

performance of array vs. map

Whether or not changes in modern coding have led to a new defacto way of labeling items or parts of an array (or linear collection of items) as "dirty" is unknown. But in the very least, as an answer, the most straight forward nooby way is to have a parallel array of booleans.

Further, depending on the way you are iterating through the "array" it may make sense to use vector or map. In the case of either of those, the form of dirtying would probably be best done by clearing the map or removing vector entries(??).

So, to give my best answer, it would seem one should first find the storage method that best fits their needs, and then use whichever method is most normal for that.

For arrays, as this question was specified towards, parallel arrays appears to be the answer.

How can I dirty parts of an array efficiently?

1 Answers1