
I have code consisting of a single file that contains multiple functions, some of which use persistent variables. For it to work correctly, the persistent variables must be empty when the main function starts.

There are documented ways to clear persistent variables in a multi-function file, such as:

clear functionName % least destructive
clear functions    % more destructive
clear all          % most destructive

Unfortunately, I cannot guarantee that the user remembers to clear the persistent variables before calling the function, so I'm exploring ways to perform the clearing operation at the beginning of the code. To illustrate the problem, consider the following example:

function clearPersistent(methodId)
if ~nargin, methodId = 0; end

switch methodId
  case 0 
    % do nothing
  case 1
    clear(mfilename);
  case 2
    eval(sprintf('clear %s', mfilename));
  case 3
    clear functions;
  case 4
    clear all;
end

subfunction();
subfunction();
end

function [] = subfunction()
persistent val

if isempty(val)
  disp("val is empty");
  val = 123;
else
  disp("val is not empty");
end
end

When first running this, we get:

>> clearPersistent
val is empty
val is not empty

I would expect that running the function again at this point, with any of the non-0 inputs would result in the val variable being cleared, but alas - this is not the case. After val is set, unless we use one of the alternatives shown in the top snippet externally, or modify the .m file, it remains set.

My question: Is it possible to clear persistent variables in subfunctions from within the body of the main function, and if so, how?

In other words, I'm looking for some code that I can put in clearPersistent before calling the subfunctions, such that the output is consistently:

val is empty
val is not empty

P.S.

  1. Here's a related past question (which doesn't deal with this specific use case): List/view/clear persistent variables in Matlab.

  2. I'm aware of the possibility of rewriting the code to not use persistent variables at all (e.g. by passing data around, using appdata, adding a 'clear' flag to all subfunctions, etc.).

  3. Please note that editing the source code of the function and saving implicitly clears it (along with all persistent variables).

  4. I'm aware that the documentation states that "The clear function does not clear persistent variables in local or nested functions."


Additional background on the problem:

The structure of the actual code is as follows:

Main function (called once)
  └ Global optimization solver (called once)
    └ Objective function (called an unknown N≫1 times)
      └ 1st function that uses persistents
      └ 2nd function that uses persistents

As mentioned in the comments, there are several reasons why some variables were made persistent:

  1. Loose coupling / SoC: The objective function does not need to be aware of how the subfunctions work.
  2. Encapsulation: It is an implementation detail. The persistent variables do not need to exist outside the scope of the function that uses them (i.e. nobody else ever needs them).
  3. Performance: The persistent variables contain matrices that are fairly expensive to compute, but this operation needs to happen only once per invocation of the main function.

One (side?) effect of using persistent variables is making the entire code stateful (with two states: before and after the expensive computations). The original issue stems from the fact that the state was not being correctly reset between invocations of the main function, causing runs to rely on a state created with previous (and thus invalid) configurations.

It is possible to avoid being stateful by computing the one-time values in the main function (which currently only parses user-supplied configurations, calls the solver, and finally stores/displays outputs), then passing them alongside the user configurations into the objective function, which would then pass them on to the subfunctions. This approach solves the state-clearing problem, but hurts encapsulation and increases coupling, which in turn might hurt maintainability.

Unfortunately, the objective function has no flag that says 'init' etc., so we don't know if it's called for the 1st or the nth time, without keeping track of this ourselves (AKA state).

The ideal solution would have several properties:

  1. Compute expensive quantities once.
  2. Be stateless.
  3. Not pass irrelevant data around (i.e. "need to know basis"; individual function workspaces only contain the data they need).
Dev-iL
  • `clear classes` is actually most destructive, but it still won't clear locked items. – LightCC Aug 16 '19 at 00:41
  • At first glance, I believe the solution is outside the scope of the question, i.e. there is no good solution that doesn't end up being unmaintainable, hides what it's doing, isn't very testable, and is just generally very hacky. If you provided a little more scope on the higher-level problem, we may be able to offer alternative solutions. – LightCC Aug 16 '19 at 00:47
  • @LightCC You're right in that it's unclear why exactly I am asking this, where alternative solutions make more sense. Perhaps I oversimplified my problem for the purpose of the question, so I will try to add more context for completeness. – Dev-iL Aug 20 '19 at 06:54
  • As your change to the question has changed the meaning and usefulness of the answers, let me retract my upvote. – rahnema1 Aug 20 '19 at 18:47
  • Have you considered using a class (object oriented programming) and storing a singleton object in global persistent data to access it? Then you can do whatever you want in the class to store the results of the expensive data, store the state, and clear the data and state on whatever conditions/method calls you want... I've only just started messing with classes in Matlab, so not sure how it would best interact with global storage the way you need. – LightCC Aug 21 '19 at 16:51

3 Answers


clear fname and clear functions remove the M-file from memory. The next time you run the function, it is read from disk again, parsed and compiled into bytecode. Thus, you slow down the next execution of the function.

Clearing a function or subfunction from within the function itself therefore does not work: while the function is running, you cannot clear its file from memory.

My solution would be to add an option to subfunction to clear its persistent variable, like so:

function clearPersistent()

subfunction('clear');
subfunction();
subfunction();
end

function [] = subfunction(option)
persistent val

if nargin>0 && ischar(option) && strcmp(option,'clear')
   val = [];
   return
end

if isempty(val)
  disp("val is empty");
  val = 123;
else
  disp("val is not empty");
end

end

Of course you could initialize your value when called as subfunction('init') instead.
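
A minimal sketch of that 'init' variant, assuming a hypothetical driver named demoInit and using 123 as a stand-in for the expensive computation (neither is from the original code):

```matlab
function demoInit()
% Hypothetical driver: (re)initialize once per run, then call freely.
subfunction('init');
subfunction();
subfunction();
end

function subfunction(option)
persistent val

% An explicit 'init' call overwrites any value left over from a previous run.
if nargin > 0 && strcmp(option, 'init')
    val = 123;   % stand-in for the expensive one-time computation (assumption)
    disp("val initialized");
    return
end

% Regular calls only read the cached value.
disp("val is " + val);
end
```

Because the 'init' call always overwrites val, a stale value from an earlier run cannot leak into the current one, and no clear call is needed.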


A different solution that might work for your use case is to separate the computation of val from its use. I would find this easier to read than any of the other solutions, and it would be more performant too.

function main()
val = computeval();
subfunction(val);
subfunction(val);
end
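
When the function is called by a solver that only passes x, one way to apply the same separation is to capture the precomputed value in an anonymous function handle. This is only a sketch under assumptions: magic(4) stands in for the expensive matrices, and the objective itself is made up for illustration.

```matlab
% One-time, expensive setup (magic(4) is only a stand-in for it).
val = magic(4);

% Capture val in the handle; a solver would only ever see a function of x.
objective = @(x) sum((val * x(:)).^2);

% Repeated evaluations reuse the captured val; nothing persists after the
% handle goes out of scope, so there is no state to clear between runs.
f1 = objective([1; 0; 0; 0]);
f2 = objective([1; 0; 0; 0]);
```

The handle could then be passed to the solver in place of the original objective function, so the solver never needs to know that val exists.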

Given your edit, you could put the objective function in a separate file (in the private subdirectory). You will be able to clear it.

An alternative to persistent variables would be to create a user class with a constructor that computes the expensive state, and another method that computes the objective function. This could also be a classdef file in the private subdirectory. I think this is nicer because you won’t need to remember to call clear.
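
A rough sketch of such a class, saved in its own file. The class name ObjectiveEvaluator, the config.n field, and the use of magic(n) as the expensive state are all hypothetical placeholders, not part of the original code:

```matlab
% ObjectiveEvaluator.m (hypothetical name), e.g. in the private subdirectory
classdef ObjectiveEvaluator
    properties (SetAccess = private)
        expensiveMatrix   % computed once, in the constructor
    end
    methods
        function obj = ObjectiveEvaluator(config)
            % Stand-in for the expensive one-time computation (assumption).
            obj.expensiveMatrix = magic(config.n);
        end
        function f = evaluate(obj, x)
            % Objective value; reuses the stored matrix on every call.
            f = x(:).' * obj.expensiveMatrix * x(:);
        end
    end
end
```

The main function would construct the object once per run, for example ev = ObjectiveEvaluator(struct('n', 3)), and hand @(x) ev.evaluate(x) to the solver. A new run builds a new object, so no old state carries over.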

In both these cases you don’t have a single file containing all the code any more. I think you need to give up on one of those two ideals: either break data encapsulation or split the code across two files (code encapsulation?).

Cris Luengo

Why not use global variables? You can create a global struct that contains your variables and manage it using a variable_manager:

function main
    variable_manager('init')
    subfunction1()
    subfunction2()
end

function variable_manager(action)
    global globals
    switch action
        case 'init'
            globals = struct('val',[],'foo',[]);
        case 'clear'
            globals = structfun(@(x)[],globals,'UniformOutput', false);
%       case ....
%       ...
    end
end

function subfunction1
    global globals
    if isempty(globals.val)
        disp("val is empty");
        globals.val = 123;
    else
        disp("val is not empty");
    end
end

function subfunction2
    global globals
    if isempty(globals.foo)
        disp("foo is empty");
        globals.foo = 321;
    else
        disp("foo is not empty");
    end
end
rahnema1
  • This is a solution, but not one I would recommend. Globals are problematic because you could have clashing names -- 2 functions unintentionally using a global with the same name. Persistent variables don't have this problem. Global variables, being shared among many functions, cause code that is hard to read: you don't know where they're being changed, it's hard to follow the flow of information. Persistent variables also don't have this problem. – Cris Luengo Aug 15 '19 at 14:48
  • Good points! I think what OP wants is more similar to global variables than persistent variables. Moreover, I suggested managing them in a struct with a distinct name to prevent name clashes. They can even be managed as `globals.subfunction1.val`. – rahnema1 Aug 15 '19 at 14:53
  • Yes, I kind of like your concept of putting them in a struct like that. It does clarify the code somewhat. But there's still the problem of data being changed by a function call that you might not even realize accesses that data. I always prefer to be explicit about data flow, and pass data into functions as arguments. Persistent variables are visible only within one function, so they cannot be changed elsewhere. – Cris Luengo Aug 15 '19 at 15:02
  • Some more context: the reason why `val` was made `persistent`, is that in my actual code it does not need to "live" outside of `subfunction` (i.e. nobody else ever uses it). Furthermore, computing it is computationally intensive, so after it is found for the first time, I prefer to store it in memory (something like caching/memoization). The value does not change for the rest of the run after it is computed. – Dev-iL Aug 15 '19 at 15:19
  • Like you, I prefer not to use persistent variables at all. However, you need a way to manage all such variables, so my suggestion is to use global variables. In other words, once one has accepted using persistent variables, with all of their issues, why not switch to global variables? – rahnema1 Aug 15 '19 at 15:33
  • @rahnema1 You're right, persistents aren't very different from globals in the difficulties they create. I'll add more context to the question, attempting to explain why persistents were used in the first place. – Dev-iL Aug 20 '19 at 06:57

As mentioned in the question, one of the possibilities is using appdata, which is not too different from global (at least when associating it with "object 0", which is the MATLAB instance itself). To avoid "collisions" with other scripts/functions/etc., we use a random string as the appdata name (using a separately generated string for each piece of code that relies on this storage technique makes collisions extremely unlikely). The main downside of this approach is that the string has to be hard-coded in multiple places, or the structure of the code has to be changed such that the functions that use this appdata are nested within the function that defines it.

The two ways to write this are:

function clearPersistent()
% Initialization - clear the first time:
clearAppData();

% "Business logic"
subfunction();
subfunction();

% Clear again, just to be sure:
clearAppData();
end % clearPersistent

function [] = subfunction()
APPDATA_NAME = "pZGKmHt6HzkkakvdfLV8"; % Some random string, to avoid "global collisions"
val = getappdata(0, APPDATA_NAME);

if isempty(val)
  disp("val is empty");
  val = 123;
  setappdata(0, APPDATA_NAME, val);
else
  disp("val is not empty");
end
end % subfunction

function [] = clearAppData()
APPDATA_NAME = "pZGKmHt6HzkkakvdfLV8";
if isappdata(0, APPDATA_NAME)
  rmappdata(0, APPDATA_NAME);
end
end % clearAppData

and:

function clearPersistent()
APPDATA_NAME = "pZGKmHt6HzkkakvdfLV8";
% Initialization - clear the first time:
clearAppData();

% "Business logic"
subfunction();
subfunction();

% Clear again, just to be sure:
clearAppData();

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function [] = subfunction()
  val = getappdata(0, APPDATA_NAME);

  if isempty(val)
    disp("val is empty");
    val = 123;
    setappdata(0, APPDATA_NAME, val);
  else
    disp("val is not empty");
  end
end % subfunction

function [] = clearAppData()
  if isappdata(0, APPDATA_NAME)
    rmappdata(0, APPDATA_NAME);
  end
end % clearAppData

end % clearPersistent
Dev-iL
  • If you're going to use nested functions, just declare the variable in the parent function. Also, this is much less performant than persistent variables. If you're going to call these subfunctions many times, don't use `appdata`. – Cris Luengo Aug 15 '19 at 15:14