TL;DR: It's possible, with some fancy OOP M-code trickery. Altering the behavior of () and . can be done with a Matlab wrapper class that defines subsref on top of your Java wrapper classes. But because of the inherent Matlab-to-Java overhead, it probably won't end up being any faster than normal Matlab code, just a lot more complicated and fussy. Unless you move the logic into Java as well, this approach probably won't speed things up for you.
I apologize in advance for being long-winded.
Before you go whole hog on this, you might benchmark the performance of Java structures as called from your Matlab code. While Java field access and method calls are much faster on their own than Matlab ones, there is substantial overhead to calling them from M-code, so unless you push a lot of the logic down into Java as well, you might well end up with a net loss in speed. Every time you cross the M-code to Java layer, you pay. Have a look at the benchmark over at this answer: "Is MATLAB OOP slow or am I doing something wrong?" to get an idea of scale. (Full disclosure: that's one of my answers.) It doesn't include Java field access, but that is probably on the order of method calls due to the autoboxing overhead. And if you are coding Java classes as in your example, with getter and setter methods instead of public fields (that is, in "good" Java style), then you will be incurring the cost of Java method calls with each access, and it's going to be bad compared to pure Matlab structures.
All that said, if you wanted to make that x = [foo(1:2).bar] syntax work inside M-code where foo is a Java array, it would basically be possible. The () and . are both evaluated in Matlab before calling to Java. What you could do is define your own custom JavaArrayWrapper class in Matlab OOP corresponding to your Java array wrapper class, and wrap your (possibly wrapped) Java arrays in that. Have it override subsref and subsasgn to handle both () and . indexing. For (), do normal subsetting of the array, returning it wrapped in a JavaArrayWrapper. For the . case:
- If the wrapped object is scalar, invoke the Java method as normal.
- If the wrapped object is an array, loop over it, invoke the Java method on each element, and collect the results. If the results are Java objects, return them wrapped in a JavaArrayWrapper.
But. Due to the overhead of crossing the Matlab/Java barrier, this would be slow, probably an order of magnitude slower than pure Matlab code.
To get it to work at speed, you could provide a corresponding custom Java class that wraps Java arrays and uses the Java Reflection API to extract the property of each selected array member object and collect them in an array. The key is that when you do a "chained" reference in Matlab like x = foo(1:3).a.b.c and foo is an object, it doesn't do a stepwise evaluation where it evaluates foo(1:3), and then calls .a on the result, and so on. It actually parses the entire (1:3).a.b.c reference, turns that into a structured argument, and passes the entire thing into the subsref method of foo, which has responsibility for interpreting the entire chain. The implicit call looks something like this.
x = subsref(foo, [ struct('type','()', 'subs',{{[1 2 3]}}), ...
                   struct('type','.',  'subs','a'), ...
                   struct('type','.',  'subs','b'), ...
                   struct('type','.',  'subs','c') ] )
So, given that you have access to the entire reference "chain" up front, if foo were an M-code wrapper class that defined subsref, you could convert that entire reference to a Java argument and pass it in a single method call to your Java wrapper class, which would then use Java Reflection to dynamically go through the wrapped array, select the referenced elements, and do the chained references, all inside the Java layer. E.g. it would call getNestedFields() in a Java class like this.
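import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class DynamicFieldAccessArrayWrapper {
    private final List<Object> wrappedArray;

    public DynamicFieldAccessArrayWrapper(List<Object> wrappedArray) {
        this.wrappedArray = wrappedArray;
    }

    // selectedIndexes are 1-based (Matlab style); null selects every element.
    public Object getNestedFields(int[] selectedIndexes, String[] fieldPath)
            throws NoSuchFieldException, IllegalAccessException {
        if (selectedIndexes == null) {
            selectedIndexes = new int[wrappedArray.size()];
            for (int i = 0; i < selectedIndexes.length; i++) {
                selectedIndexes[i] = i + 1;
            }
        }
        List<Object> result = new ArrayList<Object>();
        for (int ix : selectedIndexes) {
            Object val = wrappedArray.get(ix - 1);
            for (String fieldName : fieldPath) {
                // getField() only finds public fields
                Field field = val.getClass().getField(fieldName);
                val = field.get(val);
            }
            result.add(val);
        }
        // Return as array so Matlab can auto-unbox it; will need more type detection to get array type right
        return result.toArray();
    }
}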
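Since getField() only sees public fields, classes written in the getter/setter style mentioned earlier would need a slightly different lookup. Here is a minimal sketch of that variant, assuming plain JavaBean naming (property bar is read through a public no-argument getBar()). GetterPathReader, DemoInner, and DemoOuter are names made up for this illustration, not part of any library or of the class above.

```java
import java.lang.reflect.Method;

/** Hypothetical helper (not from any library): follows a chain of getters via reflection. */
final class GetterPathReader {
    private GetterPathReader() {}

    /** Maps a property name to its assumed JavaBean getter name, e.g. "bar" -> "getBar". */
    static String getterName(String property) {
        return "get" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
    }

    /** Starts at root and calls one public no-argument getter per path element. */
    static Object read(Object root, String... propertyPath) {
        Object val = root;
        for (String property : propertyPath) {
            try {
                Method getter = val.getClass().getMethod(getterName(property));
                val = getter.invoke(val);
            } catch (ReflectiveOperationException e) {
                // Wrap checked reflection errors so callers get one unchecked failure type.
                throw new IllegalArgumentException("Cannot read property: " + property, e);
            }
        }
        return val;
    }
}

/** Made-up demo classes written in getter style. */
final class DemoInner {
    private final double value;
    DemoInner(double value) { this.value = value; }
    public double getValue() { return value; }
}

final class DemoOuter {
    private final DemoInner inner;
    DemoOuter(DemoInner inner) { this.inner = inner; }
    public DemoInner getInner() { return inner; }
}
```

With these demo classes, GetterPathReader.read(new DemoOuter(new DemoInner(42.0)), "inner", "value") walks getInner() and then getValue() and hands back the value boxed as a Double. The same inner loop could stand in for the getField() lookup in a class like the one above.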
Then your M-code wrapper class would examine the result and decide whether it was primitive-ish and should be returned as a Matlab array or comma-separated list (i.e. multiple argouts, which get collected with [...]), or should be wrapped in another JavaArrayWrapper M-code object.
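Some of that "primitive-ish" decision could also be made on the Java side before the result ever crosses back. A rough sketch, assuming you only care about the numeric case: if every collected value is a Number, pack them into a double[], which Matlab should hand back as an ordinary double array; otherwise fall back to Object[]. ResultPacker is a hypothetical name for this illustration, and real code would need more cases (strings, booleans, nested arrays).

```java
import java.util.List;

/** Hypothetical helper: picks a return array type that Matlab can convert cheaply. */
final class ResultPacker {
    private ResultPacker() {}

    /** Returns a double[] if every element is a non-null Number, otherwise an Object[]. */
    static Object pack(List<?> values) {
        boolean allNumeric = !values.isEmpty();
        for (Object v : values) {
            if (!(v instanceof Number)) {   // also false for null elements
                allNumeric = false;
                break;
            }
        }
        if (!allNumeric) {
            return values.toArray();        // mixed, non-numeric, or empty input: leave boxed
        }
        double[] out = new double[values.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = ((Number) values.get(i)).doubleValue();
        }
        return out;
    }
}
```

The design choice here is to make the decision once per call in Java rather than once per element in M-code, which fits the general theme of crossing the Matlab/Java boundary as few times as possible.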
The M-code wrapper class would look something like this.
classdef MyMJavaArrayWrapper < handle
    % Inherit from handle because Java objects are reference-y
    properties
        jWrappedArray  % holds a DynamicFieldAccessArrayWrapper
    end
    methods
        function varargout = subsref(obj, s)
            if isequal(s(1).type, '()')
                % subs is a cell array of subscripts; take the first (only) dimension
                indices = s(1).subs{1};
                s(1) = [];
            else
                indices = [];  % no () given: select every element
            end
            % TODO: check for unsupported indexing types in remaining s
            % parseFieldNamesFromArgs and unpackResultsAndConvertIfNeeded are
            % helpers you would still need to write
            fieldNameChain = parseFieldNamesFromArgs(s);
            out = getNestedFields(obj.jWrappedArray, indices, fieldNameChain);
            varargout = unpackResultsAndConvertIfNeeded(out);
        end
    end
end
The overhead involved in marshalling and unmarshalling the values for the subsref call would probably overwhelm any speed gain from the Java bits.
You could probably eliminate that overhead by replacing your M-code implementation of subsref with a MEX implementation that does the structure marshalling and unmarshalling in C, using JNI to build the Java objects, call getNestedFields, and convert the result to Matlab structures. This is way beyond what I could give an example for.
If this looks a bit horrifying to you, I totally agree. You're bumping up against the edges of the language here, and trying to extend the language (especially to provide new syntactic behavior) from userland is really hard. I wouldn't seriously do something like this in production code; just trying to outline the area of the problem you're looking around.
Are you dealing with homogeneous arrays of these deeply nested structures? Maybe it would be possible to convert them to "planar organized" structures, where instead of an array of structs with scalar fields, you have a scalar struct with array fields. Then you can do vectorized operations on them in pure M-code. This would make things a lot faster, especially with save and load, where the overhead scales per mxArray.