I'm using scipy ODR to perform an orthogonal linear regression. I have a matrix of shape (nlines,3) representing trajectory coordinates; that is, the columns are the x,y,z coordinates of a point moving in space.

The goal is to find the straight line that best approximates/fits the trajectory (similar to what was asked here). Hence, the output straight line has the same shape as the input: (nlines,3).

Problem: what Model should I use for the intended goal? I'm trying odr.multilinear but I get an error.

Following the example given in the documentation, with minor modifications:

import numpy as np
from scipy import odr

# traj_data is my 2D data matrix with x,y,z coordinates
# create the array for the independent variable
nsamples = np.arange(traj_data.shape[0])
nsamples_3d = np.column_stack((nsamples, nsamples, nsamples))

# Define the function you want to fit against:
linear = odr.multilinear   # is this the correct Model to use?

# Create a Data instance:
mydata = odr.Data(nsamples_3d, traj_data, wd=1, we=1)

# Instantiate ODR with your data, model and initial parameter estimate:
myodr = odr.ODR(mydata, linear, beta0=[1., 2.])

# Run the fit:
myoutput = myodr.run()

However, execution stops at

myodr = odr.ODR(mydata, linear, beta0=[1., 2.])

with the error

scipy.odr.odrpack.OdrError: fcn does not output [11700, 3]-shaped array

(where 11700 is nlines in the specific example I ran)
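
One possible reading of this message, assuming scipy.odr's documented convention that observations run along the last axis (x of shape (m, n), y of shape (n,) or (q, n)): a (11700, 3) array is taken as 11700 response components with 3 observations each, while odr.multilinear returns a single scalar per observation. The sketch below uses synthetic data (the names x, y and true_beta are illustrative and not from the original code) to show the layout that multilinear seems to accept:

```python
import numpy as np
from scipy import odr

rng = np.random.default_rng(0)
n = 50

# Assumed layout: x is (m, n) = 3 input variables, n observations.
x = rng.normal(size=(3, n))

# Synthetic, noise-free scalar response: y = b0 + b1*x1 + b2*x2 + b3*x3
true_beta = np.array([1.0, 2.0, -1.0, 0.5])
y = true_beta[0] + true_beta[1:] @ x   # shape (n,)

data = odr.Data(x, y)
# multilinear needs m + 1 parameters (intercept plus one slope per input).
fit = odr.ODR(data, odr.multilinear, beta0=[0.0, 1.0, 1.0, 1.0]).run()

# The model function returns one value per observation, not an (n, 3) array.
pred = odr.multilinear.fcn(fit.beta, x)
print(fit.beta)
print(pred.shape)
```

In this layout multilinear models one scalar y as a function of several inputs, which is a different problem from fitting a line through (x, y, z) points.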

I have no problem when using 1D data.

Am I doing something wrong?

Robyc
  • What is the shape of traj_data, that is, what is returned by "numpy.shape(traj_data)"? The output of odr's "linear" is [number_of_samples, 1], so the odr module will be looking for that shape in traj_data. The error message "fcn does not output [11700, 3]" indicates to me that the function you are using, odr's "linear", will output a result with the shape [11700, 1] so there is a shape mismatch. ...I think. – James Phillips Aug 29 '19 at 14:37
  • numpy.shape(traj_data) = [11700, 3]. What do you mean by "odr's linear"? The function I called linear is an instance of odr.multilinear – Robyc Aug 29 '19 at 14:54
  • I've just been playing with this method a bit, and it looks as though it will run for at least a day with that much data. It seems to take 10 times as long to run when I double the rows, and 800 rows takes ~20 seconds, so 20*10**4 = ~2 days – Sam Mason Aug 29 '19 at 15:21
  • I appreciate the comment, but it doesn't seem to help solve the issue I presented. Does it? :) Anyway, if I apply the ODR to single-column data it takes less than a second with 11700 lines. Where is your 10**4 coming from? – Robyc Aug 29 '19 at 16:17
  • According to the source code for odr's multilinear at https://github.com/scipy/scipy/blob/v1.3.0/scipy/odr/models.py the output of multilinear (the "fcn" in the error message) in this case will be [11700, 1] - a single vector of "y" values - which it cannot compare with your traj_data of shape [11700, 3]. traj_data should be a single vector of expected values, that is, of shape [11700, 1]. – James Phillips Aug 29 '19 at 16:36
  • So maybe multilinear is not correct? Then the question is: what (linear) function should I use so that ODR performs the operation I need? For 1D data I'd just use linear=odr.linear – Robyc Aug 29 '19 at 16:42
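
For the goal stated in the question (a straight line minimizing orthogonal distances to 3D points, returned with the same (nlines, 3) shape), one common alternative that does not use scipy.odr is a total least squares fit via the singular value decomposition: the line passes through the centroid and follows the first principal direction of the centered data. This is a minimal sketch under those assumptions, not a scipy.odr solution; fit_line_3d is a hypothetical helper and traj_data here is synthetic.

```python
import numpy as np

def fit_line_3d(points):
    """Orthogonal least-squares line through an (n, 3) point cloud.

    Hypothetical helper: returns the centroid, a unit direction vector,
    and the orthogonal projection of every point onto the fitted line.
    """
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    centered = points - centroid
    # The first right singular vector is the direction of largest variance,
    # which minimizes the sum of squared orthogonal distances to the line.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    # Signed position of each sample along the line, then back to 3D.
    t = centered @ direction
    fitted = centroid + np.outer(t, direction)
    return centroid, direction, fitted

# Synthetic trajectory: points along a known line plus small noise.
rng = np.random.default_rng(0)
s = np.linspace(0.0, 10.0, 200)
true_dir = np.array([1.0, 2.0, -1.0]) / np.sqrt(6.0)
traj_data = np.array([0.5, -1.0, 2.0]) + np.outer(s, true_dir)
traj_data += rng.normal(scale=0.01, size=traj_data.shape)

centroid, direction, fitted = fit_line_3d(traj_data)
print(fitted.shape)   # same shape as traj_data
print(direction)      # parallel to true_dir, up to sign
```

The sign of direction is arbitrary, so compare it with a reference direction through the absolute value of their dot product. All coordinates are weighted equally here; per-axis weights like the wd and we arguments in the question would need a different formulation.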