I have a numpy structured array that holds all my data:

import numpy as np

temp = np.array([(0,0,0,0,0),(0,0,0,0,0),(0,0,0,0,0),
                 (0,0,0,0,0),(0,0,0,0,0)],
                dtype=[('x_pos','f4'),('y_pos','f4'),('z_pos','f4'),
                       ('x_exp','f4'),('y_exp','f4')])

Each such array is filled and then merged with the others like this:

if use_event:
    if len(points) > 0:
        points = np.vstack((points,temp))
    else:
        points = temp

This works well; I end up with an array of 5 different points in (x,y,z,0,0). What I would like to do is perform a linear least-squares fit with each combination of 4 of these points, so I can eventually find the expected value of each point.

Following the documentation I suspect I can use the linear least squares function by doing:

x_array = points['x_pos']
y_array = points['y_pos']
A = np.vstack([x_array, np.ones(len(x_array))]).T
m, c = np.linalg.lstsq(A, y_array)[0]

Though I might be forced to throw an iterator in there somewhere.
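
As a minimal sketch of that fit for a single set of five points: the sample values below are made up (they lie exactly on y = 2x + 1) and are not taken from my data, and `rcond=None` is an addition that selects the machine-precision cutoff on recent NumPy versions.

```python
import numpy as np

# Made-up sample: five points lying exactly on y = 2x + 1 (not the real data).
x_array = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_array = 2.0 * x_array + 1.0

# Design matrix with columns [x, 1], so the solution vector is (slope, intercept).
A = np.vstack([x_array, np.ones(len(x_array))]).T

# lstsq returns (solution, residuals, rank, singular values); keep the solution.
m, c = np.linalg.lstsq(A, y_array, rcond=None)[0]
print(m, c)
```

With 1D inputs like these no iterator is needed; with the 2D `points['x_pos']` shown further down, each row would have to be fitted separately.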

But this is for all five points rather than four of the five. Before I used numpy I was using a deepcopy of large python dictionaries and "pop"-ing the element I didn't want, but this is slow and was eating up tons of memory as I iterated repeatedly. Does anyone have a clear suggestion on how I can do this here?

A short clip of my data looks like this:

points:

[[(-76.3629379272461, 13.817119598388672, 15062.244140625, 0.0, 0.0) 
 (-77.6500473022461, 11.30630874633789, 14861.1728515625, 0.0, 0.0)
 (-82.07966613769531, 4.440931797027588, 14612.07421875, 0.0, 0.0)
 (-90.221435546875, -4.660646915435791, 14312.16796875, 0.0, 0.0)
 (-101.93490600585938, -19.41192054748535, 13962.134765625, 0.0, 0.0)]
[(28.65045738220215, -38.633392333984375, 15062.1591796875, 0.0, 0.0)
 (14.097150802612305, -36.76347732543945, 14861.2138671875, 0.0, 0.0)
 (-9.500401496887207, -40.40631866455078, 14612.1259765625, 0.0, 0.0)
 (-35.776180267333984, -44.91004180908203, 14312.1669921875, 0.0, 0.0)
 (-69.49923706054688, -51.34267044067383, 13962.0263671875, 0.0, 0.0)]
[(25.63878059387207, 47.59636306762695, 15062.5927734375, 0.0, 0.0)
 (27.440467834472656, 43.73832702636719, 14861.8408203125, 0.0, 0.0)
 (25.880281448364258, 32.709007263183594, 14613.0322265625, 0.0, 0.0)
 (21.14924430847168, 22.327594757080078, 14313.4365234375, 0.0, 0.0)
 (17.18960952758789, 10.372784614562988, 13963.8125, 0.0, 0.0)] ....

and points['x_pos']:

[[ -76.36293793  -77.6500473   -82.07966614  -90.22143555 -101.93490601]
 [  28.65045738   14.0971508    -9.5004015   -35.77618027  -69.49923706]
 [  25.63878059   27.44046783   25.88028145   21.14924431   17.18960953] ...
LaPriWa
Chris
  • Since all the fields have the same `dtype`, you should be able to view this as a 2D array. Try `temp.view(float,5)`. http://stackoverflow.com/questions/36485619/how-do-i-load-heterogeneous-data-np-genfromtxt-as-a-2d-array – hpaulj Apr 20 '16 at 16:47
  • Actually the answer was really simple: `delete` is what I wanted! The second example from the [documentation](http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.delete.html) shows how to delete a column. The third argument is the axis, and in a 2D array axis 1 runs along the columns. I kind of wish I had done more research before asking, but maybe it will help someone else... – Chris Apr 20 '16 at 18:00
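
The 2D view suggested in the first comment can be sketched roughly as follows. This is only an illustration under assumptions: it uses `numpy.lib.recfunctions.structured_to_unstructured` (available in NumPy 1.16 and later) and a `view('f4')` plus `reshape`, rather than the exact call quoted in the comment, and the x values are made up.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Same field layout as the question's structured array.
dtype = [('x_pos', 'f4'), ('y_pos', 'f4'), ('z_pos', 'f4'),
         ('x_exp', 'f4'), ('y_exp', 'f4')]
temp = np.zeros(5, dtype=dtype)
temp['x_pos'] = [0, 1, 2, 3, 4]  # made-up values, for illustration only

# Option 1: convert the five fields into a plain (5, 5) float32 array.
plain = rfn.structured_to_unstructured(temp)

# Option 2: reinterpret the same memory as float32, one row per record.
# This relies on every field having the same dtype ('f4').
flat_view = temp.view('f4').reshape(len(temp), -1)

print(plain.shape, flat_view.shape)
```

Either way the result can be sliced by column index instead of by field name.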

1 Answer

Well, I figured it out. I need to use the delete command. The third argument of delete specifies the axis along which to delete; in a 2D array, axis 1 removes a column. Note that delete returns a new array and leaves the original untouched. So I could do:

for cut_point in range(5):
    # Drop column cut_point, leaving the other four points of each row
    x_trim = np.delete(x_array, cut_point, 1)
    y_trim = np.delete(y_array, cut_point, 1)

    # Do stuff
    # .....
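
For anyone who wants the whole loop, here is a rough sketch of how the delete step could be combined with the lstsq fit from the question. The data are hypothetical (two rows of five points lying exactly on straight lines), and the names `expected_y`, `n_tracks` and `track` are introduced only for this illustration; `rcond=None` is also an addition for recent NumPy versions.

```python
import numpy as np

# Hypothetical data: 2 rows x 5 points, shaped like points['x_pos'] / points['y_pos'].
x_array = np.array([[0.0, 1.0, 2.0, 3.0, 4.0],
                    [1.0, 2.0, 3.0, 4.0, 5.0]])
y_array = np.array([[1.0, 3.0, 5.0, 7.0, 9.0],     # y = 2x + 1
                    [2.0, 1.0, 0.0, -1.0, -2.0]])  # y = -x + 3

n_tracks, n_points = x_array.shape
expected_y = np.empty_like(y_array)

for cut_point in range(n_points):
    # np.delete returns copies without column cut_point; the inputs are unchanged.
    x_trim = np.delete(x_array, cut_point, axis=1)
    y_trim = np.delete(y_array, cut_point, axis=1)

    for track in range(n_tracks):
        # Fit a line to the four remaining points of this row.
        A = np.vstack([x_trim[track], np.ones(n_points - 1)]).T
        m, c = np.linalg.lstsq(A, y_trim[track], rcond=None)[0]
        # Expected y of the left-out point, predicted from the other four.
        expected_y[track, cut_point] = m * x_array[track, cut_point] + c

print(expected_y)
```

Because the made-up points are exactly collinear, each predicted value matches the original y; with real data the difference between the two is the residual of the left-out point.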

I really wish I had done a little more research before posting this; it turns out it was a pretty simple solution. Maybe someone can benefit from it in the future, though...

Chris