
I'm tasked with finding the mean center of a feature class using a numpy array. I have created a numpy array from the feature class using

import arcpy
import numpy
fc = "polygons.shp"
a = arcpy.da.FeatureClassToNumPyArray(fc, ["SHAPE@X", "SHAPE@Y"])

The array, a, is then:

array([( 3107178.29076947,  10151024.31186805),
       ( 3107961.30479125,  10139810.52458512),
       ( 3109603.8882401 ,  10119654.26424824),
       ( 2992362.40598316,  10049723.50515586),
       ....
       ( 3114517.82381449,  10071634.68261757)],
       dtype=[('SHAPE@X', '<f8'), ('SHAPE@Y', '<f8')])

Each record is the centroid (X, Y) of one feature in fc. How do I get the mean X and mean Y of these, so that the output is ([(mean.X, mean.Y)])? I've tried the following, as described here:

numpy.mean(a, axis=0)

but I get the mean of just the X values. Is there an additional step, such as changing the dtype after the arcpy.da call, needed to get both mean.X and mean.Y? I have to do this using the numpy mean function. Thanks!

Alex R

3 Answers

np.mean(a.view((float, len(a.dtype.names))), axis=0)
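A minimal sketch of why the view works, assuming every field in the array is a 64-bit float (as with SHAPE@X and SHAPE@Y). The array below is a hand-built stand-in for the arcpy.da.FeatureClassToNumPyArray output, using the first three records from the question; the names xy, mean_xy, and result are illustrative only.

```python
import numpy as np

# Stand-in for the array returned by arcpy.da.FeatureClassToNumPyArray
# (assumption: same dtype as in the question, only three records).
a = np.array(
    [(3107178.29076947, 10151024.31186805),
     (3107961.30479125, 10139810.52458512),
     (3109603.8882401, 10119654.26424824)],
    dtype=[("SHAPE@X", "<f8"), ("SHAPE@Y", "<f8")],
)

# Reinterpret each two-field record as a row of two plain floats.
# This only makes sense when all fields share the same float dtype.
xy = a.view((float, len(a.dtype.names)))  # shape (n, 2)

# Column-wise mean: one value for X, one for Y.
mean_xy = np.mean(xy, axis=0)

# Package the result in the [(meanX, meanY)] form asked for in the question.
result = [tuple(mean_xy)]
```

With a plain 2-D float array, axis=0 averages down each column, which is what the structured array could not do directly.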
AGN Gazer
meanxy=[np.mean(y) for y in zip(*a)]

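Here * unpacks the records of a as separate positional arguments to zip, which regroups them into one tuple of all X values and one tuple of all Y values.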
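The intermediate step can be sketched as follows, with a small made-up array standing in for the real one (the values and the names columns and col are illustrative assumptions, not data from the question):

```python
import numpy as np

# Small made-up structured array with the same field layout as the question.
a = np.array(
    [(1.0, 10.0), (2.0, 20.0), (3.0, 60.0)],
    dtype=[("SHAPE@X", "<f8"), ("SHAPE@Y", "<f8")],
)

# zip(*a) transposes the records: the first tuple holds every X,
# the second tuple holds every Y.
columns = list(zip(*a))

# Take the mean of each transposed column.
meanxy = [np.mean(col) for col in columns]
```

This loops over every record in Python, so on large feature classes it will likely be slower than an approach that stays inside numpy.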


This is perhaps a bit of overkill, but structured arrays and recarrays that come from feature classes often mix data types, so some caution is needed. A mix of integer, float, and string fields will cause errors if every field is blindly cast to float. It is safer to run the calculation on the fields you want individually, or on all fields of one particular dtype at once. Consider a feature class with simply the following field names:

a.dtype.names = ('ID', 'X', 'Y', 'Z')

The average 'ID' is pretty useless. The average of the 3D coordinates, however, may be useful. To get the averages of just those coordinates, you can compute them one field at a time.

    a['X'].mean(), a['Y'].mean(), a['Z'].mean()
    (74047.105809675646, -3466195.1836807081, 418.45351408062925)

or in a batch, for however many float fields there are

    [a[i].mean() for i in a.dtype.names if a[i].dtype.kind == 'f']

yielding the same values as a list

 [74047.105809675646, -3466195.1836807081, 418.45351408062925]

and to ensure you remember what value is what...

    [(i, a[i].mean()) for i in a.dtype.names if a[i].dtype.kind == 'f']

[('X', 74047.105809675646),
 ('Y', -3466195.1836807081),
 ('Z', 418.45351408062925)]
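As a self-contained sketch of the same filtering idea, the example below builds a small mixed-type array by hand. The field names match the ones above, but the values and the name float_means are made up for illustration.

```python
import numpy as np

# Made-up structured array mixing an integer ID with float coordinates.
a = np.array(
    [(1, 0.0, 10.0, 100.0),
     (2, 2.0, 20.0, 200.0),
     (3, 4.0, 30.0, 300.0)],
    dtype=[("ID", "<i4"), ("X", "<f8"), ("Y", "<f8"), ("Z", "<f8")],
)

# dtype.kind is a single character: 'f' for floating point, 'i' for
# signed integer, and so on. Keep only the float fields, keyed by name.
float_means = {
    name: a[name].mean()
    for name in a.dtype.names
    if a[name].dtype.kind == "f"
}
```

Keying the result by field name keeps each mean tied to its field, and the integer ID field is skipped without any casting.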
NaN