
I have asked a similar question here.
That question was about resampling TIFF data; here I want to find a more universal way to deal with it.

My question

For example:

  • A 2-D numpy array with shape 200x150 representing population density data at 1 km x 1 km resolution.

http://i4.tietuku.com/060fe38e9ccf6daf.png

  • My target: change the spatial resolution to 5 km x 5 km.

Here is an example picture for randomly distributed data clustered onto a grid network: http://i4.tietuku.com/a4c1355413f8cdbc.png
* the red points: original data
* the blue dots: grid network representing the 2-D array
* the green circles: find the nearest blue dot for each red point and sum the values there
* In this question, the difference is that the original data is a 2-D numpy array too.
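The nearest-node aggregation described above can be sketched with SciPy's `cKDTree` plus `np.bincount` (the scattered data and grid here are synthetic, purely for illustration):

```python
import numpy as np
from scipy.spatial import cKDTree

# Synthetic scattered data: (x, y) positions with a value at each point
rng = np.random.default_rng(0)
points = rng.random((100, 2)) * 10          # the "red points"
values = rng.random(100)                    # e.g. population per point

# A regular 5x5 grid of nodes (the "blue dots")
gx, gy = np.meshgrid(np.linspace(0, 10, 5), np.linspace(0, 10, 5))
nodes = np.column_stack([gx.ravel(), gy.ravel()])

# For each scattered point, find its nearest grid node,
# then sum the values that land on each node
tree = cKDTree(nodes)
_, idx = tree.query(points)
grid_sum = np.bincount(idx, weights=values, minlength=len(nodes))
grid_sum = grid_sum.reshape(gx.shape)
```

Because every point is assigned to exactly one node, the total of `grid_sum` equals the total of `values`.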

My solution

  • This is similar to another question of mine here, in which I clustered 2-D scatter points onto the nearest grid points. I appreciate the answer from @HYRY, which improved my code a lot.

  • In that question, I used a KD-tree algorithm to find the nearest network node for each original data point. The result is shown here:
    http://i4.tietuku.com/1a420c48ed7bcb1c.png

  • I think there must be an easier way to resample a structured 2-D numpy array than to treat it as random 2-D scatter points.

Add 2016-01-09

Thanks for the answer from @Praveen.
I also tried another method using scipy's 2-D interpolation function.

This is my code:

 import numpy as np
 from scipy import interpolate
 import matplotlib.pyplot as plt

 # coordinates of the original (fine) grid
 xi = np.linspace(x_map1, x_map2, pop.shape[1])
 yi = np.linspace(y_map1, y_map2, pop.shape[0])
 hfunc = interpolate.interp2d(xi, yi, pop)

 # coordinates of the coarser target grid
 x_grid = np.linspace(x_map1, x_map2, new_shape_x)
 y_grid = np.linspace(y_map1, y_map2, new_shape_y)

 new_pop = np.zeros(new_shape_x * new_shape_y)
 t = 0
 for i in range(new_shape_y):
     for j in range(new_shape_x):  # was range(new_shape_y), a bug for non-square grids
         new_pop[t] = hfunc(x_grid[j], y_grid[i])
         t += 1
 new_pop = new_pop.reshape(new_shape_y, new_shape_x)
 plt.pcolormesh(new_pop)

The result shows like:
http://i4.tietuku.com/b020db6dc2d75d70.png

  • Are there any problems with using interpolation to coarsen the data?
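One concrete caveat worth noting (my own observation, not from the original post): interpolation samples point values, so unlike block summing it does not conserve the total population. A toy comparison:

```python
import numpy as np

# Toy 4x4 "density" array, coarsened by a factor of 2
pop = np.arange(16, dtype=float).reshape(4, 4)

# Block-sum coarsening conserves the total
block_sum = pop.reshape(2, 2, 2, 2).sum(axis=(1, 3))
print(block_sum.sum() == pop.sum())    # True

# Point sampling (which is what interpolation amounts to when the new
# grid points coincide with old grid nodes)
sampled = pop[::2, ::2]
print(sampled.sum() == pop.sum())      # False: the total is not conserved
```

So for a density map where totals matter, aggregation (summing or area-weighted averaging) is usually the safer choice than interpolation.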

Add 2

Is there a useful function that can sample data from the original array by location (x, y)?
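One option worth considering (a suggestion on my part, with illustrative coordinates) is `scipy.interpolate.RegularGridInterpolator`, which samples a gridded array at arbitrary (y, x) locations:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Illustrative array indexed by y (rows) and x (columns)
data = np.arange(12, dtype=float).reshape(3, 4)
y = np.linspace(0.0, 2.0, 3)
x = np.linspace(0.0, 3.0, 4)

sampler = RegularGridInterpolator((y, x), data)

# Sample at arbitrary (y, x) locations inside the grid
samples = sampler([(0.0, 0.0), (1.0, 2.0)])
# At the grid nodes themselves, the samples equal the array values
```

Points that fall between nodes are linearly interpolated by default.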


1 Answer


If I understood you correctly, you have a very fine population density map, which you're trying to make coarse, by aggregating population densities within every 5x5 pixel zone. Is that right?

So when you say you're trying to make 1km x 1km into 5km x 5km you mean that each pixel currently represents population in a 1km x 1km region, whereas you want to make it represent population in a 5km x 5km region.

If so, please don't use clustering and KD-trees! That would be a horribly inefficient way to do something far simpler.

This is probably what you want. To explain:

import numpy as np

# Suppose the 2D array is pop_density
coarseness = 5
temp = pop_density.reshape((pop_density.shape[0] // coarseness, coarseness,
                            pop_density.shape[1] // coarseness, coarseness))
coarse_pop_density = np.sum(temp, axis=(1, 3))

As stated in the other answer, this will only work if the shape of pop_density is an exact multiple of coarseness. I believe this is the case for you, since you say you have a 200x150 image, which you're trying to make coarse by a factor of 5.
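To see the reshape-and-sum at work on a small array (the values here are purely illustrative):

```python
import numpy as np

pop_density = np.arange(36, dtype=float).reshape(6, 6)
coarseness = 3
temp = pop_density.reshape((6 // coarseness, coarseness,
                            6 // coarseness, coarseness))
coarse = temp.sum(axis=(1, 3))
# coarse is 2x2; each entry is the sum of one 3x3 block,
# so the grand total is preserved
```

The reshape splits each axis into (block index, within-block offset), and summing over the two within-block axes collapses every block to a single number.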

For images that are not a multiple of the coarseness factor

# Suppose the size of pop_density was 198x147 instead of 200x150.
# Start by finding the next highest multiple of 5x5
shape = np.array(pop_density.shape)
new_shape = (coarseness * np.ceil(shape / coarseness)).astype(int)
# new_shape is now (200, 150)

# Create the zero-padded array and assign it with the old density
zp_pop_density = np.zeros(new_shape)
zp_pop_density[:shape[0], :shape[1]] = pop_density

# Now use the same method as before
temp = zp_pop_density.reshape((new_shape[0] // coarseness, coarseness,
                               new_shape[1] // coarseness, coarseness))
coarse_pop_density = np.sum(temp, axis=(1, 3))
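A quick sanity check of the zero-padded variant on a small array whose shape is not a multiple of the factor (illustrative data only):

```python
import numpy as np

pop_density = np.ones((7, 5))            # 7x5: not a multiple of 3
coarseness = 3
shape = np.array(pop_density.shape)
new_shape = (coarseness * np.ceil(shape / coarseness)).astype(int)  # (9, 6)

zp = np.zeros(new_shape)
zp[:shape[0], :shape[1]] = pop_density

coarse = zp.reshape((new_shape[0] // coarseness, coarseness,
                     new_shape[1] // coarseness, coarseness)).sum(axis=(1, 3))
# coarse is 3x2, and the padded zeros do not change the total
```

Note that padding with zeros is safe for a sum, but would bias an average over the edge blocks, as discussed in the comments below.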
  • Thanks! So this method can't be used when the original shape is not evenly divisible by the coarseness factor? – Han Zhengzu Jan 09 '16 at 05:47
  • 1
    You wouldn't be able to do it in such a neat way. You would have to handle the edge-cases separately. I think I would probably do it by zero-padding. I'll amend the answer in a minute. – Praveen Jan 09 '16 at 05:48
  • @AlecMcGail I think that if the original shape is divisible by the coarseness factor, then you could take `np.mean` instead of `np.sum` and you should be fine (i.e., assuming you are averaging by area). If not, you would have to do something different for the edge cases, since zero-padding will affect the average. – Praveen Nov 08 '20 at 20:48
  • Yup exactly. Amazed to see you after four years haha! – Alec McGail Nov 09 '20 at 21:17