Edit: Original question was flawed but I am leaving it here for reasons of transparency.

Original: I have some x, y, z data where x and y are coordinates of a 2D grid and z is a scalar value corresponding to (x, y).

>>> import numpy as np
>>> # Dummy example data 
>>> x = np.arange(0.0, 5.0, 0.5)
>>> y = np.arange(1.0, 2.0, 0.1)
>>> z = np.sin(x)**2 + np.cos(y)**2
>>> print "x = ", x, "\n", "y = ", y, "\n", "z = ", z
x =  [ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5] 
y =  [ 1.   1.1  1.2  1.3  1.4  1.5  1.6  1.7  1.8  1.9] 
z =  [ 0.29192658  0.43559829  0.83937656  1.06655187  0.85571064  0.36317266
  0.02076747  0.13964978  0.62437081  1.06008127]

Using xx, yy = np.meshgrid(x, y) I can get two grids containing x and y values corresponding to each grid position.

>>> xx, yy = np.meshgrid(x, y)
>>> print xx
[[ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5]
 [ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5]
 [ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5]
 [ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5]
 [ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5]
 [ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5]
 [ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5]
 [ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5]
 [ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5]
 [ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5]]
>>> print yy
[[ 1.   1.   1.   1.   1.   1.   1.   1.   1.   1. ]
 [ 1.1  1.1  1.1  1.1  1.1  1.1  1.1  1.1  1.1  1.1]
 [ 1.2  1.2  1.2  1.2  1.2  1.2  1.2  1.2  1.2  1.2]
 [ 1.3  1.3  1.3  1.3  1.3  1.3  1.3  1.3  1.3  1.3]
 [ 1.4  1.4  1.4  1.4  1.4  1.4  1.4  1.4  1.4  1.4]
 [ 1.5  1.5  1.5  1.5  1.5  1.5  1.5  1.5  1.5  1.5]
 [ 1.6  1.6  1.6  1.6  1.6  1.6  1.6  1.6  1.6  1.6]
 [ 1.7  1.7  1.7  1.7  1.7  1.7  1.7  1.7  1.7  1.7]
 [ 1.8  1.8  1.8  1.8  1.8  1.8  1.8  1.8  1.8  1.8]
 [ 1.9  1.9  1.9  1.9  1.9  1.9  1.9  1.9  1.9  1.9]]

Now I want an array of the same shape for z, where the grid values correspond to the matching x and y values in the original data! But I cannot find an elegant, built-in solution where I do not need to re-grid the data, and I think I am missing some understanding of how I should approach it.

I have tried following this solution (with my real data, not this simple example data, but it should have the same result) but my final grid was not fully populated. Please help!
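
For the dummy data above, where z comes from a formula, a commenter's suggestion can be sketched as follows. This is only an illustration under that assumption; it does not apply when z is read from a file. It also shows that the 1D `z` computed above only covers the diagonal of the full grid, which is why it cannot populate a 10 x 10 array.

```python
import numpy as np

x = np.arange(0.0, 5.0, 0.5)   # 10 x coordinates
y = np.arange(1.0, 2.0, 0.1)   # 10 y coordinates

# Evaluate the formula on the full grid: one row per y, one column per x.
xx, yy = np.meshgrid(x, y)
zz = np.sin(xx)**2 + np.cos(yy)**2

# Same result via broadcasting, without building xx and yy.
zz_broadcast = np.sin(x)**2 + np.cos(y.reshape(-1, 1))**2

# The 1D z from the dummy example pairs x[i] with y[i] only,
# so it matches just the diagonal of the 2D grid.
z_1d = np.sin(x)**2 + np.cos(y)**2

print(zz.shape)
print(np.allclose(zz, zz_broadcast))
print(np.allclose(np.diag(zz), z_1d))
```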

Corrected question:

As was pointed out by commenters, my original dummy data was unsuitable for the question I am asking. Here is an improved version of the question:

I have some x, y, z data where x and y are coordinates of a 2D grid and z is a scalar value corresponding to (x, y). The data is read from a text file "data.txt":

#x y z
1.4 0.2 1.93164166734
1.4 0.3 1.88377897779
1.4 0.4 1.81946452501
1.6 0.2 1.9596778849
1.6 0.3 1.91181519535
1.6 0.4 1.84750074257
1.8 0.2 1.90890970517
1.8 0.3 1.86104701562
1.8 0.4 1.79673256284
2.0 0.2 1.78735230743
2.0 0.3 1.73948961789
2.0 0.4 1.67517516511

Loading the text:

>>> import numpy as np
>>> inFile = r'C:\data.txt'  # raw string so the backslash is taken literally
>>> x, y, z = np.loadtxt(inFile, unpack=True, usecols=(0, 1, 2), comments='#', dtype=float)
>>> print x
[ 1.4  1.4  1.4  1.6  1.6  1.6  1.8  1.8  1.8  2.   2.   2. ]
>>> print y
[ 0.2  0.3  0.4  0.2  0.3  0.4  0.2  0.3  0.4  0.2  0.3  0.4]
>>> print z
[ 1.93164167  1.88377898  1.81946453  1.95967788  1.9118152   1.84750074
  1.90890971  1.86104702  1.79673256  1.78735231  1.73948962  1.67517517]

Using xx, yy = np.meshgrid(np.unique(x), np.unique(y)) I can get two grids containing x and y values corresponding to each grid position.

>>> xx, yy = np.meshgrid(np.unique(x), np.unique(y))
>>> print xx
[[ 1.4  1.6  1.8  2. ]
 [ 1.4  1.6  1.8  2. ]
 [ 1.4  1.6  1.8  2. ]]
>>> print yy
[[ 0.2  0.2  0.2  0.2]
 [ 0.3  0.3  0.3  0.3]
 [ 0.4  0.4  0.4  0.4]]

Now each cell position in xx and yy corresponds to one of the original grid point locations. I simply need an equivalent array whose values are the matching z values from the original data!

"""e.g. 
[[ 1.93164166734  1.9596778849  1.90890970517  1.78735230743]
 [ 1.88377897779  1.91181519535  1.86104701562  1.73948961789]
 [ 1.81946452501  1.84750074257  1.79673256284  1.67517516511]]"""

But I cannot find an elegant, built-in solution where I do not need to re-grid the data, and I think I am missing some understanding of how I should approach it. For example, using xx, yy, zz = np.meshgrid(x, y, z) returns three 3D arrays that I don't think I can use.
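
A small sketch of why the three-argument call is not useful here: `np.meshgrid` treats every input as its own axis, so z becomes a third coordinate rather than a value attached to (x, y). The z values below are placeholders, not the real data.

```python
import numpy as np

x = np.repeat([1.4, 1.6, 1.8, 2.0], 3)   # 12 x values, ordered as in data.txt
y = np.tile([0.2, 0.3, 0.4], 4)          # 12 y values
z = np.arange(12, dtype=float)           # placeholder z values (assumption)

# Each input spans one axis of the output, giving 12 x 12 x 12 arrays.
xx, yy, zz = np.meshgrid(x, y, z)
print(xx.shape, yy.shape, zz.shape)
```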

Please help!

Edit: I managed to make this example work thanks to the solution from Jaime: Fill 2D numpy array from three 1D numpy arrays

>>> x_vals, x_idx = np.unique(x, return_inverse=True)
>>> y_vals, y_idx = np.unique(y, return_inverse=True)
>>> vals_array = np.empty(x_vals.shape + y_vals.shape)
>>> vals_array.fill(np.nan) # or whatever your desired missing data flag is
>>> vals_array[x_idx, y_idx] = z
>>> zz = vals_array.T
>>> print zz
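
The same steps can be wrapped in a small function, shown here as a self-contained sketch. The helper name `xyz_to_grid` and its `fill` parameter are hypothetical, chosen for illustration; the sample values are copied from the data.txt listing above. It writes rows by y and columns by x directly, which avoids the final transpose.

```python
import numpy as np

def xyz_to_grid(x, y, z, fill=np.nan):
    """Scatter 1D x, y, z samples onto the grid of unique x and y values.

    Hypothetical helper: the name and the `fill` parameter are
    illustrative, not part of NumPy.
    """
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)
    grid = np.full((y_vals.size, x_vals.size), fill, dtype=float)
    grid[y_idx, x_idx] = z   # rows follow y, columns follow x
    return x_vals, y_vals, grid

# Values copied from the data.txt listing.
x = np.repeat([1.4, 1.6, 1.8, 2.0], 3)
y = np.tile([0.2, 0.3, 0.4], 4)
z = np.array([1.93164166734, 1.88377897779, 1.81946452501,
              1.9596778849, 1.91181519535, 1.84750074257,
              1.90890970517, 1.86104701562, 1.79673256284,
              1.78735230743, 1.73948961789, 1.67517516511])

x_vals, y_vals, zz = xyz_to_grid(x, y, z)
print(zz)

# Dropping one sample leaves that cell at the fill value.
_, _, zz_missing = xyz_to_grid(x[1:], y[1:], z[1:])
print(int(np.isnan(zz_missing).sum()))
```

Any cell that stays at the fill value marks an (x, y) pair that is absent from the input, which makes a partly populated grid easy to diagnose.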

But the code (with real input data) that led me down this path was still failing. I have now found the problem: I had been using scipy.ndimage.zoom to resample my gridded data to a higher resolution before generating zz.

>>> import scipy.ndimage
>>> zoom = 2
>>> x = scipy.ndimage.zoom(x, zoom)
>>> y = scipy.ndimage.zoom(y, zoom)
>>> z = scipy.ndimage.zoom(z, zoom)

This produced an array containing many nan entries:

array([[ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       ..., 
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan]])

When I skip the zoom stage, the correct array is produced:

array([[-22365.93400183, -22092.31794674, -22074.21420168, ...,
        -14513.89091599, -12311.97437017, -12088.07062786],
       [-29264.34039242, -28775.79743097, -29021.31886353, ...,
        -21354.6799064 , -21150.76555669, -21046.41225097],
       [-39792.93758344, -39253.50249278, -38859.2562673 , ...,
        -24253.36838785, -25714.71895023, -29237.74277727],
       ..., 
       [ 44829.24733543,  44779.37084337,  44770.32987311, ...,
         21041.42652441,  20777.00408692,  20512.58162671],
       [ 44067.26616067,  44054.5398901 ,  44007.62587598, ...,
         21415.90416488,  21151.48168444,  20887.05918082],
       [ 43265.35371973,  43332.5983711 ,  43332.21743471, ...,
         21780.32283309,  21529.39770759,  21278.47255848]])
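
One likely explanation, offered as an assumption rather than a confirmed diagnosis: zooming the 1D x and y columns interpolates between neighbouring entries, so the zoomed coordinates no longer fall on a few repeated values, and `np.unique` then builds a much larger grid than there are samples to fill it. The sketch below uses placeholder z values and a hypothetical `to_grid` helper to compare zooming before gridding with zooming the 2D array afterwards.

```python
import numpy as np
import scipy.ndimage

def to_grid(x, y, z):
    """Place z on the grid of unique x and y values; unfilled cells stay nan.

    Hypothetical helper used only for this comparison.
    """
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)
    grid = np.full((y_vals.size, x_vals.size), np.nan)
    grid[y_idx, x_idx] = z
    return grid

x = np.repeat([1.4, 1.6, 1.8, 2.0], 3)
y = np.tile([0.2, 0.3, 0.4], 4)
z = np.sin(x)**2 + np.cos(y)**2   # placeholder values, not the real data

# Zooming the 1D columns first: coordinates get interpolated,
# so the grid of unique values has far more cells than samples.
x2, y2, z2 = [scipy.ndimage.zoom(a, 2) for a in (x, y, z)]
sparse = to_grid(x2, y2, z2)
print(sparse.shape, int(np.isnan(sparse).sum()))

# Gridding first and zooming the 2D array afterwards keeps it dense.
dense = scipy.ndimage.zoom(to_grid(x, y, z), 2)
print(dense.shape, bool(np.isnan(dense).any()))
```

If coordinates are needed for the zoomed grid, one option would be to build them separately, for example with `np.linspace` between the smallest and largest unique x and y values.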
feedMe
  • Write exactly what must be the expected output for the given sample data; could you add it to the question? – Divakar Nov 20 '15 at 11:27
  • I *think* what you want is simply `z = np.sin(xx)**2 + np.cos(yy)**2`. Because `xx` and `yy` are 2D, `z` will also be 2D. If you are trying to achieve that without creating the 2D arrays `xx` and `yy`, use *broadcasting*: `z = np.sin(x)**2 + np.cos(y.reshape(-1, 1))**2`. – Warren Weckesser Nov 20 '15 at 12:14
  • @WarrenWeckesser Thanks for the suggestion but that would only work if the z data is actually calculated from an equation. My z data is obtained from an xyz text file, so in this case it will already be in a 1-D array. – feedMe Nov 20 '15 at 12:36
  • @feedMe Then what you're asking for doesn't really make sense - if `x` and `y` are the unique column and row values for an `(ny, nx)` 2D grid, then there must be at least `nx * ny` `z`-values. Your example `z`-values aren't actually sampled on a regular 2D grid, so there is no way to generate a "fully populated" grid from them without doing some sort of interpolation. – ali_m Nov 20 '15 at 13:08
  • @WarrenWeckesser Yes I realized my question was flawed and am editing... sorry.... – feedMe Nov 20 '15 at 13:47
  • @Divakar thanks for the suggestion. I modified my question and this time included some expected output. – feedMe Nov 20 '15 at 14:37
  • @feedMe Thanks, that's much clearer. – Warren Weckesser Nov 20 '15 at 14:50
  • If your `x` and `y` arrays already contain all of the locations in your grid then there's no point in finding the unique `x` and `y` values and generating another 2D grid from them using `np.meshgrid`. Instead, you just want to sort your `z`-values by ascending `y` and `x`, then reshape them into a 2D array ([see here, for example](http://stackoverflow.com/q/32129572/1461210)). – ali_m Nov 20 '15 at 14:50
  • @ali_m You're right that I don't really need the x and y values in a grid, but I do need the z values in one, because I will be giving the data to scipy.interpolate.interp2d and that expects the z values in a grid. – feedMe Nov 20 '15 at 15:11
  • Yes, the question/answer I linked you to shows how to reshape your `z` data in order to achieve this. You still don't need `np.meshgrid`. – ali_m Nov 20 '15 at 15:47
  • How about just a matrix multiplication. Say `zz = z.T*z`? – Rakshit Kothari Nov 14 '21 at 04:22
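
The sort-and-reshape approach suggested in the comments could look roughly like this. It is a sketch that assumes every (x, y) grid point appears exactly once in the data; the rows are shuffled first only to show that the input order does not matter. Sample values are copied from the data.txt listing in the question.

```python
import numpy as np

# Values copied from the data.txt listing in the question.
x = np.repeat([1.4, 1.6, 1.8, 2.0], 3)
y = np.tile([0.2, 0.3, 0.4], 4)
z = np.array([1.93164166734, 1.88377897779, 1.81946452501,
              1.9596778849, 1.91181519535, 1.84750074257,
              1.90890970517, 1.86104701562, 1.79673256284,
              1.78735230743, 1.73948961789, 1.67517516511])

# Shuffle the rows to show the result does not depend on file order.
perm = np.random.RandomState(0).permutation(z.size)
x, y, z = x[perm], y[perm], z[perm]

# np.lexsort uses the LAST key as the primary one: sort by y, then by x.
order = np.lexsort((x, y))
ny, nx = np.unique(y).size, np.unique(x).size

# Requires ny * nx == z.size, i.e. a complete grid with no duplicates.
zz = z[order].reshape(ny, nx)
print(zz)
```

Unlike the index-based fill, this variant raises an error from `reshape` when points are missing instead of leaving nan cells, which may or may not be what is wanted.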

0 Answers