
I have some source data that isn't on a regular grid (a sample is shown in the csv variable in the code below). I can't guarantee any minimum, maximum or step values for this data, so I need to derive them from the source data itself.

After reading the data and working out the values needed to plot the image, I came up with the loop below. Running it on a file of about 150k lines showed that the code is very slow: it took around 110 seconds (!!!) to render the whole image (a very small image).

Any hints are welcome, even if I have to use other libraries or data types. My main objective is to show "heat maps" from CSV sources like this one, which can span a million lines. Reading the file into the DataFrame and plotting the graph are both fast. The issue is creating the image map from the CSV.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import io

csv = """
"X","Y","V"
1001,1001,909.630432
1001,1003,940.660156
1001,1005,890.571594
1001,1007,999.651062
1001,1009,937.775513
1003,1002,937.601074
1003,1004,950.006897
1003,1006,963.458923
1003,1008,878.646851
1003,1012,956.835938
1005,1001,882.472656
1005,1003,857.491028
1005,1005,907.293335
1005,1007,877.087891
1005,1009,852.005554
1007,1002,880.791931
1007,1004,862.990967
1007,1006,882.135864
1007,1008,896.634521
1007,1010,888.916626
1013,1001,853.410583
1013,1003,863.324341
1013,1005,843.284607
1013,1007,852.712097
1013,1009,882.543640
"""

data=io.StringIO(csv)

columns = [ "X" , "Y", "V" ]

df = pd.read_csv(data, sep=',', skip_blank_lines=True, quoting=2, skipinitialspace=True, usecols = columns, index_col=[0,1] ) 

# Fields
x_axis="X"
y_axis="Y"
val="V"

# Unique values on the X-Y axis
x_ind=df.index.get_level_values(x_axis).unique()
y_ind=df.index.get_level_values(y_axis).unique()

# Size of each axis
nx = len(x_ind)
ny = len(y_ind)

# Maxima and minima
xmin = x_ind.min()
xmax = x_ind.max()
ymin = y_ind.min()
ymax = y_ind.max()

img = np.zeros((nx,ny))

print("Entering in loop")
for ix in range(0, nx):
    print("Mapping {0} {1}".format(x_axis, ix))
    for iy in range(0, ny):
        try:
            img[ix,iy] = df.loc[ix+xmin,iy+ymin][val]
        except KeyError:
            img[ix,iy] = np.nan

plt.imshow(img, extent=[xmin, xmax, ymin, ymax], cmap=plt.cm.jet, interpolation=None)
plt.colorbar()
plt.show()

I tried to use pcolormesh, but I was not able to fit the values into the mesh correctly without a similar loop. I could not create z_mesh without looping:

x_mesh,y_mesh = np.mgrid[xmin:xmax,ymin:ymax]
z_mesh = ?? hints ?? ;-)
Lin

1 Answer


I think your code is not even doing what you want: I ran it and got only 14 valid points in the image.
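One likely reason for the low count, sketched here under the assumption that the loop in the question is run unchanged: nx and ny are the numbers of unique coordinates, but the loop uses them as if they were the coordinate spans, so it only probes X from 1001 to 1005 and Y from 1001 to 1011. The sample dictionary below is just the (X, Y) pairs of the question's sample, retyped for illustration.

```python
# (X, Y) coordinates of the sample rows in the question, grouped by X.
sample = {
    1001: [1001, 1003, 1005, 1007, 1009],
    1003: [1002, 1004, 1006, 1008, 1012],
    1005: [1001, 1003, 1005, 1007, 1009],
    1007: [1002, 1004, 1006, 1008, 1010],
    1013: [1001, 1003, 1005, 1007, 1009],
}
coords = {(x, y) for x, ys in sample.items() for y in ys}

xs = sorted({x for x, _ in coords})
ys = sorted({y for _, y in coords})
nx, ny = len(xs), len(ys)        # counts of unique values, not spans
xmin, ymin = xs[0], ys[0]

# The question's loop looks up (xmin + ix, ymin + iy) for every ix, iy.
visited = {(xmin + ix, ymin + iy) for ix in range(nx) for iy in range(ny)}
hits = len(visited & coords)

print("unique X:", nx, "unique Y:", ny)
print("points found by the loop:", hits, "of", len(coords))
```

Sizing the image by the span (max - min + 1) rather than by the number of unique values would avoid this, which is what the reindex() call in the code below does.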

You may use pivot() or unstack() and then reindex() to create the image. Is this what you want?

data=io.StringIO(csv)
df = pd.read_csv(data, sep=',', skip_blank_lines=True, quoting=2,
                 skipinitialspace=True, usecols = columns)
img = df.pivot(index='Y', columns='X', values='V')
img = img.reindex(index=range(df['Y'].min(), df['Y'].max() + 1),
                  columns=range(df['X'].min(), df['X'].max() + 1))

extent = [df['X'].min() - 0.5, df['X'].max() + 0.5,
          df['Y'].min() - 0.5, df['Y'].max() + 0.5]
plt.imshow(img, origin='lower', extent=extent)
plt.colorbar()
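For comparison, here is a minimal sketch of the unstack() route mentioned above. It uses only a four-row subset of the sample so that it is self-contained; csv_text and grid are illustrative names, not part of the original code.

```python
import io
import pandas as pd

# Small subset of the question's sample, enough to show the reshaping.
csv_text = '''"X","Y","V"
1001,1001,909.630432
1001,1003,940.660156
1003,1002,937.601074
1005,1001,882.472656
'''
df = pd.read_csv(io.StringIO(csv_text))

# Move X and Y into the index, then pivot the X level into columns.
grid = df.set_index(['Y', 'X'])['V'].unstack('X')

# Fill in the missing coordinates so that rows and columns are evenly spaced.
grid = grid.reindex(index=range(df['Y'].min(), df['Y'].max() + 1),
                    columns=range(df['X'].min(), df['X'].max() + 1))

print(grid)
```

The resulting grid can be passed to plt.imshow() in the same way as img above. Both pivot() and unstack() raise an error if the same (X, Y) pair appears more than once, so duplicated coordinates would have to be aggregated first. The dense grid also grows with the coordinate span rather than with the number of rows, which may matter for very sparse data.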


Stop harming Monica
  • Yeah, that is exactly it. I will read about pivot, unstack and reindex, and I will try to understand why my code isn't doing what I expected. Why does imshow display those "blurry" points instead of sharp dots (even with interpolation=None)? – Lin Feb 04 '16 at 22:45
  • You want `interpolation='none'`. It is common in the matplotlib API: `'none'` means nothing, `None` means some default value. Pay attention to the docstrings. – Stop harming Monica Feb 04 '16 at 23:02
  • I think I'm using an old matplotlib version, because interpolation='none' gives me `ValueError: Illegal interpolation string`. `matplotlib.__version__` reports '0.99.3'. – Lin Feb 04 '16 at 23:08
  • @Lin very very old, I urge you to upgrade. – Stop harming Monica Feb 05 '16 at 11:09
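The interpolation point from the comments can be sketched as follows. The helper name draw_sharp is hypothetical. The fallback to 'nearest' is an assumption rather than something stated in the thread: it relies on releases that reject 'none' still accepting 'nearest', which also draws each cell as a solid block.

```python
import numpy as np
import matplotlib.pyplot as plt

def draw_sharp(img, **kwargs):
    """Draw img without smoothing between cells (hypothetical helper)."""
    try:
        # The string 'none' disables interpolation; the object None would
        # instead select the configured default.
        return plt.imshow(img, interpolation='none', **kwargs)
    except ValueError:
        # Assumption: a release that rejects 'none' still accepts 'nearest'.
        return plt.imshow(img, interpolation='nearest', **kwargs)

# Tiny grid with gaps, standing in for the reindexed image.
demo = np.array([[1.0, np.nan],
                 [np.nan, 2.0]])
im = draw_sharp(demo, origin='lower')
plt.colorbar(im)
```

Calling plt.show() afterwards displays the figure.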