I have a seemingly simple problem, but an easy solution is eluding me. I have a very large series (tens or hundreds of thousands of points), and I just need to visualize it at different zoom levels, but generally zoomed well out. Basically, I want to plot it in a tool like MATLAB or pyplot. Since a single pixel can't represent the potentially many hundreds of points that map to it, I'd like to see both the min and the max of all the array entries that map to each pixel, so that I can get a general sense of what's going on. Is there a simple way of doing this?
-
Does it have to be a MATLAB-type app, or can you open your data in a webpage visualization? http://d3js.org/ – Chris Nov 03 '12 at 22:12
-
I'm totally open to any option that's really straightforward. I've never used d3 before, if it's a quick snippet, can you post an answer? – acjay Nov 03 '12 at 22:29
-
Give gephi or cytoscape a shot first. – Chris Nov 03 '12 at 22:57
4 Answers
Try hexbin. By setting the reduce_C_function, I think you can get what you want. Ex:
import matplotlib.pyplot as plt
import numpy as np
# x, y: coordinates of the points; C: the value at each point, C = f(x, y)
plt.hexbin(x, y, C=C, reduce_C_function=np.max)
plt.show()
This would give you a hexagonal heatmap where the color of each hexagon is the maximum value in its bin.
If you only want to bin in one direction, see this method.
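For the one-direction case, a minimal NumPy sketch of min/max binning might look like the following. The function name `minmax_bins` and the parameter `n_bins` are made up for illustration. It assumes evenly spaced samples and simply splits the series into roughly equal consecutive chunks, one per pixel column.

```python
import numpy as np

def minmax_bins(y, n_bins):
    """Return (mins, maxs) of y over n_bins consecutive, roughly equal chunks."""
    y = np.asarray(y)
    n_bins = min(n_bins, y.size)  # never use more bins than there are points
    # n_bins + 1 evenly spaced boundaries from 0 to len(y), truncated to integers
    edges = np.linspace(0, y.size, n_bins + 1).astype(int)
    starts = edges[:-1]
    # reduceat reduces over y[starts[i]:starts[i+1]]; the last chunk runs to the end
    mins = np.minimum.reduceat(y, starts)
    maxs = np.maximum.reduceat(y, starts)
    return mins, maxs

# Hypothetical example: 8 points reduced to 4 bins of 2 points each
mins, maxs = minmax_bins([3, 1, 4, 1, 5, 9, 2, 6], 4)
```

Choosing `n_bins` close to the plot width in pixels, and recomputing on just the visible slice after each zoom, should keep the picture faithful at any zoom level.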
The first option you may want to try is Gephi: https://gephi.org/
Here is another option, though I'm not quite sure it will work. It's hard to say without seeing the data.
Try going to this link: http://bl.ocks.org/3887118. Do you see data.tsv toward the bottom of the page, with all of the values? If you can save your data in a similar format, then the HTML code above it should be able to render your data as in the scatter plot example shown at that link. Otherwise, try visiting this link to adapt your data to a more suitable web page.

-
I'm not really looking to make a scatter plot. That will be too many points. Basically I just want a plot that max-bins and min-bins my data points, if that makes sense. – acjay Nov 04 '12 at 17:41
There is a set of research tools called TimeSearcher 1 through 3 that provide some examples of how to deal with large time-series datasets. Below are some example images from TimeSearcher 2 and 3.

I realized that a simple plot() in MATLAB actually gives me more or less what I want. When zoomed out, it renders all of the data points that map to a pixel column as a vertical line segment from the minimum to the maximum within that set, so as not to obscure the function's actual behavior. I used area() to increase the contrast.
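For anyone working in pyplot instead, a rough analogue of that rendering can be sketched by filling the band between the per-column minimum and maximum. This is not a description of what MATLAB does internally. The function name `plot_envelope`, the `n_columns` default, and the test signal are all assumptions for illustration, and the input is assumed to be non-empty.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen rendering; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np

def plot_envelope(y, n_columns=800):
    """Draw y as a filled band between the min and max of each pixel column."""
    y = np.asarray(y, dtype=float)
    n = min(n_columns, y.size)
    chunks = np.array_split(y, n)  # one chunk of samples per column
    lo = np.array([c.min() for c in chunks])
    hi = np.array([c.max() for c in chunks])
    # Place each column at the mean sample index of its chunk
    centers = np.array([c.mean() for c in np.array_split(np.arange(y.size), n)])
    fig, ax = plt.subplots()
    ax.fill_between(centers, lo, hi)
    ax.set_xlabel("sample index")
    return fig, ax, lo, hi

# Hypothetical data: a noisy sine wave with 200,000 points
rng = np.random.default_rng(0)
t = np.linspace(0, 20, 200_000)
fig, ax, lo, hi = plot_envelope(np.sin(t) + 0.3 * rng.standard_normal(t.size))
```

Calling `fig.savefig(...)` or, in an interactive session, `plt.show()` then displays the band. Swapping `fill_between` for `ax.vlines(centers, lo, hi)` gives thin vertical segments instead of a filled area.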
