I have the following demonstration code, in which I create a simple scatter plot and save it as a PNG, a fully vectorized EPS, and a partly rasterized EPS.
For a large number of points I expect the file size of the vectorized EPS to be much bigger than that of the PNG (at least at a reasonable dpi), and this is indeed what I observe.
When I rasterize the scatter plot, I would expect the file size to drop back towards the size of the PNG, since I'm practically just "embedding" the PNG in an EPS, right? However, the rasterized version bloats up by a factor of ~20 relative to the vectorized EPS:
PNG: 48K, fully vectorized EPS: 184K, rasterized EPS: 3.8M (on Linux openSUSE, Python 3.4.6, matplotlib 2.2.2)
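For scale, here is a rough estimate of how much raw, uncompressed pixel data a figure of this size contains. This is only a back-of-the-envelope sketch: it assumes matplotlib's default figure size of 6.4 x 4.8 inches and 3 bytes (RGB) per pixel, and `raw_raster_bytes` is a helper name I made up for illustration. It says nothing about how the EPS backend actually stores the image.

```python
def raw_raster_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Size in bytes of an uncompressed raster of the given physical size.

    Assumption: one byte per colour channel, RGB only (no alpha channel).
    """
    width_px = int(round(width_in * dpi))
    height_px = int(round(height_in * dpi))
    return width_px * height_px * bytes_per_pixel


# Assumed default figure size (6.4 x 4.8 inches) at the dpi used in my code
size = raw_raster_bytes(6.4, 4.8, dpi=150)
print(size)  # 960 * 720 * 3 = 2073600 bytes
```

So the raw pixel data alone is on the order of a couple of megabytes, far more than the 48K PNG, which is compressed.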
What's the reason for this? Is my understanding of what happens when one rasterizes the plot completely wrong? When I import the PNG into Inkscape and export it as EPS, I get a file (which is obviously rasterized) that is only minutely larger than the original PNG.
Demonstration code:
import matplotlib.pyplot as plt
import numpy as np
# Prepare some random data
N = 10000
x = np.random.rand(N)
y = np.random.rand(N)
dpi = 150
# Create a figure and plot some points
fig_mesh = plt.figure()
ax_mesh = fig_mesh.add_subplot(111)
scatter = ax_mesh.scatter(x, y, zorder=0.5)
# Save it as png or unrasterized eps
fig_mesh.savefig('mesh.png', dpi=dpi) # 48K
fig_mesh.savefig('mesh.eps') # 184K
# Save it with rasterized points
ax_mesh.set_rasterization_zorder(1)
fig_mesh.savefig('mesh_rasterized.eps', dpi=dpi, rasterized=True) # 3.8M!
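The sizes quoted above can be checked with a small helper like the one below. It is just a minimal sketch: `file_sizes` is a name I made up, and it assumes the three files were written to the current working directory by the demonstration code (missing files are skipped).

```python
import os


def file_sizes(paths):
    """Return a dict mapping each existing path to its size in bytes."""
    return {p: os.path.getsize(p) for p in paths if os.path.exists(p)}


# File names as used in the demonstration code
outputs = ['mesh.png', 'mesh.eps', 'mesh_rasterized.eps']
for path, size in file_sizes(outputs).items():
    print('{}: {:.1f} KiB'.format(path, size / 1024.0))
```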
Thanks in advance!