I have two problems, both memory-related. The first occurs when about 5 resolvers are used (see code and explanation below), the second when about 15 resolvers are used. There is a similar question on Stack Overflow for the first problem. The solution there was to clear the memory after each loop iteration, but I want to draw multiple data lines in a single graph, so that doesn't work for me.

Here is the code snippet where all that happens:

import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()
ax = fig.add_subplot(111)

def add_plot(resolver_name, results):
    # Normalize the values so that their cumulative sum ends at 1
    sum_results = sum(results)
    norm = [float(i)/sum_results for i in results]
    cy = np.cumsum(norm)
    ax.plot(results, cy, label=resolver_name, linewidth=0.8)


for resolver in resolvers:
    results = db.get_rt(resolver["ipv4"], tls)
    add_plot(resolver["name"], results)

# Positioning of legend
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 0.8, box.height])
ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))
fig.set_size_inches(10,5)

ax.set_xscale('log')
plt.title('CDF response time for '+('DNS-over-TLS measurements' if tls else 'DNS measurements'))
plt.xlabel("Response time (ms)")
plt.ylabel("CDF")
plt.grid(True)

png_name = V.base_directory+"/plots/rt_cdf.png"
if tls:
    png_name = V.base_directory+"/plots/rt_cdf_tls.png"
log.info("Plotting graph to "+png_name)
plt.savefig(png_name)

The variable resolvers contains some information about several public DNS resolvers. The variable results is a list of float values. All other unclear variables should not be relevant to this problem. But feel free to ask if you need further explanation.

Problem 1

As mentioned, this happens when about 5 resolvers are used. The size of results varies between ~1 million and ~6 million entries. A MemoryError occurs at the last line:

Traceback (most recent call last):
File "plot_building/rt_cdf.py", line 63, in <module>
    plt.savefig(png_name)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/pyplot.py", line 695, in savefig
    res = fig.savefig(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/figure.py", line 2062, in savefig
    self.canvas.print_figure(fname, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backend_bases.py", line 2263, in print_figure
    **kwargs)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/backend_agg.py", line 517, in print_png
    FigureCanvasAgg.draw(self)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/backend_agg.py", line 437, in draw
    self.figure.draw(self.renderer)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/figure.py", line 1493, in draw
    renderer, self, artists, self.suppressComposite)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/image.py", line 141, in _draw_list_compositing_images
    a.draw(renderer)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/axes/_base.py", line 2635, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/image.py", line 141, in _draw_list_compositing_images
    a.draw(renderer)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/lines.py", line 756, in draw
    tpath, affine = (self._get_transformed_path()
File "/usr/local/lib/python2.7/dist-packages/matplotlib/transforms.py", line 2848, in get_transformed_path_and_affine
    self._revalidate()
File "/usr/local/lib/python2.7/dist-packages/matplotlib/transforms.py", line 2822, in _revalidate
    self._transform.transform_path_non_affine(self._path)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/transforms.py", line 2492, in transform_path_non_affine
    return self._a.transform_path_non_affine(path)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/transforms.py", line 1564, in transform_path_non_affine
    x = self.transform_non_affine(path.vertices)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/transforms.py", line 2271, in transform_non_affine
    return np.concatenate((x_points, y_points), 1)
MemoryError

Problem 2

This was a bit harder to figure out. At some point during the run the process simply stopped. After some searching I found the following in /var/log/syslog:

[27578124.494907] Out of memory: Kill process 376 (python) score 897 or sacrifice child
[27578124.495020] Killed process 376 (python) total-vm:2081432kB, anon-rss:1833416kB, file-rss:1464kB

Some other lines in the log file might also belong to this problem, but as far as I can tell it is caused by not having enough RAM.

The script is running on an Ubuntu VM with 2 GB of RAM.

Any ideas how I could fix either of these problems?

Ian Fako
  • Have you considered plotting _fewer than 6 million data points_ in each line plot? At 600 dpi resolution, your figure would have to be 10,000 inches wide in order to discern individual data points, and I suppose that's not what you're going for. – Asmus Jul 22 '19 at 12:53
  • @Asmus this would obviously be a simple solution, but this project is for a scientific work and I'm not sure whether and how I could reduce the number of data points and still keep everything correct – Ian Fako Jul 22 '19 at 12:59
  • I hope you are aware that you are right now simply leaving the required reduction of data points to the specific combination of matplotlib backend, monitor, and printer? The amount of data that e.g. your monitor can accurately represent is _way less_ than 6 million pixels in width! – Asmus Jul 22 '19 at 13:02
  • @Asmus since the graph represents a CDF, the lines rise monotonically, so this should not be much of a problem. – Ian Fako Jul 22 '19 at 13:08
  • Isn't that precisely the point from above? If the line rises monotonically, then you can just plot only every 10th point (or even every 100th!!) without seeing a difference on screen. – ImportanceOfBeingErnest Jul 22 '19 at 14:56
  • @ImportanceOfBeingErnest oh man, I didn't quite get that at first, but this might work. Will test it tomorrow. – Ian Fako Jul 22 '19 at 19:34
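
A minimal sketch of the thinning suggested in the comments above. It assumes the plotted values are sorted in ascending order, so that the curve rises monotonically. The names `thin_curve` and `max_points` are hypothetical, and the data is a synthetic stand-in for one resolver's results, not the data from the question.

```python
import numpy as np


def thin_curve(x, y, max_points=10000):
    """Keep at most max_points evenly strided samples of a curve.

    The first and last points are always kept, so the thinned curve
    still spans the full data range.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if n <= max_points:
        return x, y
    # Evenly spaced integer indices from 0 to n - 1, endpoints included.
    idx = np.linspace(0, n - 1, max_points).astype(int)
    return x[idx], y[idx]


# Synthetic stand-in for one resolver's sorted response times (assumption).
rng = np.random.RandomState(0)
times = np.sort(rng.lognormal(mean=3.0, sigma=1.0, size=1000000))
cdf = np.arange(1, len(times) + 1) / float(len(times))

thin_x, thin_y = thin_curve(times, cdf)
```

Passing `thin_x` and `thin_y` to `ax.plot` instead of the full arrays would hand matplotlib 10,000 vertices per line rather than millions. Because the stride is uniform in the index, consecutive plotted points differ by roughly 1/`max_points` in CDF height, which is well below what a 10 x 5 inch figure can resolve.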

1 Answer

Are you watching your system monitor whilst running this? Are you running out of RAM?

6 million points seems huge. Can you not just sample fewer points?
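
One way to check for RAM exhaustion from inside the script, rather than by watching a system monitor, is to log the process's peak memory with the standard-library `resource` module (Unix only). This is a rough sketch: the helper name `peak_rss_mb` is hypothetical, and the list built here is only a stand-in for one resolver's results.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in MB.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        return peak / (1024.0 * 1024.0)
    return peak / 1024.0


before = peak_rss_mb()
data = [float(i) for i in range(1000000)]  # stand-in for one result list
after = peak_rss_mb()
print("peak RSS: %.1f MB -> %.1f MB" % (before, after))
```

Calling such a helper after each resolver is loaded and again just before `plt.savefig` should show whether memory grows with every added line or jumps only when the figure is drawn.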

j.t.2.4.6
  • For problem 2, yes, I am running out of RAM. I could probably work around this one by switching to a machine with more RAM. See the comment section under the question for the other problem. – Ian Fako Jul 22 '19 at 13:03
  • I suspect that problems 1 and 2 are linked. You need to get this working with a subset of the data. Plot every 1000th point, even if it makes your results useless, just to find out whether it works. Once you have done that, you will be able to tell whether you just need to chuck this onto a machine with more RAM or whether your code is broken somewhere else. Right now this looks to me like a memory issue is causing both problems. – j.t.2.4.6 Jul 22 '19 at 15:46
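
A back-of-the-envelope estimate can help decide whether more RAM alone would be enough. The sketch below assumes each plotted line is held as float64 (x, y) vertex arrays and that about three full-size copies exist at draw time (the original data, the path vertices, and the transformed path that `np.concatenate` allocates in the traceback). The copy count is a guess, not a measured value, and `estimate_line_mb` is a hypothetical helper.

```python
import numpy as np


def estimate_line_mb(n_points, copies=3):
    """Rough size in MB of `copies` float64 (x, y) vertex arrays."""
    bytes_per_vertex = 2 * np.dtype(np.float64).itemsize  # x and y
    return n_points * bytes_per_vertex * copies / 1e6


for n in (1000000, 6000000):
    print("%d points: ~%.0f MB per line" % (n, estimate_line_mb(n)))
```

Under these assumptions a single 6-million-point line accounts for 6,000,000 x 16 bytes x 3 = 288 MB, so a handful of such lines plus the original Python lists could plausibly exhaust a 2 GB VM.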