8

I am trying to delete some old Graphite test whisper metrics, without success. I can delete the metrics by removing the files (see: How to cleanup the graphite whisper's data?), but within a few seconds of blowing away the files they regenerate. The regenerated files are empty and stay that way, since nothing is writing new metrics to them. I've tried stopping carbon (carbon-cache.py stop) before deleting the files, but when I restart carbon (carbon-cache.py --debug start &) they just come back.

How do I permanently delete these files/metrics so they never come back?

Community
Jeff

4 Answers

19

By default, Statsd will continue to send 0 for counters it hasn't received in the previous flush period. This causes carbon to recreate the file.

Let's say we want to delete a counter called 'bad_metrics.sent' from Statsd. You can use the Statsd admin interface, which listens on port 8126 by default:

$ telnet <server-ip> 8126
Trying <server-ip>...
Connected to <server-name>.
Escape character is '^]'.

Use 'help' to get a list of commands:

help
Commands: stats, counters, timers, gauges, delcounters, deltimers, delgauges, quit

You can use 'counters' to see a list of all counters:

counters
{ 'statsd.bad_lines_seen': 0,
  'statsd.packets_received': 0,
  'bad_metrics.sent': 0 }
END

It's the 'delcounters', 'deltimers', and 'delgauges' commands that remove metrics from Statsd:

delcounters bad_metrics.sent
deleted: bad_metrics.sent
END
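
If telnet is awkward to use, the same exchange can be scripted. Below is a minimal Python sketch, assuming the admin interface behaves as in the session above (one command per line, with list and delete replies terminated by END). The function name `statsd_admin` and its defaults are hypothetical, not part of Statsd:

```python
import socket


def statsd_admin(command, host="localhost", port=8126, timeout=5.0):
    """Send one command to the Statsd admin interface and return the reply.

    Hypothetical helper: host, port and timeout defaults are assumptions.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii") + b"\n")
        while True:
            try:
                data = sock.recv(4096)
            except socket.timeout:
                # Some replies (e.g. 'help' above) carry no END marker,
                # so fall back to returning whatever arrived in time.
                break
            if not data:
                break  # server closed the connection
            chunks.append(data)
            if b"".join(chunks).rstrip().endswith(b"END"):
                break  # reply is complete
    return b"".join(chunks).decode("utf-8", errors="replace")
```

For example, `print(statsd_admin("counters"))` should list the counters, and `statsd_admin("delcounters bad_metrics.sent")` should return the "deleted: ..." reply shown above.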

After removing the metric from Statsd, you can remove the whisper file associated with it. In this example case, that would be:

/opt/graphite/storage/whisper/bad_metrics/sent.wsp

or (on Ubuntu):

/var/lib/graphite/whisper/bad_metrics/sent.wsp
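
The metric-to-file mapping in the example above (dots become directory separators, plus a .wsp suffix) can be sketched in Python as follows. The helper names and the default root are assumptions for illustration; check where your install actually stores the file, since the real path can differ from this simple mapping:

```python
from pathlib import Path


def whisper_path(metric, root="/opt/graphite/storage/whisper"):
    """Map a dotted metric name to its whisper file path.

    Hypothetical helper; the default root is the first location
    mentioned above (use /var/lib/graphite/whisper on Ubuntu).
    """
    parts = metric.split(".")
    return Path(root).joinpath(*parts[:-1], parts[-1] + ".wsp")


def remove_whisper_file(metric, root="/opt/graphite/storage/whisper"):
    """Delete the whisper file for a metric; return True if it existed."""
    path = whisper_path(metric, root)
    try:
        path.unlink()
    except FileNotFoundError:
        return False
    return True
```

Run this only after the metric has been removed from Statsd, otherwise the next flush will recreate the file.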
knocte
Dave Strock
  • Is there any other way to pull up the statsd admin interface? Say from a root command prompt on the server? (The telnet interface isn't working on our server for some reason) – Jeff Jun 12 '14 at 14:20
  • Note: On my server it IS working, but when I telnet in it displays the word "ERROR". But it's actually doing OK! – Jeff Feb 04 '16 at 17:58
  • Note: for me deleting the counters wasn't sufficient. I restarted statsd and that seemed to fix the problem. – Jeff Feb 04 '16 at 18:36
  • Wow! This one had us scratching our heads for a while... but sure enough `statsd` was caching & sending old metric paths to Graphite for "phantom" EC2 instances which hadn't existed for months! Restarting `statsd` allowed us to clean up the whisper files on `carbon-cache` nodes, and this time they stayed gone. – TrinitronX Feb 27 '17 at 21:33
  • A little late to the party, but just found a very elegant way to interact with the admin interface: `echo 'counters' | nc localhost 8126 | grep api_server` Can also be used with the delete commands to delete specific data points. – Cognitiaclaeves Aug 03 '17 at 13:46
6

Are you running statsd or something similar?

I had the same issue and it was because statsd was flushing the counters it had in memory after I deleted the whisper files. I recycled statsd and the files stay deleted now.

Hope this helps

dk.
6

The newest StatsD version has an option to stop sending zeroes after a flush and send only what it actually received. If you turn that on, the whisper files shouldn't get recreated: https://github.com/etsy/statsd/blob/master/exampleConfig.js#L39

mrtazz
  • Setting this flag causes the graphs to have null values, which displays "unattached" data points. Therefore on each graph, you would have to enable the "Draw Null as Zero" setting to fix this. Do you happen to know of any other solutions? I'm encountering the same problem. – Adam May 21 '13 at 10:04
  • deleteIdleStats: true – Florin Andrei Aug 04 '15 at 20:31
  • There are several settings that apply: `deleteIdleStats`, which applies to all stats, and individually overridable `deleteGauges`, `deleteTimers`, `deleteSets`, and `deleteCounters`. These default to false, which means statsd will always send 0 (previous value for gauges) when data isn't received for a time bucket. – Crunch Sep 10 '15 at 18:02
1

We aren't running statsd, but we do run carbon-aggregator which serves a similar purpose. Restarting it solved a similar problem.

Mark Dominus