this would just be carbon-cache.py. Is this correct?
Yes.
carbon-relay.py serves two distinct purposes: replication and
sharding. carbon-aggregator.py can be run in front of carbon-cache.py
to buffer metrics
Yes. In practice, though, it is better to aggregate metrics at the source (statsd, collectd, diamond) than to carry them to the very end of the stack and aggregate them there. That said, carbon-aggregator.py does support more complex aggregation rules that combine multiple metrics.
If so, can I remove these sections from the carbon.conf file?
Yes, you can.
Also I do not need the storage-aggregation.conf file?
You don't.
What is port 7002 for, and do I need it open for my simple
installation? I have read it's for the "carbon-cache query port" but I
don't understand this and can't find any more details on this.
Yes, you need to leave it as is. 7002 is the default query port for carbon-cache. graphite-web connects to it during rendering to ask carbon for data points that are still held in the cache and have not yet been written to disk.
Edit:
I presumed you were looking for a bare-minimum setup. For more complex metrics, it is advisable to have a storage-aggregation.conf setup. It is a good idea to set xFilesFactor to 0 so that a rolled-up value is still written even when only a few data points arrived in the interval.
It then makes semantic sense to sum counters and average timers during aggregation.
[counters_fall_here]
pattern = ^(Facebook\.counters)\.(production)
xFilesFactor = 0.0
aggregationMethod = sum
[timers_fall_here]
pattern = .*
xFilesFactor = 0.0
aggregationMethod = average
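The two sections above are regex rules, and the first one whose pattern matches a metric name is the one that applies, which is why the catch-all `.*` goes last. A minimal sketch of that selection in Python, assuming first-match-wins ordering and regex search semantics; `RULES`, `pick_rule`, and the metric names are hypothetical and only for illustration, not carbon's own code:

```python
import re

# Mirrors the two storage-aggregation.conf sections above, in file order.
# Each entry: (section name, pattern, xFilesFactor, aggregationMethod).
RULES = [
    ("counters_fall_here", r"^(Facebook\.counters)\.(production)", 0.0, "sum"),
    ("timers_fall_here", r".*", 0.0, "average"),
]


def pick_rule(metric_name):
    """Return (section, xFilesFactor, aggregationMethod) of the first matching rule."""
    for section, pattern, x_files_factor, method in RULES:
        if re.search(pattern, metric_name):
            return section, x_files_factor, method
    return None  # no rule matched


if __name__ == "__main__":
    # Hypothetical metric names, used only to exercise the rules.
    print(pick_rule("Facebook.counters.production.logins"))
    print(pick_rule("Facebook.timers.production.render"))
```

With these rules, anything under `Facebook.counters.production` is summed and every other metric falls through to the catch-all and is averaged.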
Aggregation can be explained like this:
Say you pick apples daily and record no_of_apples and time_to_pluck in a notebook. When the garden owner asks for a monthly report, you aggregate the data by adding up the counts and averaging the times.
The same approach applies when the storage schema shifts in granularity. For instance, if your schema is 10s:1d,60s:7d, aggregation happens when data moves from the 10s archive to the 60s archive. Every 6 data points in the 1d archive need to be stored as a single data point in the 7d archive. How they are combined is defined by aggregationMethod.
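To make the 10s to 60s step concrete, here is a rough sketch of how six 10-second points could collapse into one 60-second point. It assumes that missing points are represented as `None` and that a rollup is written only when the fraction of known points is at least xFilesFactor; the `rollup` function and the sample window are hypothetical, not Whisper's actual code:

```python
def rollup(points, method="average", x_files_factor=0.0):
    """Collapse one window of fine-grained points into a single coarse point.

    points: values from the finer archive, with None for missing slots.
    Returns the aggregated value, or None if too few points are known.
    """
    known = [p for p in points if p is not None]
    if not known:
        return None  # nothing to aggregate, even with xFilesFactor = 0
    if len(known) / len(points) < x_files_factor:
        return None  # too sparse: the coarse point is left empty
    if method == "sum":
        return sum(known)
    if method == "average":
        return sum(known) / len(known)
    raise ValueError(f"unsupported aggregation method: {method}")


if __name__ == "__main__":
    # Six 10s slots covering one 60s slot; only two of them received data.
    window = [4, None, None, 2, None, None]
    print(rollup(window, "sum", 0.0))      # 6
    print(rollup(window, "average", 0.0))  # 3.0
    print(rollup(window, "sum", 0.5))      # None, since 2/6 is below 0.5
```

The last call shows why a low xFilesFactor matters for sparse metrics: with a threshold of 0.5, the two points that did arrive would be dropped from the 7d archive.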