
I'm trying to integrate Spark with Prometheus. We have both Spark 2 and Spark 3. For Spark 2 I know I can run jmx_exporter. Spark 3 has a new built-in PrometheusServlet, which is great. We are running Spark on-prem on YARN, not Kubernetes.
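For reference, this is roughly how I understand the servlet gets enabled: uncommenting the PrometheusServlet lines in `metrics.properties` (these paths are the ones from Spark's own `metrics.properties.template`, so adjust if your setup differs):

```
*.sink.prometheusServlet.class=org.apache.spark.metrics.sink.PrometheusServlet
*.sink.prometheusServlet.path=/metrics/prometheus
master.sink.prometheusServlet.path=/metrics/master/prometheus
applications.sink.prometheusServlet.path=/metrics/applications/prometheus
```

With that in place, the driver serves its metrics under its UI port (4040 by default), which is exactly the part that's hard to discover under YARN.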

My question is: how do I dynamically discover Prometheus scrape targets? As I understand it, there's no single central Spark server to point Prometheus at; instead, each application gets packed into a YARN container and exposes its own metrics. Unless there is a way to aggregate these metrics (e.g. in the Spark history server) or to give each job a static, predictable address?

When I submit a long-running Spark Streaming app, I'd like its metrics to show up in Prometheus out of the box. I know the new PrometheusServlet supports autodiscovery on Kubernetes via pod annotations, and I'd like to achieve something similar for YARN.
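For context, the Kubernetes flow I'm referring to is (as far as I understand it) just a matter of setting confs so the driver pod carries the standard `prometheus.io` annotations that a typical Prometheus `kubernetes_sd` relabel config picks up, something like:

```
--conf spark.ui.prometheus.enabled=true
--conf spark.kubernetes.driver.annotation.prometheus.io/scrape=true
--conf spark.kubernetes.driver.annotation.prometheus.io/path=/metrics/executors/prometheus
--conf spark.kubernetes.driver.annotation.prometheus.io/port=4040
```

There's no equivalent annotation mechanism on YARN, which is the gap I'm trying to fill.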

What I found so far:

  • I could have Prometheus scrape a Pushgateway and have my app push metrics there when I run spark-submit. I found a custom sink that does that. However, Pushgateway introduces problems of its own, so I was hoping to avoid it.
  • Use Prometheus's file-based service discovery mechanism to add targets there. But how do I do that automatically, without manually editing a JSON file every time I submit a new job? Prometheus has no API for adding targets, and writing a job that changes a JSON file remotely whenever I run spark-submit feels hacky.
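One idea I'm considering for the file-SD route: instead of hooking into spark-submit, periodically poll the YARN ResourceManager REST API (`/ws/v1/cluster/apps`) for running Spark apps and regenerate the `file_sd` JSON from that. A rough sketch, where the RM address, the output path, and the assumption that the AM's `amHostHttpAddress` is where the driver UI (and hence the metrics servlet) lives are all specific to my guess at a setup:

```python
#!/usr/bin/env python3
"""Sketch: regenerate a Prometheus file_sd target list from the YARN
ResourceManager REST API. RM_URL and SD_FILE are placeholders."""
import json
import urllib.request

RM_URL = "http://resourcemanager:8088"          # assumed RM address
SD_FILE = "/etc/prometheus/targets/spark.json"  # assumed file_sd path


def apps_to_targets(apps_response):
    """Map the RM's /ws/v1/cluster/apps JSON to file_sd entries.

    Uses amHostHttpAddress (the application master's web UI host:port),
    which for Spark is where the driver UI is exposed.
    """
    # The RM returns {"apps": null} when nothing is running.
    apps = (apps_response.get("apps") or {}).get("app") or []
    targets = []
    for app in apps:
        if app.get("applicationType") != "SPARK":
            continue
        addr = app.get("amHostHttpAddress")
        if not addr:
            continue
        targets.append({
            "targets": [addr],
            "labels": {
                "yarn_application_id": app["id"],
                "yarn_application_name": app.get("name", ""),
            },
        })
    return targets


def refresh_sd_file():
    """Fetch running apps from the RM and rewrite the file_sd JSON."""
    with urllib.request.urlopen(
            f"{RM_URL}/ws/v1/cluster/apps?states=RUNNING") as resp:
        data = json.load(resp)
    with open(SD_FILE, "w") as f:
        json.dump(apps_to_targets(data), f, indent=2)


# Run refresh_sd_file() from cron or a systemd timer; Prometheus
# picks up changes to file_sd files automatically.
```

It still feels like a workaround, but at least it keeps the discovery logic out of every spark-submit invocation.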

Any suggestions for an elegant solution are welcome, thank you!

  • I am in a similar situation. Were you able to find an alternative? I tried using the [custom sink](https://github.com/banzaicloud/spark-metrics/blob/master/PrometheusSink.md) but it shows only driver metrics; I am struggling to get executor metrics. – it243 Nov 20 '21 at 02:38
  • Unfortunately I haven't pushed this any further. If we really need Spark metrics in Prometheus, we'll probably just use a StatsD sink and then have Prometheus talk to StatsD. – Jarek Cellary Nov 21 '21 at 12:28
  • Would also like to use PrometheusServlet with Spark on YARN, but I doubt it is currently possible. The original proposal [SPARK-29032](https://issues.apache.org/jira/browse/SPARK-29032) only mentions PrometheusServlet support for Spark standalone environment or K8s environment. It looks like there is no PrometheusServlet support for Mesos or YARN. – Fierr Nov 24 '21 at 16:10
  • Hey, we're in the same situation (except that we run Spark on AWS EMR, which uses YARN too). As there is no support for PrometheusServlet in this setting, we plan to implement our own sink that pushes metrics to Pushgateway. – michal sankot Dec 01 '21 at 13:37

0 Answers