I am trying to deploy a simple pipeline on Google Cloud Dataflow (to copy some data from PubSub to Bigtable), but I keep getting the following error:

Exception in thread "main" java.lang.StackOverflowError
        at java.util.HashMap.hash(HashMap.java:338)
        at java.util.HashMap.get(HashMap.java:556)
        at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:67)
        at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:358)
[ multiple times ... ]
        at org.apache.log4j.Category.<init>(Category.java:57)
        at org.apache.log4j.Logger.<init>(Logger.java:37)
        at org.apache.log4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:43)
        at org.apache.log4j.LogManager.getLogger(LogManager.java:45)
        at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:73)
        at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:358)
java failed with exit status 1

This error kills the worker, even though I have no logging statements or imports in my code, and none of my code is referenced in the stack trace. I am familiar with this question, and I can see in GCP Stackdriver that the worker's Java command does indeed include log4j_to_slf4j.jar:

java -Xmx5834483752 -XX:-OmitStackTraceInFastThrow -Xloggc:/var/log/dataflow/jvm-gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=512K -cp /opt/google/dataflow/streaming/libWindmillServer.jar:/opt/google/dataflow/streaming/dataflow-worker.jar:/opt/google/dataflow/slf4j/jcl_over_slf4j.jar:/opt/google/dataflow/slf4j/log4j_over_slf4j.jar:/opt/google/dataflow/slf4j/log4j_to_slf4j.jar: ...

The problem is that this Java command is generated by Google's Dataflow worker; log4j_to_slf4j.jar is not among my own dependencies. How do I edit this command to remove it from the classpath? Or is there a better solution?
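For what it's worth, the repeating `Log4jLoggerFactory.getLogger` / `LoggerFactory.getLogger` frames in the trace look like a delegation cycle: an SLF4J-to-log4j adapter asks log4j for a logger, but the log4j on the classpath is itself a bridge that delegates back to SLF4J. A toy sketch of that failure mode (hypothetical class names, not the real SLF4J/log4j classes):

```java
// Toy illustration of a logger-factory delegation cycle.
// These are stand-in classes, not the actual SLF4J or log4j APIs.
class SlfFactory {
    static Object getLogger(String name) {
        // Adapter: delegates "down" to the log4j factory.
        return LogFactory.getLogger(name);
    }
}

class LogFactory {
    static Object getLogger(String name) {
        // Bridge: delegates back "up" to SLF4J -> infinite recursion.
        return SlfFactory.getLogger(name);
    }
}

public class Cycle {
    public static void main(String[] args) {
        try {
            SlfFactory.getLogger("example");
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError: delegation cycle");
        }
    }
}
```

If both an adapter and the matching reverse bridge end up on the same classpath, every `getLogger` call bounces between the two factories until the stack is exhausted, which matches the repeated frames above.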

Thanks!

stf

1 Answer


OK, so downgrading these dependencies:

<dependency>
   <groupId>org.apache.beam</groupId>
   <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
   <version>2.12.0</version>
</dependency>
<dependency>
   <groupId>org.apache.beam</groupId>
   <artifactId>beam-sdks-java-extensions-json-jackson</artifactId>
   <version>2.12.0</version>
</dependency>

from 2.12.0 to 2.9.0 resolved the issue; it fails again with 2.10.0. I was modeling my implementation on the official (?) examples, which, however, use 2.4.0.

  • I have filed a bug: https://github.com/GoogleCloudPlatform/cloud-bigtable-examples/issues/325 – stf May 16 '19 at 15:07