6

I am running a streaming Apache beam pipeline in Google Dataflow. It's reading data from Kafka and streaming insert to Bigquery.

But in the bigquery streaming insert step it's throwing a large number of warning -

    java.lang.RuntimeException: ManagedChannel allocation site
at io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.<init> (ManagedChannelOrphanWrapper.java:93)
at io.grpc.internal.ManagedChannelOrphanWrapper.<init> (ManagedChannelOrphanWrapper.java:53)
at io.grpc.internal.ManagedChannelOrphanWrapper.<init> (ManagedChannelOrphanWrapper.java:44)
at io.grpc.internal.ManagedChannelImplBuilder.build (ManagedChannelImplBuilder.java:612)
at io.grpc.internal.AbstractManagedChannelImplBuilder.build (AbstractManagedChannelImplBuilder.java:261)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.createSingleChannel (InstantiatingGrpcChannelProvider.java:340)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.access$1600 (InstantiatingGrpcChannelProvider.java:73)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider$1.createSingleChannel (InstantiatingGrpcChannelProvider.java:214)
at com.google.api.gax.grpc.ChannelPool.create (ChannelPool.java:72)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.createChannel (InstantiatingGrpcChannelProvider.java:221)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.getTransportChannel (InstantiatingGrpcChannelProvider.java:204)
at com.google.api.gax.rpc.ClientContext.create (ClientContext.java:169)
at com.google.cloud.bigquery.storage.v1beta2.stub.GrpcBigQueryWriteStub.create (GrpcBigQueryWriteStub.java:138)
at com.google.cloud.bigquery.storage.v1beta2.stub.BigQueryWriteStubSettings.createStub (BigQueryWriteStubSettings.java:145)
at com.google.cloud.bigquery.storage.v1beta2.BigQueryWriteClient.<init> (BigQueryWriteClient.java:128)
at com.google.cloud.bigquery.storage.v1beta2.BigQueryWriteClient.create (BigQueryWriteClient.java:109)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl.newBigQueryWriteClient (BigQueryServicesImpl.java:1255)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl.access$800 (BigQueryServicesImpl.java:135)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.<init> (BigQueryServicesImpl.java:521)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.<init> (BigQueryServicesImpl.java:449)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl.getDatasetService (BigQueryServicesImpl.java:169)
at org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite.flushRows (BatchedStreamingWrite.java:374)
at org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite.access$800 (BatchedStreamingWrite.java:69)
at org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite$BatchAndInsertElements.finishBundle (BatchedStreamingWrite.java:271)
at org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite$BatchAndInsertElements$DoFnInvoker.invokeFinishBundle (Unknown Source)
at org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.finishBundle (SimpleDoFnRunner.java:242)
at org.apache.beam.runners.dataflow.worker.SimpleParDoFn.finishBundle (SimpleParDoFn.java:432)
at org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.finish (ParDoOperation.java:56)
at org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute (MapTaskExecutor.java:103)
at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process (StreamingDataflowWorker.java:1430)
at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1100 (StreamingDataflowWorker.java:165)
at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$7.run (StreamingDataflowWorker.java:1109)
at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:624)
at java.lang.Thread.run (Thread.java:748)

I am using apache beam java sdk 2.29.0.

Any clue what is causing the issue?

Pablo
  • 10,425
  • 1
  • 44
  • 67
  • This seems likely to be a bug in either Beam or one of its dependencies, and does not seem like a user error. Here's a [similar bug report](https://github.com/googleapis/google-cloud-java/issues/3648) on github, for example. Reporting this as a bug on the [Apache Beam Jira](https://issues.apache.org/jira/browse/BEAM) may be helpful, especially if you have instructions for replicating the issue. – Daniel Oliveira Jun 05 '21 at 00:35

3 Answers3

6

I had the same problem with an Apache Beam pipeline reading from Pub/Sub and streaming to BigQuery. I was able to "solve" it by downgrading to version 2.28.0 of the Apache Beam java SDK. It seems like the problem was introduced in version 2.29.0 of the SDK and is still present in 2.30.0.

  • 4
    This solved the issue for us as well. For reference, [this ticket](https://issues.apache.org/jira/browse/BEAM-12356) indicates the issue will be fixed in version 2.31.0. – donutsftw Jun 16 '21 at 14:10
0

A fix for this problem was just merged in to Beam, and should hopefully be released in the next Beam version.

Reuven Lax
  • 771
  • 5
  • 9
0

I had the same issue. I'm using batch(also tried streaming) write into BigQuery. I "resolved" it by downgrading to 2.28.0

Any version equal or greater than 2.29.0 produced the same error. Although my pipeline still can write data into the BigQuery.

leo
  • 15
  • 6