I've set up a Dataflow pipeline that ingests messages from Pub/Sub, converts them to dicts, and logs them.

Here is the script I've written:

import apache_beam as beam
import logging
import message_pb2
from apache_beam.options.pipeline_options import StandardOptions
from google.protobuf.json_format import MessageToDict

TOPIC_PATH = "projects/<PROJECT ID>/topics/<TOPIC NAME>"

def protoToDict(msg, schema_class):
    message = schema_class()

    if isinstance(msg, (str, bytes)):
        message.ParseFromString(msg)
    else:
        return "Invalid Message - something isn't quite right."

    return MessageToDict(message, preserving_proto_field_name=True)

pipelineOptions = beam.options.pipeline_options.PipelineOptions()
pipelineOptions.view_as(StandardOptions).streaming = True

pipeline = beam.Pipeline(options=pipelineOptions)

data = (
    pipeline
    | 'Read from PubSub' >> beam.io.ReadFromPubSub(topic=TOPIC_PATH)
    | 'Proto to Dict' >> beam.Map(lambda pb_msg: protoToDict(pb_msg, message_pb2.Message))
    | 'Log Result' >> beam.Map(lambda msg: logging.info(msg))
)

pipeline.run()

When I run this with:

python -m <script name> --region=europe-west2 --runner=DataflowRunner --project=<PROJECT ID> --worker-machine-type=n1-standard-3

I receive this error:

creation failed: The zone 'projects/<PROJECT ID>/zones/europe-west2-b' does not have enough resources available to fulfill the request. Try a different zone, or try again later.

I've looked at various other sources, but the suggestions given are along the lines of "try a different machine type until it works" or "wait until there are more resources".

Surely I'm doing something wrong and it's not on Google's side?

Joe Moore
  • It's on Google's side, but you can work around it by changing the machine type, as you saw, and/or changing the zone. – arthurq Aug 07 '23 at 13:44
  • Oh... is there a way to list available machine types, or to reserve a machine type for x days, etc.? – Joe Moore Aug 07 '23 at 13:45
  • Does this answer your question? [the zone does not have enough resources available to fulfill the request/ the resource is not ready](https://stackoverflow.com/questions/52684656/the-zone-does-not-have-enough-resources-available-to-fulfill-the-request-the-re) – doneforaiur Aug 09 '23 at 07:37

3 Answers

As far as I know, there is currently no way to check in advance which resources are available in a zone.

The goal is to make sure that resources are available in all zones. Deploy and balance your workload across multiple zones or regions to reduce the likelihood of an outage.

Reference: https://cloud.google.com/architecture/scalable-and-resilient-apps

SO ref: Imprevisibility on Google Cloud Compute Engine resource availability per zone

The general suggestion/workaround is to try a different zone for the resource involved.
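
As a rough sketch of the "try a different zone" workaround, the helper below builds the pipeline arguments for one candidate zone at a time. It assumes Beam's --worker_zone and --worker_machine_type pipeline options; the project ID, the zone list, the machine type and the helper names (dataflow_args, candidate_args) are hypothetical placeholders rather than anything from the question.

```python
from typing import Iterable, Iterator, List, Optional


def dataflow_args(project: str, region: str, machine_type: str,
                  worker_zone: Optional[str] = None) -> List[str]:
    """Build command-line style arguments for a streaming Dataflow job."""
    args = [
        "--runner=DataflowRunner",
        "--streaming",
        f"--project={project}",
        f"--region={region}",
        f"--worker_machine_type={machine_type}",
    ]
    if worker_zone is not None:
        # Pin the workers to one zone instead of letting Dataflow pick one.
        args.append(f"--worker_zone={worker_zone}")
    return args


def candidate_args(project: str, region: str, machine_type: str,
                   zones: Iterable[str]) -> Iterator[List[str]]:
    """Yield one argument list per candidate zone, in the order given."""
    for zone in zones:
        yield dataflow_args(project, region, machine_type, worker_zone=zone)


if __name__ == "__main__":
    # Hypothetical values: substitute your own project and preferred zones.
    zones = ["europe-west2-a", "europe-west2-b", "europe-west2-c"]
    for args in candidate_args("my-project", "europe-west2", "n1-standard-4", zones):
        print(" ".join(args))
```

Each argument list can be passed to PipelineOptions(flags=args) in place of the command-line flags. Since the resource error only appears once the workers start, which zone to try next is still a manual decision; this only makes the zone explicit instead of leaving it to the default.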

Nestor Ceniza Jr

The machine type n1-standard-3 does not exist, which is why the command fails. You can see which N1 machine types are available by default at https://cloud.google.com/compute/docs/general-purpose-machines#n1_machines

The number of vCPUs (the "3" in "n1-standard-3") is normally 1 or a multiple of 2; the predefined n1-standard sizes are 1, 2, 4, 8, 16, 32, 64 and 96 vCPUs.
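
To catch this kind of typo before submitting a job, a small check along these lines could help. It is only a sketch: the list of vCPU counts is hard-coded from the N1 documentation page linked above and may go out of date, and the function names are made up for illustration.

```python
import re
from typing import Optional

# Predefined n1-standard vCPU counts, copied by hand from the N1 docs page
# (an assumption that should be re-checked against the current docs).
N1_STANDARD_VCPUS = (1, 2, 4, 8, 16, 32, 64, 96)


def is_predefined_n1_standard(machine_type: str) -> bool:
    """Return True if the name looks like an existing n1-standard type."""
    match = re.fullmatch(r"n1-standard-(\d+)", machine_type)
    return bool(match) and int(match.group(1)) in N1_STANDARD_VCPUS


def next_size_up(vcpus: int) -> Optional[int]:
    """Return the smallest predefined size with at least `vcpus` vCPUs."""
    return next((v for v in N1_STANDARD_VCPUS if v >= vcpus), None)


if __name__ == "__main__":
    print(is_predefined_n1_standard("n1-standard-3"))
    print(next_size_up(3))
```

For the machine type in the question, the check rejects n1-standard-3 and suggests 4 vCPUs, i.e. n1-standard-4, as the next size up.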

You can also list the machine types available in specific zones with a command like the following:

gcloud compute machine-types list --filter="zone:( us-central1-b europe-west1-d europe-west2-b )"

Israel Herraiz

After testing the pipeline in us-central1, I confirmed that this was an issue on Google's side.

The solution I came to was to use reservations. Although they do incur a cost, if you reserve a VM for, say, 3 years, you will be entitled to a discount.


Please bear in mind that the protoToDict function in the code above will not work as written. See the apache-beam structure for more details if you find yourself in the same position.

Joe Moore