I get the following error when running an Apache Beam pipeline. The full traceback is:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-12-870f9c2f41e5> in <module>
     39                  file_path_prefix=os.path.join(OUTPUT_DIR, 'ptp-dataset.csv'))))
     40 
---> 41 preprocess()

<ipython-input-12-870f9c2f41e5> in preprocess()
     22       'requirements_file': 'requirements.txt'
     23     }
---> 24     opts = beam.options.pipeline_options.PipelineOptions(flags=[], **options)
     25     RUNNER = 'DataflowRunner' # 'DirectRunner'
     26 

AttributeError: module 'apache_beam' has no attribute 'options'

The error is raised when I try to instantiate the PipelineOptions class:

 opts = beam.pipeline.PipelineOptions(flags=[], **options)
 RUNNER = 'DataflowRunner' # 'DirectRunner'
Ekaba Bisong

1 Answer

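To resolve this issue, upgrade to the latest version of apache-beam by running the command below. The --upgrade flag matters here, because pip otherwise leaves an already installed older version in place:

pip install --upgrade "apache-beam[gcp]"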

Restart your kernel so the upgraded package is loaded, then reference the class by its full path, options.pipeline_options.PipelineOptions. In this example, change the call to:

opts = beam.options.pipeline_options.PipelineOptions(flags=[], **options)
RUNNER = 'DataflowRunner'
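
To confirm that the restarted kernel actually sees a Beam release containing the options package, a small check like the one below may help. It is only a sketch and not part of the original fix: the helper names beam_options_available and build_options are hypothetical, and it assumes the AttributeError comes from an installed release that lacks apache_beam.options.

```python
import importlib.util


def beam_options_available():
    """Return True if apache_beam.options.pipeline_options can be located.

    Hypothetical helper: find_spec imports the parent packages, so a missing
    apache_beam or a missing apache_beam.options raises ModuleNotFoundError.
    """
    try:
        spec = importlib.util.find_spec("apache_beam.options.pipeline_options")
    except ModuleNotFoundError:
        return False
    return spec is not None


def build_options(**options):
    """Build PipelineOptions via an explicit import (hypothetical helper)."""
    if not beam_options_available():
        raise RuntimeError(
            "apache_beam.options was not found; upgrade apache-beam "
            "and restart the kernel."
        )
    # Importing the class directly avoids relying on attribute access
    # through the top-level `beam` module.
    from apache_beam.options.pipeline_options import PipelineOptions

    return PipelineOptions(flags=[], **options)


print("Beam options package found:", beam_options_available())
```

With a suitable version installed, build_options(requirements_file='requirements.txt') should be equivalent to the beam.options.pipeline_options.PipelineOptions call above.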
Ekaba Bisong