I have a Dataflow pipeline written in Python 3.6 that copies data from a Pub/Sub topic into a GCS bucket. It works, but when I create a template version of it with DataflowRunner I get this error:

Pip install failed for package: -r

 Output from execution of subprocess: b'Collecting apache-beam==2.27.0 (from -r ./requirements.txt (line 1))\r\n  File was already downloaded c:\\users\\kaghole\\appdata\\local\\temp\\dataflow-requirements-cache\\apache-beam-2.27.0.zip\r\nCollecting avro-python3!=1.9.2,<1.10.0,>=1.8.1 (from apache-beam==2.27.0->-r ./requirements.txt (line 1))\r\n  File was already downloaded c:\\users\\kaghole\\appdata\\local\\temp\\dataflow-requirements-cache\\avro-python3-1.9.2.1.tar.gz\r\nCollecting crcmod<2.0,>=1.7 (from apache-beam==2.27.0->-r ./requirements.txt (line 1))\r\n  File was already downloaded c:\\users\\kaghole\\appdata\\local\\temp\\dataflow-requirements-cache\\crcmod-1.7.tar.gz\r\nCollecting dill<0.3.2,>=0.3.1.1 (from apache-beam==2.27.0->-r ./requirements.txt (line 1))\r\n  File was already downloaded c:\\users\\kaghole\\appdata\\local\\temp\\dataflow-requirements-cache\\dill-0.3.1.1.tar.gz\r\nCollecting fastavro<2,>=0.21.4 (from apache-beam==2.27.0->-r ./requirements.txt (line 1))\r\n  File was already downloaded c:\\users\\kaghole\\appdata\\local\\temp\\dataflow-requirements-cache\\fastavro-1.2.3.tar.gz\r\nCollecting future<1.0.0,>=0.18.2 (from apache-beam==2.27.0->-r ./requirements.txt (line 1))\r\n  File was already downloaded c:\\users\\kaghole\\appdata\\local\\temp\\dataflow-requirements-cache\\future-0.18.2.tar.gz\r\nCollecting grpcio<2,>=1.29.0 (from apache-beam==2.27.0->-r ./requirements.txt (line 1))\r\n  File was already downloaded c:\\users\\kaghole\\appdata\\local\\temp\\dataflow-requirements-cache\\grpcio-1.34.0.tar.gz\r\nCollecting hdfs<3.0.0,>=2.1.0 (from apache-beam==2.27.0->-r ./requirements.txt (line 1))\r\n  File was already downloaded c:\\users\\kaghole\\appdata\\local\\temp\\dataflow-requirements-cache\\hdfs-2.5.8.tar.gz\r\nCollecting httplib2<0.18.0,>=0.8 (from apache-beam==2.27.0->-r ./requirements.txt (line 1))\r\n  File was already downloaded 
c:\\users\\kaghole\\appdata\\local\\temp\\dataflow-requirements-cache\\httplib2-0.17.4.tar.gz\r\nCollecting mock<3.0.0,>=1.0.1 (from apache-beam==2.27.0->-r ./requirements.txt (line 1))\r\n  File was already downloaded c:\\users\\kaghole\\appdata\\local\\temp\\dataflow-requirements-cache\\mock-2.0.0.tar.gz\r\nCollecting numpy<2,>=1.14.3 (from apache-beam==2.27.0->-r ./requirements.txt (line 1))\r\n  File was already downloaded c:\\users\\kaghole\\appdata\\local\\temp\\dataflow-requirements-cache\\numpy-1.19.5.zip\r\n  Installing build dependencies: started\r\n  Installing build dependencies: still running...\r\n  Installing build dependencies: finished with status \'error\'\r\n  Complete output from command C:\\Users\\kaghole\\retention_for_retention\\retentionenv\\Scripts\\python.exe C:\\Users\\kaghole\\retention_for_retention\\retentionenv\\lib\\site-packages\\pip-19.0.3-py3.6.egg\\pip install --ignore-installed --no-user --prefix C:\\Users\\kaghole\\AppData\\Local\\Temp\\pip-build-env-w2zl8_gl\\overlay --no-warn-script-location --no-binary :all: --only-binary :none: -i https://pypi.org/simple -- setuptools<49.2.0 wheel<=0.35.1 Cython>=0.29.21,<3.0:\r\n  Collecting setuptools<49.2.0\r\n    Using cached https://files.pythonhosted.org/packages/d0/4a/22ee76842d8ffc123d4fc48d24a623c1d206b99968fe3960039f1efc2cbc/setuptools-49.1.3.zip\r\n  Collecting wheel<=0.35.1\r\n    Using cached https://files.pythonhosted.org/packages/83/72/611c121b6bd15479cb62f1a425b2e3372e121b324228df28e64cc28b01c2/wheel-0.35.1.tar.gz\r\n  Collecting Cython<3.0,>=0.29.21\r\n    Using cached https://files.pythonhosted.org/packages/6c/9f/f501ba9d178aeb1f5bf7da1ad5619b207c90ac235d9859961c11829d0160/Cython-0.29.21.tar.gz\r\n  Installing collected packages: setuptools, wheel, Cython\r\n    Running setup.py install for setuptools: started\r\n      Running setup.py install for setuptools: finished with status \'done\'\r\n    Running setup.py install for wheel: started\r\n      Running setup.py install 
for wheel: finished with status \'done\'\r\n    Running setup.py install for Cython: started\r\n      Running setup.py install for Cython: finished with status \'error\'\r\n      Complete output from command C:\\Users\\kaghole\\retention_for_retention\\retentionenv\\Scripts\\python.exe -u -c "import setuptools, tokenize;__file__=\'C:\\\\Users\\\\kaghole\\\\AppData\\\\Local\\\\Temp\\\\pip-install-nn3jr_0n\\\\Cython\\\\setup.py\';f=getattr(tokenize, \'open\', open)(__file__);code=f.read().replace(\'\\r\\n\', \'\\n\');f.close();exec(compile(code, __file__, \'exec\'))" install --record C:\\Users\\kaghole\\AppData\\Local\\Temp\\pip-record-y1h5732j\\install-record.txt --single-version-externally-managed --prefix C:\\Users\\kaghole\\AppData\\Local\\Temp\\pip-build-env-w2zl8_gl\\overlay --compile --install-headers C:\\Users\\kaghole\\retention_for_retention\\retentionenv\\include\\site\\python3.6\\Cython:\r\n      Unable to find pgen, not compiling formal grammar.\r\n      running install\r\n      running build\r\n      running build_py\r\n      creating build\r\n      creating build\\lib.win-amd64-3.6\r\n      copying cython.py -> build\\lib.win-amd64-3.6\r\n      creating build\\lib.win-amd64-3.6\\Cython\r\n      copying Cython\\CodeWriter.py -> build\\lib.win-amd64-3.6\\Cython\r\n      copying Cython\\Coverage.py -> build\\lib.win-amd64-3.6\\Cython\r\n      copying Cython\\Debugging.py -> build\\lib.win-amd64-3.6\\Cython\r\n      copying Cython\\Shadow.py -> build\\lib.win-amd64-3.6\\Cython\r\n      copying Cython\\StringIOTree.py -> build\\lib.win-amd64-3.6\\Cython\r\n      copying Cython\\TestUtils.py -> build\\lib.win-amd64-3.6\\Cython\r\n      copying Cython\\Utils.py -> build\\lib.win-amd64-3.6\\Cython\r\n      copying Cython\\__init__.py -> build\\lib.win-amd64-3.6\\Cython\r\n      creating build\\lib.win-amd64-3.6\\Cython\\Build\r\n      copying Cython\\Build\\BuildExecutable.py -> build\\lib.win-amd64-3.6\\Cython\\Build\r\n      copying 
Cython\\Build\\Cythonize.py -> build\\lib.win-amd64-3.6\\Cython\\Build\r\n      copying Cython\\Build\\Dependencies.py -> build\\lib.win-amd64-3.6\\Cython\\Build\r\n      copying Cython\\Build\\Distutils.py -> build\\lib.win-amd64-3.6\\Cython\\Build\r\n      copying Cython\\Build\\Inline.py -> build\\lib.win-amd64-3.6\\Cython\\Build\r\n      copying Cython\\Build\\IpythonMagic.py -> build\\lib.win-amd64-3.6\\Cython\\Build\r\n      copying Cython\\Build\\__init__.py -> build\\lib.win-amd64-3.6\\Cython\\Build\r\n      creating build\\lib.win-amd64-3.6\\Cython\\Compiler\r\n      copying Cython\\Compiler\\AnalysedTreeTransforms.py -> build\\lib.win-amd64-3.6\\Cython\\Compiler\r\n      copying Cython\\Compiler\\Annotate.py -> build\\lib.win-amd64-3.6\\Cython\\Compiler\r\n      copying Cython\\Compiler\\AutoDocTransforms.py -> build\\lib.win-amd64-3.6\\Cython\\Compiler\r\n      copying Cython\\Compiler\\Buffer.py -> build\\lib.win-amd64-3.6\\Cython\\Compiler\r\n      copying Cython\\Compiler\\Builtin.py -> build\\lib.win-amd64-3.6\\Cython\\Compiler\r\n      copying Cython\\Compiler\\CmdLine.py -> build\\lib.win-amd64-3.6\\Cython\\Compiler\r\n      copying Cython\\Compiler\\Code.py -> build\\lib.win-amd64-3.6\\Cython\\Compiler\r\n      copying build_ext\r\n      building \'Cython.Plex.Scanners\' extension\r\n      error: Microsoft Visual C++ 14.0 is required. 
Get it with "Microsoft Visual C++ Build Tools": https://visualstudio.microsoft.com/downloads/\r\n  \r\n      ----------------------------------------\r\n  Command "C:\\Users\\kaghole\\retention_for_retention\\retentionenv\\Scripts\\python.exe -u -c "import setuptools, tokenize;__file__=\'C:\\\\Users\\\\kaghole\\\\AppData\\\\Local\\\\Temp\\\\pip-install-nn3jr_0n\\\\Cython\\\\setup.py\';f=getattr(tokenize, \'open\', open)(__file__);code=f.read().replace(\'\\r\\n\', \'\\n\');f.close();exec(compile(code, __file__, \'exec\'))" install --record C:\\Users\\kaghole\\AppData\\Local\\Temp\\pip-record-y1h5732j\\install-record.txt --single-version-externally-managed --prefix C:\\Users\\kaghole\\AppData\\Local\\Temp\\pip-build-env-w2zl8_gl\\overlay --compile --install-headers C:\\Users\\kaghole\\retention_for_retention\\retentionenv\\include\\site\\python3.6\\Cython" failed with error code 1 in C:\\Users\\kaghole\\AppData\\Local\\Temp\\pip-install-nn3jr_0n\\Cython\\\r\n  \r\n  ----------------------------------------\r\nCommand "C:\\Users\\kaghole\\retention_for_retention\\retentionenv\\Scripts\\python.exe C:\\Users\\kaghole\\retention_for_retention\\retentionenv\\lib\\site-packages\\pip-19.0.3-py3.6.egg\\pip install --ignore-installed --no-user --prefix C:\\Users\\kaghole\\AppData\\Local\\Temp\\pip-build-env-w2zl8_gl\\overlay --no-warn-script-location --no-binary :all: --only-binary :none: -i https://pypi.org/simple -- setuptools<49.2.0 wheel<=0.35.1 Cython>=0.29.21,<3.0" failed with error code 1 in None\r\n'
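
The output above is the repr of a bytes object, so the actual failure is hidden among escaped `\r\n` sequences. According to the log, pip is invoked with `--no-binary :all:`, numpy 1.19.5 is therefore built from source, and its build dependency Cython fails to compile because no C++ compiler is found. As a minimal sketch for pulling the failing lines out of such a blob (the `RAW` value is a shortened excerpt of the output above, and `failing_lines` is a hypothetical helper name, not part of Beam or pip):

```python
import ast

# Shortened excerpt of the pip output quoted above, kept in its bytes-literal form.
RAW = (
    r"b'Collecting numpy<2,>=1.14.3 (from apache-beam==2.27.0->-r ./requirements.txt (line 1))\r\n"
    r"  Installing build dependencies: finished with status \'error\'\r\n"
    r"      building \'Cython.Plex.Scanners\' extension\r\n"
    r"      error: Microsoft Visual C++ 14.0 is required. "
    r'Get it with "Microsoft Visual C++ Build Tools": https://visualstudio.microsoft.com/downloads/\r\n'
    r"'"
)


def failing_lines(raw_repr):
    """Return the lines of a bytes-repr pip log that report an error."""
    # Turn the textual b'...' literal back into bytes, then into text.
    text = ast.literal_eval(raw_repr).decode("utf-8", errors="replace")
    lines = [line.strip() for line in text.splitlines()]
    # Keep pip's "status 'error'" markers and the compiler's own "error:" lines.
    return [
        line
        for line in lines
        if line.startswith("error:") or "status 'error'" in line
    ]


if __name__ == "__main__":
    for line in failing_lines(RAW):
        print(line)
```

Run against the full output, the same filter should surface the `Microsoft Visual C++ 14.0 is required` line without scrolling through the file-copy noise.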

I am using the deployment command below:

python -m df-pubsubRead-gcsWrite-Op --requirements_file requirements.txt --runner DataflowRunner --project ing-dev --staging_location gs://my_bucket/staging --temp_location gs://my_bucket/temp --template_location gs://my_bucket/templates/test/df-pubsubRead-gcsWrite-Op

The requirements.txt file:

apache-beam[gcp]==2.27.0

I tried:

Using setup.py, as suggested in "Dataflow fails when I add requirements.txt [Python]", but the setup_file argument is discarded:

WARNING:apache_beam.options.pipeline_options:Discarding unparseable args: ['setup.py', 'True'] WARNING:apache_beam.options.pipeline_options:Discarding unparseable args: ['setup.py', 'True']

Not including the requirements file. This successfully creates the template, but the job fails because apache-beam is not installed. In other words, specifying dependencies is a must for me, unless there are other ways to install dependencies on Dataflow.
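
On the `Discarding unparseable args: ['setup.py', 'True']` warning from the first attempt: the exact command that produced it is not shown, so its cause can only be guessed. The sketch below uses plain `argparse` (not Beam itself) to illustrate two ways such leftovers can arise, assuming a string option named `--setup_file` and a boolean switch named `--save_main_session`; `leftovers` is a hypothetical helper.

```python
import argparse

# Stand-in parser; the option names mirror Beam's, but this is not Beam's parser.
parser = argparse.ArgumentParser()
parser.add_argument("--setup_file")
parser.add_argument("--save_main_session", action="store_true")


def leftovers(argv):
    """Return the tokens that the parser could not assign to any option."""
    _, unknown = parser.parse_known_args(argv)
    return unknown


if __name__ == "__main__":
    # A store_true switch takes no value, so a trailing "True" is left over.
    print(leftovers(["--setup_file", "setup.py", "--save_main_session", "True"]))
    # A misspelled flag (hyphen instead of underscore) strands its value too.
    print(leftovers(["--setup-file", "setup.py"]))
```

If either pattern matches the command that was used, spelling the flag as `--setup_file` and passing boolean switches without a value would be the first thing to check.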

Kaustubh Ghole
  • Is apache-beam your only package in requirements.txt? I'm surprised that your template fails without requirements.txt. Can you try with apache-beam[gcp] instead, since you're using Pub/Sub and GCS? https://beam.apache.org/get-started/quickstart-py/#download-and-install – Peter Kim Jan 12 '21 at 02:24
  • Your error is clear: `error: Microsoft Visual C++ 14.0 is required`. looks like it fails when trying to compile numpy – Travis Webb Jan 12 '21 at 06:16
  • @PeterKim I tried using apache-beam[gcp] in requirements.txt, but it is still the same error. – Kaustubh Ghole Jan 12 '21 at 08:16
  • @TravisWebb I installed Microsoft Visual C++ 14.0 on my machine, but I am still getting the same error message. – Kaustubh Ghole Jan 12 '21 at 11:19
  • Try running your job from Cloud Shell. the issue has something to do with your local environment, and we don't have enough info to debug that. – Travis Webb Jan 16 '21 at 19:03
  • @TravisWebb Yes, I am already running my job from Cloud Shell, but it is the same issue. – Kaustubh Ghole Jan 17 '21 at 17:18
  • there is no way that Cloud Shell is going to ask you to install Microsoft Visual C++. either you are using a module that is designed to only work on Windows, or you aren't actually running in Cloud Shell – Travis Webb Jan 17 '21 at 19:21