I am wondering whether it is possible to run a custom Apache Beam Python version on Google Dataflow, that is, a version that is not available in the public repositories (as of this writing: 0.6.0 and 2.0.0). For example, the HEAD of the official Apache Beam repository, or a specific tag.
I am aware that custom packages (private local ones, for example) can be packaged as described in the official documentation. There are answered questions here on how to do this for other scripts, and there is even a Gist that walks through it.
But I have not managed to get the current development version of Apache Beam (or a tagged one) from the master branch of its official repository packaged and sent along with my script to Google Dataflow.
For example, for the latest available tag, the link for pip to process would be: git+https://github.com/apache/beam.git@v2.1.0-RC2#egg=apache_beam[gcp]&subdirectory=sdks/python
I get something like this:
INFO:root:Executing command: ['.../bin/python', '-m', 'pip', 'install', '--download', '/var/folders/nw/m_035l9d7f1dvdbd7rr271tcqkj80c/T/tmpJhCkp8', 'apache-beam==2.1.0', '--no-binary', ':all:', '--no-deps']
DEPRECATION: pip install --download has been deprecated and will be removed in the future. Pip now has a download command that should be used instead.
Collecting apache-beam==2.1.0
Could not find a version that satisfies the requirement apache-beam==2.1.0 (from versions: 0.6.0, 2.0.0)
No matching distribution found for apache-beam==2.1.0
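As far as I can tell from the log, the staging step pins the version reported by my locally installed SDK and asks PyPI for exactly that release. A minimal sketch that rebuilds the same command (the helper name and the /tmp/beam-sdk directory are made up for illustration; this is not Beam's actual code):

```python
import sys


def sdk_download_command(version, download_dir, python=sys.executable):
    """Rebuild the pip command shown in the log (hypothetical helper)."""
    return [
        python, '-m', 'pip', 'install',
        '--download', download_dir,    # deprecated flag, as the log warns
        'apache-beam==%s' % version,   # exact version pin, looked up on PyPI
        '--no-binary', ':all:',        # ask for a source distribution only
        '--no-deps',                   # do not pull in dependencies
    ]


# '/tmp/beam-sdk' is an assumed placeholder for the temporary directory.
cmd = sdk_download_command('2.1.0', '/tmp/beam-sdk')
print(' '.join(cmd))
```

Since the log lists only 0.6.0 and 2.0.0 as available, a pin on 2.1.0 cannot resolve, regardless of how the SDK was installed locally.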
Any ideas? (I am wondering whether this is even possible, since Google Dataflow may restrict the Apache Beam versions it can run to the officially released ones.)