My scenario is a variation on the one discussed here: "How do I write to BigQuery using a schema computed during Dataflow execution?"
In this case, the goal is the same (read a schema during execution, then write a table with that schema to BigQuery), but I want to accomplish it within a single pipeline.
For example, I'd like to write a CSV file to BigQuery while avoiding fetching the file twice (once to read the schema, once to read the data).
Is this possible? If so, what's the best approach?
My current best guess is to read the schema into a PCollection via a side output and then use that to create the table (with a custom PTransform) before passing the data to BigQueryIO.Write.
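In code, roughly what I have in mind is the sketch below (written against the 1.x Java SDK). `looksLikeHeader`, `createTableIfNeeded`, and `toTableRow` are placeholders I'd still have to implement, the paths and table names are made up, and detecting the header row by content is an assumption on my part, since `TextIO` doesn't guarantee line order:

```java
import com.google.api.services.bigquery.model.TableRow;
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.BigQueryIO;
import com.google.cloud.dataflow.sdk.io.TextIO;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import com.google.cloud.dataflow.sdk.transforms.ParDo;
import com.google.cloud.dataflow.sdk.transforms.View;
import com.google.cloud.dataflow.sdk.values.PCollection;
import com.google.cloud.dataflow.sdk.values.PCollectionTuple;
import com.google.cloud.dataflow.sdk.values.PCollectionView;
import com.google.cloud.dataflow.sdk.values.TupleTag;
import com.google.cloud.dataflow.sdk.values.TupleTagList;

public class SingleReadCsvToBigQuery {

  public static void main(String[] args) {
    final TupleTag<String> dataTag = new TupleTag<String>() {};
    final TupleTag<String> headerTag = new TupleTag<String>() {};

    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Read the file exactly once.
    PCollection<String> lines =
        p.apply(TextIO.Read.from("gs://my-bucket/input.csv")); // placeholder path

    // Single pass over the lines: data rows go to the main output,
    // the header row to a side output.
    PCollectionTuple split = lines.apply(
        ParDo.withOutputTags(dataTag, TupleTagList.of(headerTag))
            .of(new DoFn<String, String>() {
              @Override
              public void processElement(ProcessContext c) {
                if (looksLikeHeader(c.element())) { // placeholder predicate
                  c.sideOutput(headerTag, c.element());
                } else {
                  c.output(c.element());
                }
              }
            }));

    // Expose the header as a singleton side input.
    final PCollectionView<String> headerView =
        split.get(headerTag).apply(View.<String>asSingleton());

    // Convert lines to TableRows; create the table from the header before
    // emitting any rows (guarded so each DoFn instance does it at most once).
    PCollection<TableRow> rows = split.get(dataTag).apply(
        ParDo.withSideInputs(headerView)
            .of(new DoFn<String, TableRow>() {
              private transient boolean tableCreated;

              @Override
              public void processElement(ProcessContext c) {
                String header = c.sideInput(headerView);
                if (!tableCreated) {
                  createTableIfNeeded(header); // placeholder: BigQuery client call
                  tableCreated = true;
                }
                c.output(toTableRow(header, c.element()));
              }
            }));

    // The table exists by the time rows are written, so CREATE_NEVER avoids
    // having to supply a schema at pipeline-construction time.
    rows.apply(BigQueryIO.Write
        .to("my-project:my_dataset.my_table") // placeholder table
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));

    p.run();
  }

  // Stubs I would still need to fill in:
  private static boolean looksLikeHeader(String line) { return false; }
  private static void createTableIfNeeded(String header) {}
  private static TableRow toTableRow(String header, String line) { return new TableRow(); }
}
```

My thinking is that because the table is created inside the same DoFn that produces the rows, it should exist before BigQueryIO.Write runs its batch load job, but I'm not sure whether this is a safe or idiomatic way to sequence the two steps.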