I want to convert SCollection[String] to Seq[String] or List[String].

For example, I have a variable called ids.

val ids: SCollection[String] = ~
ids.saveAsTextFile(pathToGCS) 

When I save it to Cloud Storage, the text file contains a list of IDs, one per line:

id1
id2
id2

I want to hold the contents of that file in a Seq or List:

val seqOfIds: Seq[String] = ~

Tomislav Stankovic
user39613

1 Answer

This is not possible within the same job, because Dataflow, unlike Spark, has no notion of a driver node that collects data from the worker nodes. See https://spotify.github.io/scio/Scio%2C-Scalding-and-Spark.html#scio-and-spark

You can use the tap API to read the file contents after the job completes. See https://spotify.github.io/scio/examples/TapOutputExample.scala.html
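
A minimal sketch of that approach, modeled on the linked TapOutputExample. It assumes Scio 0.8 or later, where saveAsTextFile returns a ClosedTap (older releases return a Future[Tap[String]] and use sc.close() instead, so the calls differ there). The object name IdsToList, the "output" argument, and the parallelize stand-in for the real pipeline are hypothetical and only for illustration.

```scala
import com.spotify.scio._
import com.spotify.scio.io.ClosedTap
import com.spotify.scio.values.SCollection

object IdsToList {
  def main(cmdlineArgs: Array[String]): Unit = {
    // "output" is a hypothetical command-line argument (--output=<path>)
    val (sc, args) = ContextAndArgs(cmdlineArgs)

    // Stand-in for the real pipeline that produces the IDs
    val ids: SCollection[String] = sc.parallelize(Seq("id1", "id2", "id3"))

    // Assumes Scio 0.8+: the save returns a ClosedTap, a handle that
    // can only be opened once the job has finished
    val closedTap: ClosedTap[String] = ids.saveAsTextFile(args("output"))

    // Run the job and block until it completes
    val result: ScioResult = sc.run().waitUntilDone()

    // Open the tap and read the written files back into local memory
    val listOfIds: List[String] = result.tap(closedTap).value.toList
    println(s"Read ${listOfIds.size} IDs")
  }
}
```

The list is built in the JVM that launched the job, not on the workers, so this is only reasonable when the output is small enough to fit in memory. The order of elements read back is not guaranteed to match any order in the pipeline.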

Neville Li