Let's say I am importing a flat file from HDFS into Spark using something like the following:
val data = sc.textFile("hdfs://name_of_file.tsv").map(_.split('\t'))
This will produce an RDD[Array[String]]. If I wanted an RDD of tuples, I could map each element to a tuple, as referenced in this solution:
val dataToTuple = data.map{ case Array(x,y) => (x,y) }
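For context, a minimal self-contained version of the two-column case looks something like this (the local SparkContext setup and app name are just for illustration):

import org.apache.spark.{SparkConf, SparkContext}

object TupleExample {
  def main(args: Array[String]): Unit = {
    // Local SparkContext purely for illustration
    val sc = new SparkContext(new SparkConf().setAppName("tuple-example").setMaster("local[*]"))

    // Split each tab-separated line into an Array[String]
    val data = sc.textFile("hdfs://name_of_file.tsv").map(_.split('\t'))

    // Pattern match each two-element row into a pair;
    // a row without exactly two fields would throw a MatchError here
    val dataToTuple = data.map { case Array(x, y) => (x, y) }

    dataToTuple.take(5).foreach(println)
    sc.stop()
  }
}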
But what if my input data has, say, 100 columns? Is there a way in Scala, using some sort of wildcard, to say
val dataToTuple = data.map{ case Array(x,y, ... ) => (x,y, ...) }
without having to write out 100 variables to match on?
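To be concrete about what I am trying to avoid, even a four-column version already forces me to name every element (the names a through d are just placeholders):

// Every column must be named in both the pattern and the tuple
val fourColumns = data.map { case Array(a, b, c, d) => (a, b, c, d) }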
I tried doing something like
val dataToTuple = data.map{ case Array(_) => (_) }
but that didn't seem to make much sense.