
Currently we have an implementation in Pig that generates sequence files from records, where some attributes of a record are treated as the key of the sequence file and all the records corresponding to that key are stored in one sequence file. As we are moving to Spark, I want to know how this can be done in Spark.
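For illustration, one possible Spark equivalent (a minimal sketch, not a definitive solution: the class name SequenceFilePerKey, the sample data, and the /out path are invented, and it assumes string keys that are safe to use as file names) is to write through Hadoop's MultipleSequenceFileOutputFormat, overriding generateFileNameForKeyValue so that every record is routed to a sequence file named after its key:

import org.apache.hadoop.io.Text
import org.apache.hadoop.mapred.lib.MultipleSequenceFileOutputFormat
import org.apache.spark.HashPartitioner

// Hypothetical output format: names each output file after the record's key.
class SequenceFilePerKey extends MultipleSequenceFileOutputFormat[Text, Text] {
  override def generateFileNameForKeyValue(key: Text, value: Text, name: String): String =
    key.toString
}

val records = sc.parallelize(Seq(("k1", "row-a"), ("k2", "row-b"), ("k1", "row-c")))

records
  .partitionBy(new HashPartitioner(4))                 // keep each key within a single task
  .map { case (k, v) => (new Text(k), new Text(v)) }   // wrap in Writables for the Hadoop API
  .saveAsHadoopFile("/out", classOf[Text], classOf[Text], classOf[SequenceFilePerKey])

The partitionBy step matters here: it guarantees that all records for a given key land in one task, so two tasks never try to commit an output file with the same name.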

rk.the1

1 Answer


saveAsSequenceFile saves an RDD of key/value pairs as a Hadoop sequence file:

sc.parallelize(List(1, 2, 3, 4, 5)).map(x => (x, x * 10)).saveAsSequenceFile("/saw1")

$ hadoop fs -cat /saw1/part-00000
SEQ org.apache.hadoop.io.IntWritable org.apache.hadoop.io.IntWritable (binary-encoded records follow)
[cloudera@quickstart ~]$

To read the sequence file back, use sc.sequenceFile:

import org.apache.hadoop.io.IntWritable

val sw = sc.sequenceFile("/saw1/part-00000", classOf[IntWritable], classOf[IntWritable]).collect
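One caveat worth adding (noted in Spark's API docs for sequenceFile): Hadoop's RecordReader reuses the same Writable object for every record, so collecting the Writables directly can leave you with an array of references to a single reused object. Copying the values out with a map before collect avoids this, for example:

import org.apache.hadoop.io.IntWritable

val pairs = sc.sequenceFile("/saw1/part-00000", classOf[IntWritable], classOf[IntWritable])
  .map { case (k, v) => (k.get, v.get) }   // copy primitives out of the reused Writables
  .collect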
Ishan Kumar