I have large compressed (.zip) files, around 10 GB each. I need to read the contents of the files inside each zip without unzipping it first, and then apply transformations to that content.
import org.apache.spark.{SparkConf, SparkContext}
// `user` and `filePath` are defined elsewhere
System.setProperty("HADOOP_USER_NAME", user)
println("Creating SparkConf")
val conf = new SparkConf().setAppName("DFS Read Write Test")
println("Creating SparkContext")
val sc = new SparkContext(conf)
val textFile = sc.textFile(filePath)
println("Count...." + textFile.count())
val df = textFile.map(some code)
When I pass any .txt, .log, .md, etc. file, the code above works fine. But when I pass .zip files, it gives a count of zero.
- Why is the count zero?
- Please suggest the correct way of doing this if my approach is totally wrong.
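For context, here is a minimal sketch of one approach that might work, assuming the zip entries are UTF-8 text. As far as I know, Hadoop ships codecs for formats such as gzip and bzip2 but not for zip archives, so the idea is to decode each archive with `java.util.zip.ZipInputStream`. The `ZipLines` object and its `lines` method are hypothetical names used for illustration, not part of Spark or Hadoop.

```scala
import java.io.{BufferedReader, InputStream, InputStreamReader}
import java.nio.charset.StandardCharsets
import java.util.zip.ZipInputStream

// Hypothetical helper for illustration; not a Spark or Hadoop API.
object ZipLines {
  // Lazily yields every text line of every file entry in a zip stream.
  // Assumption: entries are UTF-8 encoded text.
  // The stream is not closed here; the caller owns its lifecycle.
  def lines(in: InputStream): Iterator[String] = {
    val zip = new ZipInputStream(in)
    Iterator
      .continually(zip.getNextEntry)   // advance to the next entry, null at end of archive
      .takeWhile(_ != null)
      .filterNot(_.isDirectory)        // directory entries carry no data
      .flatMap { _ =>
        // ZipInputStream reports end-of-stream at the end of the current entry,
        // so a fresh reader per entry never reads past that entry.
        val reader = new BufferedReader(new InputStreamReader(zip, StandardCharsets.UTF_8))
        Iterator.continually(reader.readLine()).takeWhile(_ != null)
      }
  }
}
```

With Spark this could presumably be wired up as `sc.binaryFiles(filePath).flatMap { case (_, stream) => ZipLines.lines(stream.open()) }`, which should give an `RDD[String]` of lines. Since a zip archive is not splittable, each 10 GB file would likely be read by a single task, so parallelism would come only from having many archives.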