
I have a test script that reads a text file provided as a parameter, shown below:

test.scala:

$ cat test.scala
import scala.io.Source

val filename = args(0)
for (line <- Source.fromFile(filename).getLines) {
    println(line)
}

I want to read the text file below:

$ cat test.txt
test1
test2
test3

I need to run the Scala script from the command line like below:

spark-shell -i test.scala test.txt

I expect test.txt to be recognized as args(0), but instead I see output like:

:26: error: not found: value args
       val filename = args(0)

Can anyone enlighten me on the right way to do this? Thank you very much.

UPDATE:

$ cat test.scala
import scala.io.Source

val args = spark.sqlContext.getConf("spark.driver.args").split(",")
val filename = args(0)

for (line <- Source.fromFile(filename).getLines) {
    println(line)
}

Test result: spark-shell -i test.scala --conf spark.driver.args="test.txt"

SQL context available as sqlContext.
Loading test.scala...
import scala.io.Source
<console>:26: error: not found: value spark
         val args = spark.sqlContext.getConf("spark.driver.args").split(",")

2 Answers


You can pass your custom --conf argument value to Spark. Here is how you can read your arguments:

import scala.io.Source

// spark.driver.args holds the comma-separated argument string set via --conf
val args = spark.sqlContext.getConf("spark.driver.args").split(",")
val arg1 = args(0)
val arg2 = args(1)
print(arg1)

In --conf, I have to pass the value of the spark.driver.args argument. So the final command to run the script will be:

spark-shell -i test.scala --conf spark.driver.args="param1value,param2value,param3value"
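
If the script may also be launched without the --conf flag, note that getConf has an overload that takes a default value, so a missing spark.driver.args does not fail with a NoSuchElementException. A minimal sketch of that, assuming Spark 2.x where spark is predefined in spark-shell:

// Fall back to an empty string when --conf spark.driver.args is not supplied
val rawArgs = spark.sqlContext.getConf("spark.driver.args", "")
val args = if (rawArgs.isEmpty) Array.empty[String] else rawArgs.split(",")

if (args.isEmpty) println("no arguments passed") else println(args.mkString(", "))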
Md Shihab Uddin
  • Thank you very much. Would this work for Spark 1.6? UPDATE made to the question. – mdivk Feb 06 '19 at 21:16
  • It works for Spark 2.3. `SparkSession` was introduced in Spark 2.0, so you'll get a `spark` not found error on versions below 2.0. In that case, first initialize an `SQLContext` object and it should then work. I would suggest updating your Spark environment to get more features and better performance. – Md Shihab Uddin Feb 07 '19 at 12:00

This works for me:

import scala.io.Source

// Spark 1.x: build an SQLContext from the predefined SparkContext (sc)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// Read the comma-separated argument string passed via --conf spark.driver.args
val args = sqlContext.getConf("spark.driver.args").split(",")

// Printing the array object directly would only show its reference; join it instead
println(args.mkString(","))
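
Combining this with the file-reading loop from the question gives a sketch of the complete script for Spark 1.6 (assuming sc is the SparkContext predefined by spark-shell); it is launched the same way, with spark-shell -i test.scala --conf spark.driver.args="test.txt":

import scala.io.Source

// Spark 1.x has no `spark` session object; build an SQLContext from sc instead
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// The file name arrives through --conf spark.driver.args
val args = sqlContext.getConf("spark.driver.args").split(",")
val filename = args(0)

// Print every line of the file, as in the original script
for (line <- Source.fromFile(filename).getLines) {
    println(line)
}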
Vijay Anand Pandian