This is my code:

val array = Array("Tom,9", "Amy,10")
val peopleRDD = sc.parallelize(array)
  .map(line => {
    People(line.split(",")(0), line.split(",")(1).toInt)
  })

import ssb.implicits._

val df = peopleRDD.toDF()
df.createOrReplaceTempView("people")

val ipMap = new mutable.HashMap[String, String]()

ssb.sql("""
    |select name from people
    |""".stripMargin)
  .foreach(x => {
    ipMap.put("aaa", x.apply(0).toString)
  })

for ((key, value) <- ipMap) {
  println("key is " + key + ", value is " + value)
}   

Why is the map empty, so that nothing is printed? How can I populate it correctly?

blackbishop
datagic
  • A copy of `ipMap` is created and sent to the executors. When you run `ipMap.put`, it is the copy of `ipMap` on the executor that gets updated, because the `foreach` runs on a distributed object (RDD). When you run `println`, control is back in the driver program, and the driver's `ipMap` is still empty with no entries in it. – philantrovert Dec 20 '21 at 09:46
  • Thank you for your answer. You're right. I need to use accumulators or broadcast variables. – datagic Dec 20 '21 at 10:22
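
For a result small enough to fit in driver memory, one way to act on the explanation above is to `collect()` the rows to the driver first and fill the map there, so that `put` runs in the same JVM as the `println`. The following is only a sketch under assumptions: the `People` case class is guessed from how the question uses it, the `CollectToDriver` object and `namesOnDriver` helper are hypothetical names, the session runs in local mode, and the query also selects `age` so that the map is keyed by name rather than by the constant `"aaa"`.

```scala
import org.apache.spark.sql.SparkSession
import scala.collection.mutable

object CollectToDriver {
  // Assumed shape of the `People` class that the question uses but does not show.
  case class People(name: String, age: Int)

  // Hypothetical helper: builds the temp view and returns a map filled on the driver.
  def namesOnDriver(spark: SparkSession): mutable.HashMap[String, String] = {
    import spark.implicits._

    val df = Seq("Tom,9", "Amy,10")
      .map { line =>
        val fields = line.split(",") // split once, reuse both fields
        People(fields(0), fields(1).toInt)
      }
      .toDF()
    df.createOrReplaceTempView("people")

    val ipMap = new mutable.HashMap[String, String]()

    // collect() returns an Array[Row] on the driver, so this foreach is a
    // plain local loop and mutates the driver's ipMap, not an executor copy.
    spark.sql("select name, age from people")
      .collect()
      .foreach(row => ipMap.put(row.getString(0), row.getInt(1).toString))

    ipMap
  }

  def main(args: Array[String]): Unit = {
    // Local mode is an assumption made to keep the sketch self-contained.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("collect-to-driver")
      .getOrCreate()
    try {
      for ((key, value) <- namesOnDriver(spark)) {
        println(s"key is $key, value is $value")
      }
    } finally {
      spark.stop()
    }
  }
}
```

Accumulators, as mentioned in the comment above, are the other route when the data should stay distributed; `collect()` is simply the shortest change when the query result is small.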
