
How do I traverse an Array inside a Spark RDD?

var dataResult: Array[Array[String]] = null

val data1 = hiveContext.sql("select id, lon, lat  from table1").rdd.map(
  row=>(row.getAs[String]("id"), row.getAs[Double]("lon"), row.getAs[Double]("lat"))
).map(
  u=>Array(u._1,u._2.toString,u._3.toString)
).toArray()

val data2= hiveContext.sql("select id2,lon,lat from  table2").rdd.map(
      row=>(row.getAs[String]("id2"), row.getAs[Double]("lon"), row.getAs[Double]("lat"))
    )

var data3 = data2.map(u=>{
        for (i <- (0 until data1.length-1)){
          if(u._2 + 0.2 >= data1(i)(1).toDouble && u._2 - 0.2 <= data1(i)(1).toDouble && u._3 + 0.2 >= data1(i)(2).toDouble && u._3 - 0.2 <= data1(i)(2).toDouble){
          dataResult ++= ArrayBuffer(Array(u._1,u._2.toString,u._3.toString,data1(i)(0).toString,data1(i)(1).toString,data1(i)(2).toString))
          }
        }
        dataResult.toArray
    })

The result of the code is:

 Array[String] = Array([Ljava.lang.String;@c0ae5d5, [Ljava.lang.String;@c0ae5d5......

However, I want:

Array[String]=Array(Array(uid1,3,4),Array(uid2,4,5).....)
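The `[Ljava.lang.String;@...` text is what the JVM prints for an array by default, so the inner arrays may already hold the expected values. A minimal sketch of rendering them in the desired shape, using plain Scala collections rather than an RDD; `PrintNestedArrays` and `render` are hypothetical names chosen for illustration:

```scala
object PrintNestedArrays {
  // Render an Array[Array[String]] as "Array(Array(a,b),Array(c,d))".
  def render(rows: Array[Array[String]]): String =
    rows.map(_.mkString("Array(", ",", ")")).mkString("Array(", ",", ")")

  def main(args: Array[String]): Unit = {
    // Sample values taken from the desired output in the question.
    val rows = Array(Array("uid1", "3", "4"), Array("uid2", "4", "5"))
    println(rows)         // JVM default representation (type tag plus hash)
    println(render(rows)) // prints Array(Array(uid1,3,4),Array(uid2,4,5))
  }
}
```

The same `mkString` call can be applied to each element after collecting an RDD of arrays to the driver.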
koiralo
  • Possible duplicate of [Scala - printing arrays](https://stackoverflow.com/questions/3328085/scala-printing-arrays) – user10938362 May 21 '19 at 13:54
  • The type of what you want appears to be `Array[Array[String]]` – emran May 21 '19 at 15:46
  • I tried foreach to replace the for loop; however, the result is the same. The println method is not what I want. Is there a way to access the values of the array for calculations inside an RDD? – wangjiweisean May 22 '19 at 13:30
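On accessing the array values inside the RDD: a `var` mutated inside a Spark closure is updated on the executors' copies, not on the driver, so accumulating into `dataResult` is unlikely to work as intended. Note also that `0 until data1.length-1` skips the last element of `data1`. One possible approach is to return the matching rows from the function and flatten them with `flatMap`. The sketch below uses plain Scala collections so it runs without Spark; `NearbyMatch`, `near`, and `matches` are hypothetical names, the sample data is made up, and `math.abs(a - b) <= 0.2` is assumed to be equivalent (up to floating-point rounding) to the four comparisons in the question.

```scala
object NearbyMatch {
  // Hypothetical helper: true when two coordinates differ by at most `tol`.
  def near(a: Double, b: Double, tol: Double = 0.2): Boolean =
    math.abs(a - b) <= tol

  // For one (id, lon, lat) record, return one output row per matching entry in `ref`.
  def matches(u: (String, Double, Double), ref: Array[Array[String]]): Seq[Array[String]] =
    ref.toSeq.collect {
      case r if near(u._2, r(1).toDouble) && near(u._3, r(2).toDouble) =>
        Array(u._1, u._2.toString, u._3.toString, r(0), r(1), r(2))
    }

  def main(args: Array[String]): Unit = {
    // Made-up sample data standing in for table1 and table2.
    val data1 = Array(Array("a", "1.0", "1.0"), Array("b", "5.0", "5.0"))
    val data2 = Seq(("x", 1.1, 0.9), ("y", 9.0, 9.0))

    // flatMap keeps only the matches and needs no shared mutable state.
    val result = data2.flatMap(u => matches(u, data1))
    result.foreach(r => println(r.mkString("Array(", ",", ")")))
  }
}
```

With an RDD, the same function could presumably be used as `data2.flatMap(u => matches(u, data1))`; if `data1` is large, wrapping it in a broadcast variable (`sc.broadcast(data1)` and reading `.value` inside the closure) may reduce the cost of shipping it to every task.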

0 Answers