I'm trying to create a DataFrame from some sequences, and to obtain these sequences I have the following implementation.

I have an initial DataFrame and I iterate over it with foreach. Depending on the values in the different columns of this DataFrame, I append values to a Seq. This sequence is initialized as empty before the loop.

At the end of the loop, the sequence is empty again. Why?

var seq1 = Seq.empty[Long]

df1.foreach(r => {
  if (r(2) < CountAw1p) {
    if (r(1) == "T-F") {
      countTF = countTF + r(2)
    } else if (r(1) == "T-D") {
      countTD = countTD + r(2)
    }
  } else {
    var rest = CountAw1p - (countTF + countTD)
    rest = r(2) - rest
    if (r(1) == "T-F") {
      countTF = countTF + rest
      seq1 = seq1 ++ Seq(countTF, countTD) // seq is full
    } else if (r(1) == "T-D") {
      countTD = countTD + rest
      seq1 = seq1 ++ Seq(countTF, countTD) // seq is full
    }
  }
})

val completInfo = Seq(seq1).toDF() // seq is empty
return completInfo
  • You can do df1.collect().foreach ... and this will fix the issue, but the computation will not be distributed. The problem is that your "seq1" var is currently filled on different nodes, while you are trying to use it in the driver program. Each node has its own copy of seq1 – Bogdan Vakulenko Oct 04 '18 at 11:51
  • It works. Thanks a lot! – ierrandonea Oct 04 '18 at 13:35
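
To illustrate the first comment: a minimal sketch of the same accumulation run over rows that are already local to the driver, where a plain var keeps its updates. The Entry case class, the buildSeq helper and the CollectSketch object are hypothetical names, not part of the question; the branching mirrors the loop in the question, with unknown labels simply ignored.

```scala
// Hypothetical stand-in for one collected row: label is r(1), count is r(2).
final case class Entry(label: String, count: Long)

object CollectSketch {
  // Same branching as the question's loop, but over a local collection,
  // so the vars below live in a single JVM and keep their updates.
  def buildSeq(rows: Seq[Entry], countAw1p: Long): Seq[Long] = {
    var countTF = 0L
    var countTD = 0L
    var seq1 = Seq.empty[Long]

    rows.foreach { r =>
      if (r.count < countAw1p) {
        r.label match {
          case "T-F" => countTF += r.count
          case "T-D" => countTD += r.count
          case _     => () // other labels are ignored, as in the question
        }
      } else {
        // Equivalent to: rest = CountAw1p - (countTF + countTD); rest = r(2) - rest
        val rest = r.count - (countAw1p - (countTF + countTD))
        r.label match {
          case "T-F" =>
            countTF += rest
            seq1 = seq1 ++ Seq(countTF, countTD)
          case "T-D" =>
            countTD += rest
            seq1 = seq1 ++ Seq(countTF, countTD)
          case _ => ()
        }
      }
    }
    seq1
  }
}
```

With Spark, the input could be built with something like df1.collect().map(r => Entry(r.getString(1), r.getLong(2))), assuming column 1 is a string and column 2 is a long. Since collect() brings every row to the driver, this only suits DataFrames small enough to fit in driver memory.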
