1

Why can't I change a var from inside a loop? Thanks.

    var grupo = "A"
    for (a <- dataframe) {
        grupo = "B"
    }
    println(grupo) // prints A

Edit:

The dataframe filterP looks like this when shown:

|   CODIGO|LISTA|NUMERO|OPCION|NUMERO|OP|VALOR|
+---------+-----+------+------+------+--+-----+
|110111001|    P|  0000|     A|  0000| 1|    1|
|110111001|    P|  0000|     A|  0000| 1|    1|
|110111001|    P|  0000|     A|  0000| 1|    2|
|110111001|    P|  0000|     A|  0000| 1|    3|
|110111001|    P|  0000|     A|  0000| 1|    1|
|110111001|    P|  0000|     B|  0000| 1|    2|

Code:

    var grupo = List(filterP.first()(3).toString())
    var grupo_tmp = grupo(0)
    println("first group:" + grupo(0))
    for (a <- filterP) {
        if (grupo_tmp != a(3).toString()) {
            println(grupo_tmp + "|" + a(3).toString())
            grupo = a(3).toString() :: grupo
            grupo_tmp = a(3).toString()
        }
    }
    println(grupo_tmp)
    println("Grupos de lista " + grupo.length)
    for (i <- 0 to grupo.length - 1) {
        println("grupo: " + grupo(i))
    }

This prints:

   first group:A
   A|B
   A
   Grupos de lista 1
   grupo: A

I don't see where the problem is.

J.M. P.R.
  • If `dataframe` lives in a distributed context such as Spark, always remember that the code may be executed on another machine. So never use var; use another API instead – jilen Jun 05 '17 at 10:23

2 Answers

2

It definitely has access to the var inside the loop.

  • Maybe you have a separate variable named grupo declared inside the loop, or
  • maybe the loop body never runs, meaning dataframe is empty.
  • Try printing a inside your loop to debug.

See the example:

scala> var grupo = "A"
grupo: String = A

scala> for (a <- Array("MUTATE-1", "MUTATE-2")) { grupo = a }

scala> grupo
res6: String = MUTATE-2
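
The behaviour above follows from how Scala treats a for loop without yield: it is rewritten into a call to foreach on the thing being iterated. A minimal sketch on a local collection, where the object and method names are made up for illustration:

```scala
object ForLoopDesugaring {
  // A for loop without yield...
  def lastViaFor(items: Seq[String]): String = {
    var grupo = "A"
    for (a <- items) { grupo = a }
    grupo
  }

  // ...is rewritten by the compiler into the equivalent foreach call.
  def lastViaForeach(items: Seq[String]): String = {
    var grupo = "A"
    items.foreach(a => grupo = a)
    grupo
  }

  def main(args: Array[String]): Unit = {
    val items = Seq("MUTATE-1", "MUTATE-2")
    println(lastViaFor(items))     // MUTATE-2
    println(lastViaForeach(items)) // MUTATE-2
  }
}
```

On a local collection the foreach body runs in the same JVM, so the var is updated. Whether the same holds for `dataframe` depends on what its own foreach does with the function it receives.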
prayagupa
  • Your example works. I do not understand what may be happening. I'm going to edit the question with the original code. Thank you very much – J.M. P.R. Jun 05 '17 at 09:58
  • @J.M.P.R. can you post the value of `dataframe`? Try putting `println(a)` inside the loop as well – prayagupa Jun 05 '17 at 10:03
  • So `dataframe` needs to be collected before you loop over it. See the example at https://stackoverflow.com/a/33031744/432903 – prayagupa Jun 05 '17 at 20:48
1

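I am wondering how your for loop behaves without collecting the filterP dataframe. A for loop without yield is translated into a foreach call, and on a dataframe that body runs on the executors against copies of your variables, so the assignments to grupo and grupo_tmp never reach the variables on the driver.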

Dataframes are partitioned by default, and my guess is that your dataframe is partitioned and you are getting only partial output.

Collecting the dataframe to the driver should solve the issue:

var grupo = List(filterP.first()(3).toString())
var grupo_tmp = grupo(0)
println("first group:" + grupo(0))
for (a <- filterP.collect) { //collect the dataframe to the driver
  if(grupo_tmp != a(3).toString()){
    println(grupo_tmp + "|" + a(3).toString())
    grupo = a(3).toString() :: grupo
    grupo_tmp = a(3).toString()
  }
}
println(grupo_tmp)
println("Grupos de lista "+grupo.length)
for(i <- 0 to grupo.length-1){
  println("grupo: "+ grupo(i))
}

I get the following output:

first group:A
A|B
B
Grupos de lista 2
grupo: B
grupo: A
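
Once the values are on the driver, the same grouping can also be written without any var. This is only a sketch: the `GroupChanges` object and its `runs` helper are hypothetical names, the input is a plain Seq standing in for the collected OPCION column, and it assumes the collected rows arrive in the order you care about.

```scala
object GroupChanges {
  // Keeps one entry per run of equal consecutive values, in encounter order.
  def runs(values: Seq[String]): List[String] =
    values.foldLeft(List.empty[String]) {
      case (acc @ (last :: _), v) if last == v => acc // same group, nothing new
      case (acc, v)                            => v :: acc // group changed
    }.reverse

  def main(args: Array[String]): Unit = {
    // Stand-in for driver-side data. With Spark this could come from
    // something like (assumption, not in the original code):
    //   filterP.select("OPCION").collect().map(_.getString(0)).toSeq
    val opcion = Seq("A", "A", "A", "A", "A", "B")

    val grupos = runs(opcion)
    println("Grupos de lista " + grupos.length)
    grupos.foreach(g => println("grupo: " + g))
  }
}
```

Unlike the loop above, which prepends and therefore lists the newest group first, this version returns the groups in the order they were encountered.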
Ramesh Maharjan