
I am new to Scala and want to debug this piece of code to see why I do not get any results.

>  def main(args:Array[String]){
>     Logger.getLogger("org").setLevel(Level.ERROR)
>     val sc = new SparkContext("local[*]","WordCountRe")
>     val input = sc.textFile("data/book.txt")
>     //With regexp
>     val words = input.flatMap(x=>x.split("\\W+"))
>     //Lower case
>     val lowerCaseWords = words.map(x => x.toLowerCase())
>     val wordCounts = lowerCaseWords.map(x => (x,1)).reduceByKey((x,y)=>x+y)
>     val sortedWordCounts = wordCounts.sortBy(-_._2)
>     val commonEnglishStopWords = List("you","to","your","the","a","of","and","that","it","in","is","for","on","are","if","s","i","with","t","this","or","but","they","will","what","at","my","re","do","not","about","more","an","up","need","them","from","how","there","out","new","work","so","just","don","","get","their","by","some","ll","self","make","may","even","when","one","than","also","much","job","who","was","these","find","into","only")
>     val filteredWordCounts = sortedWordCounts.filter{
>       x =>
>         val inspectVariable = commonEnglishStopWords.contains(x._1)} //Error here
>     filteredWordCounts.collect().foreach(println)   } }

When I try to compile this code, I get the following error:

type mismatch; found : Unit required: Boolean WordCountRe.scala /SparkScalaCourse/src/com/sundogsoftware/spark line 29 Scala Problem

The thread "How to find data inside a rdd" seems to contain the solution I tried to apply, but I must be using it wrong.

Thank you for your help

EDIT: I found what was wrong with my code (I needed to put a ._1 inside the contains call to extract the word from the (word, count) tuple), but I still don't know how to debug or inspect values in such a situation.

Imad

1 Answer


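The problem is that the last statement in your filter block assigns the Boolean result of contains to a val inspectVariable. A val definition is a statement, not an expression, so a block that ends with one evaluates to Unit. The filter method, however, requires a function that returns Boolean, hence the type mismatch.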
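To make the Unit-versus-Boolean point concrete, here is a minimal sketch in plain Scala with no Spark dependency. The object name and the tiny stop-word list are made up for illustration; only the shape of the two blocks matters.

```scala
object UnitVsBoolean {
  // Hypothetical, shortened stop-word list (illustration only).
  val stopWords: List[String] = List("the", "a", "of")

  // The block ends with a definition, so its value is () of type Unit.
  val endsWithVal: Unit = {
    val hit = stopWords.contains("the")
  }

  // The block ends with the expression `hit`, so its value is a Boolean.
  val endsWithExpr: Boolean = {
    val hit = stopWords.contains("the")
    hit
  }

  def main(args: Array[String]): Unit = {
    println(endsWithVal)  // prints ()
    println(endsWithExpr) // prints true
  }
}
```

The function literal passed to filter behaves the same way: its result type is whatever its last expression evaluates to.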

Just remove val inspectVariable = and this should fix it.

Alternatively, keep the val and return it by adding a line that contains only inspectVariable after the assignment, as shown here:

val filteredWordCounts = sortedWordCounts.filter { x =>
  val inspectVariable = commonEnglishStopWords.contains(x._1)//put your breakpoint here
  inspectVariable
}
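For the general question of inspecting values inside such a lambda, a hedged sketch follows. It uses a plain List in place of the RDD so that it runs without Spark. The object, method, and sample data are hypothetical. The negation is also an assumption: it presumes the goal is to drop stop words, whereas the predicate in the question keeps only the stop words.

```scala
object InspectInFilter {
  // Hypothetical, shortened stop-word set (illustration only).
  val stopWords: Set[String] = Set("the", "a", "of")

  // Hypothetical sample data standing in for sortedWordCounts.
  val sampleCounts: List[(String, Int)] =
    List(("the", 10), ("whale", 4), ("a", 7), ("sea", 3))

  // Keeps the pairs whose word is NOT a stop word, printing each decision.
  def keepNonStopWords(counts: List[(String, Int)]): List[(String, Int)] =
    counts.filter { case (word, count) =>
      val isStopWord = stopWords.contains(word) // set a breakpoint here
      println(s"word=$word count=$count isStopWord=$isStopWord")
      !isStopWord // the last expression is the Boolean that filter needs
    }

  def main(args: Array[String]): Unit =
    keepNonStopWords(sampleCounts).foreach(println)
}
```

The same lambda body can be passed to filter on an RDD. With a local[*] master the tasks run inside the driver JVM, so breakpoints are hit and println output appears in the console. On a cluster, that output goes to the executor logs instead. To peek at a few elements without collecting everything, sortedWordCounts.take(10).foreach(println) is another option.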
Ivan Stanislavciuc