The following syntax:

def func0(x: Int => Int, y: Int)(in: DataFrame): DataFrame = {
    in.filter('col > x(y))
}  

I cannot make sense of the 'col here. Using the string "col" in this filter does not work, whereas "col" works fine in the following code:

def func1(x: Int)(in: DataFrame): DataFrame = {
    in.selectExpr("col", s"col + $x as col1")
}

What does 'col signify?

The DataFrame example has only one column, col. What if there are two or three columns? I am clearly missing something here, and something tells me it is very simple.

Jacek Laskowski
thebluephantom

1 Answer

'col is a way to refer to a column named col, the same as $"col" or col("col"). Having a column that is itself named col is a bit confusing.

It works for me in Spark 2.3.

EXAMPLE WITH A COLUMN NAMED NUMBER

df.show
+------+------+
|letter|number|
+------+------+
|     a|     1|
|     b|     2|
+------+------+

df.filter('number >1).show
+------+------+
|letter|number|
+------+------+
|     b|     2|
+------+------+

The same works with the other expressions:

import spark.implicits._
df.filter($"number" >1).show

import org.apache.spark.sql.functions.col
df.filter(col("number") >1).show
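
To tie this back to func0 in the question, here is a minimal sketch of the same curried function applied to a DataFrame with more than one column. It assumes Spark is on the classpath and a local session. The names ColumnRefSketch, filterGreater and the colName parameter are made up for illustration and are not in the original code. The point is that each column is referenced by its own name, so extra columns change nothing. The 'number form also needs import spark.implicits._, and symbol literals are deprecated in Scala 2.13, so the sketch uses col(...) and $"..." instead.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ColumnRefSketch {
  // Same shape as func0 in the question, but the column name is passed in.
  // colName is a hypothetical parameter added for this sketch.
  def filterGreater(colName: String, x: Int => Int, y: Int)(in: DataFrame): DataFrame =
    in.filter(col(colName) > x(y))

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session is acceptable for the example.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("column-ref-sketch")
      .getOrCreate()
    import spark.implicits._ // brings in toDF, $"..." and the 'symbol conversion

    val df = Seq(("a", 1), ("b", 2)).toDF("letter", "number")

    // The curried second parameter list lets the function plug into transform.
    // Here x(y) = 0 + 1 = 1, so the filter is number > 1.
    df.transform(filterGreater("number", _ + 1, 0)).show()

    // With two or three columns, refer to each one by name in the condition.
    df.filter($"number" > 1 && $"letter" === "b").show()

    spark.stop()
  }
}
```

With the two-row DataFrame from the example above, both filters keep only the row for letter b.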
Jacek Laskowski
SCouto