I expect the following code to output "b" and null, since neither equals the string "a". However, Spark only outputs "b". To get the null row in the output, I have to explicitly include $"word".isNull in the filter.
val df = Seq(("a"),("b"),(null)).toDF("word")
df.filter($"word".notEqual("a")).show()
output:
+----+
|word|
+----+
|   b|
+----+
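For reference, the workaround I mentioned looks roughly like the sketch below. It wraps the same filter in a standalone program; the object name `NullFilterSketch`, the local master setting, and the app name are only assumptions for illustration (in spark-shell, `spark` and the implicits are already available).

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object, used only to make the sketch self-contained.
object NullFilterSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session; in spark-shell this already exists as `spark`.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("null-filter-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("a"), ("b"), (null)).toDF("word")

    // Keep rows where word differs from "a", plus rows where word is null.
    df.filter($"word".notEqual("a") || $"word".isNull).show()

    spark.stop()
  }
}
```

With the extra isNull condition, both the "b" row and the null row come back.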
What am I missing about how Spark DataFrames treat nulls?