I have started learning Scala on Spark and am trying to do some ETL. I want to filter a DataFrame so that it keeps only the rows whose string column splits into exactly 4 parts on whitespace (any run of whitespace characters).
This is what I have tried:
df.where(split(df("item"), "\\s+").length == 4).show()
It fails with the error: value length is not a member of org.apache.spark.sql.Column
I looked at the documentation and found that split returns a Column, so it does not have a length attribute.
I am stuck and don't know how to solve this. I have googled it but only found this.
What I want is to filter based on the length of the split result. I also tried to look up what split returns, but I don't know which split function is being used.
So could you please tell me:
1. Which split function is being used here?
2. How do I filter the rows whose split string has length 4?
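For reference, here is the direction I suspect might work, as a minimal sketch only. It assumes the split in my snippet is org.apache.spark.sql.functions.split (which produces an array Column) and that functions.size can be used to count the elements, with === for the Column comparison instead of ==. The object name, the helper withFourTokens, the local SparkSession and the sample rows are all made up for illustration; only the column name "item" comes from my code.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, size, split}

object SplitLengthFilter {
  // Hypothetical helper: keep rows whose "item" column splits into exactly 4 parts.
  def withFourTokens(df: DataFrame): DataFrame = {
    // split(...) gives an array<string> Column, not a Scala Array,
    // so the element count has to come from functions.size, which is also a Column.
    val tokenCount = size(split(col("item"), "\\s+"))
    // Column equality is ===; plain == would compare the Column object to an Int.
    df.where(tokenCount === 4)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption, not my real setup).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("split-length-filter")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data.
    val df = Seq("a b c d", "a b c", "a  b\tc d", "a b c d e").toDF("item")

    withFourTokens(df).show()
    spark.stop()
  }
}
```

If this is right, leading or trailing whitespace in the string would probably produce empty elements in the array and change the count, so the column may need trimming first.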
Thanks