
We have tried wrapping the column name in brackets ([column name]), in single and double quotes, and in backticks; none of them work.

Does Spark SQL support columns whose name contains spaces?

blackbishop
DarkZero

2 Answers


Backticks seem to work just fine:

scala> val df = sc.parallelize(Seq(("a", 1))).toDF("foo bar", "x")
df: org.apache.spark.sql.DataFrame = [foo bar: string, x: int]

scala> df.registerTempTable("df")

scala> sqlContext.sql("""SELECT `foo bar` FROM df""").show
+-------+
|foo bar|
+-------+
|      a|
+-------+

Same as DataFrame API:

scala> df.select($"foo bar").show
+-------+
|foo bar|
+-------+
|      a|
+-------+

So it looks like it is supported, although I doubt it is recommended.

zero323
  • If the column is nested, you can select it like """SELECT parent.`foo bar` ...""" – spats Apr 09 '19 at 04:12
  • Thank you, it worked. Backticks (`) do the job. – MukeshKoshyM Jun 26 '20 at 19:54
  • 1) What Spark version do you use? Looks like an old one. 2) With `sc.parallelize` it works, but when I tried `spark.read.schema(schema).parquet("D:\\path")` I got an error even though I specified the schema with spaces. – Cherry Oct 06 '20 at 18:02

Instead of using brackets as in T-SQL ([column name]), wrap the column name in backticks: `column name`. This applies when you run SQL directly. In Spark SQL you also wrap the column name in backticks, but inside a triple-quoted string, as shown in zero323's answer.

SaadK