0
avgsalary_df = spark.read.csv('/content/drive/MyDrive/BigData2021/Lecture23/datasets/data_scientist_salaries.csv', header = True)
avgsalary_df = df.selectExpr('Job Title' ,'Location', 'salary', 'spark')
avgsalary_df.show()

Here is my code, but it won't return the Job Title column because of the space in the name. What is incorrect?

sasaii
  • By the way, just using `df.select(...)` instead of `selectExpr` would work fine. However, I'd recommend renaming that column; in general, avoid spaces or special characters in column names. – blackbishop Jan 09 '22 at 09:43
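
The renaming suggested in the comment above could be sketched as follows. This is a minimal sketch, not part of the original thread: `clean_name` and `with_clean_names` are hypothetical helper names, and the lowercase-with-underscores convention is only one possible choice.

```python
import re


def clean_name(name: str) -> str:
    """Turn a header such as 'Job Title' into 'job_title' (hypothetical helper)."""
    # Replace each run of characters other than letters, digits, or underscores
    # with a single underscore, then trim stray underscores and lowercase.
    return re.sub(r"[^0-9A-Za-z_]+", "_", name.strip()).strip("_").lower()


def with_clean_names(df):
    """Return df with every column renamed through clean_name (hypothetical helper)."""
    # DataFrame.toDF(*names) returns a new DataFrame with the given column names, in order.
    return df.toDF(*[clean_name(c) for c in df.columns])


# Usage, assuming `avgsalary_df` is the DataFrame read from the CSV in the question:
# avgsalary_df = with_clean_names(avgsalary_df)
# avgsalary_df.selectExpr('job_title', 'location', 'salary', 'spark').show()
```

After renaming, `selectExpr` needs no quoting because no name contains a space. Two different headers can collapse to the same cleaned name, so it is worth checking the result for duplicates.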

1 Answer

2

You need to quote column names that contain spaces with backticks (`).

avgsalary_df = df.select(['`Job Title`', 'Location', 'salary', 'spark'])
过过招
  • I tried this, but I get an invalid syntax error here. – sasaii Jan 09 '22 at 09:05
  • 1
    Actually, there is no need to use backticks with the DataFrame API, only when using SQL. `df.select(*['Job Title', 'Location', 'salary', 'spark'])` would work as well. The OP got that error because they used `selectExpr`, not `select`.  – blackbishop Jan 09 '22 at 09:39
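
To make the distinction in these comments concrete: `selectExpr` parses each string as a SQL expression, so a name with a space must be backtick-quoted, while `select` takes plain column names. Below is a minimal sketch of quoting names before passing them to `selectExpr`. The helpers `quote_col` and `select_by_expr` are hypothetical names introduced for illustration, not part of PySpark.

```python
def quote_col(name: str) -> str:
    """Wrap a column name in backticks for use in a Spark SQL expression."""
    # Spark SQL escapes a literal backtick inside a quoted identifier by doubling it.
    return "`" + name.replace("`", "``") + "`"


def select_by_expr(df, columns):
    """Select columns through selectExpr, quoting each name first."""
    return df.selectExpr(*[quote_col(c) for c in columns])


# Usage, assuming `avgsalary_df` is the DataFrame read from the CSV in the question:
# result = select_by_expr(avgsalary_df, ['Job Title', 'Location', 'salary', 'spark'])
# result.show()
#
# Equivalent without quoting, since select takes column names rather than SQL text:
# avgsalary_df.select('Job Title', 'Location', 'salary', 'spark').show()
```

Quoting every name, not just the ones with spaces, keeps the helper simple and is harmless for ordinary names such as `Location`.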