
I have a data frame with columns

  • patient_id,
  • DOB,
  • Gender,
  • marital_status,
  • smoking_status,
  • city

I need to extract age from the column DOB and add a new column age to my data frame. How can I proceed using Scala?

Alice
  • Please provide sample input data, the expected output data frame, and the code you have tried. – Ramesh Maharjan Sep 05 '17 at 09:26
  • Yes, and add the output of printSchema, so we know whether DOB is a string, a java.sql.Date, or a timestamp. Thanks. – tricky Sep 05 '17 at 09:28
  • Maybe this can help you: [Link about UDF solution](https://stackoverflow.com/questions/32484068/convert-date-of-birth-into-age-in-spark-dataframe-api) – tricky Sep 05 '17 at 09:29

1 Answer

val df = sqlcontext.sql(" SELECT *, DATEDIFF(hour,DOB,"+GETDATE()+")/8766 AS AgeYearsIntTrunc")
toofrellik
  • Why is it divided by 8766? Can you please explain? – Alice Sep 05 '17 at 09:48
  • It assumes 8766 hours per year, which works out to 365.25 days (8766 / 24). Since no actual year has 365.25 days (the .25 accounts for leap years), this will be incorrect near the person's birth date more often than it is correct. If you need an accurate result, refer to [this](https://stackoverflow.com/questions/57599/how-to-calculate-age-in-t-sql-with-years-months-and-days) – toofrellik Sep 05 '17 at 10:01
  • When I run this query I get an error: not found: value GETDATE – Alice Sep 07 '17 at 09:03
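
A note on that error, with a possible alternative. In the snippet above, GETDATE() sits outside the string literal, so the Scala compiler looks for a Scala value named GETDATE before Spark ever sees the query. The query also has no FROM clause, and the three-argument DATEDIFF(hour, ...) is T-SQL syntax; the two-argument Spark form, datediff(endDate, startDate), returns days. What follows is a minimal sketch rather than the answerer's method: it swaps the hours-divided-by-8766 approximation for whole months divided by 12, using Spark's built-in months_between and current_date. It assumes DOB is a date, a timestamp, or a string in yyyy-MM-dd form. The object name AgeFromDob, the helper withAge, and the sample rows are hypothetical and only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, current_date, floor, months_between, to_date}

object AgeFromDob {

  // Hypothetical helper: adds an integer `age` column holding the number of
  // whole years between DOB and today.
  // Assumption: DOB is a date/timestamp column, or a string like "1990-03-15".
  def withAge(df: DataFrame): DataFrame =
    df.withColumn(
      "age",
      // months_between(end, start) counts months; dividing by 12 and flooring
      // gives completed years. to_date drops any time-of-day part of DOB.
      floor(months_between(current_date(), to_date(col("DOB"))) / 12).cast("int")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("age-from-dob")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows; only two of the question's columns are shown.
    val patients = Seq((1, "1990-03-15"), (2, "2000-12-31")).toDF("patient_id", "DOB")

    withAge(patients).show()
    spark.stop()
  }
}
```

If you prefer to stay with SQL, the same expression should work inside sqlContext.sql once the data frame is registered as a temporary view and the query selects FROM that view. If DOB is stored as a string in another format, it would need to be parsed with the matching pattern first.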