I have read in a JSONL file, created a temporary view from it, and filtered it down to the records that I want to amend.
val df = session.read.json("tiny.jsonl")
df.createOrReplaceTempView("tempTable")
val filter = df.select("*").where("field IS NOT NULL")
This is the part where I am stuck. I want to overwrite a column called "time" with the current timestamp before I write the data back. Sometimes I will want to offset that value, for example to the current timestamp minus 5 days.
val change = filter.withColumn("server_time", date_add(current_timestamp(), -1))
The example above returns a date one day before today, rather than a timestamp.
Edit: a sample DataFrame that mocks my JSONL input:
val df = Seq(
(1, "fn", "2018-02-18T22:18:28.645Z"),
(2, "fu", "2018-02-18T22:18:28.645Z"),
(3, null, "2018-02-18T22:18:28.645Z")
).toDF("id", "field", "time")
Expected output:
+---+-----+------------------------+
|id |field|time                    |
+---+-----+------------------------+
|1  |fn   |2018-04-09T22:18:28.645Z|
|2  |fu   |2018-04-09T22:18:28.645Z|
+---+-----+------------------------+
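For reference, here is a minimal sketch of one approach that might give a timestamp rather than a date: subtract an interval from current_timestamp() instead of calling date_add, which returns a DateType. The object name ShiftTimestampSketch, the helper shiftTime, the daysBack parameter and the local[1] session are assumptions made for illustration and are not part of my real job. The sketch leaves the column as a TimestampType and does not format it as an ISO-8601 string.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, current_timestamp, expr}

object ShiftTimestampSketch {

  // Hypothetical helper: keep rows where "field" is set and overwrite "time"
  // with the current timestamp shifted back by `daysBack` days.
  def shiftTime(df: DataFrame, daysBack: Int): DataFrame =
    df.where(col("field").isNotNull)
      // Timestamp minus interval stays a TimestampType;
      // date_add(...) would truncate it to a DateType.
      .withColumn("time", current_timestamp() - expr(s"INTERVAL $daysBack DAYS"))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption).
    val session = SparkSession.builder()
      .master("local[1]")
      .appName("shift-timestamp-sketch")
      .getOrCreate()
    import session.implicits._

    // Same mock data as in the sample DataFrame above.
    val df = Seq(
      (1, "fn", "2018-02-18T22:18:28.645Z"),
      (2, "fu", "2018-02-18T22:18:28.645Z"),
      (3, null, "2018-02-18T22:18:28.645Z")
    ).toDF("id", "field", "time")

    val changed = shiftTime(df, daysBack = 5)
    changed.printSchema()
    changed.show(truncate = false)

    session.stop()
  }
}
```

Passing daysBack = 0 should give the plain current timestamp, so the same helper would cover both cases described above.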