I am trying to take the minimum of a column, truncate the result of dividing it by 1000000, and prefix "20" to it depending on whether the truncated value equals 1. I tried to use math.trunc for the truncation, but it raises an AttributeError:

Traceback (most recent call last): File "", line 6, in AttributeError: trunc

This is what I have tried:

lspf_ret.groupBy(col("lsbrnm"),col("lsdlp"),col("lsdlr")).agg({'lsdte':'min'}).select(
    lspf_ret.lsbrnm,
    lspf_ret.lsdlp,
    lspf_ret.lsdlr,
    datediff(lit('2019-02-28 00:00:00').cast(TimestampType()),
    concat(when( lit(math.trunc(col("min(lsdte)")/1000000)) ==1,"20").otherwise(''),
           when(math.trunc(((col("min(lsdte)")/1000000)==0,"19").otherwise("")),
           right(left(col("min(lsdte)").cast(StringType()),3),2),
    lit('-'),
    left(right(col("min(lsdte)").cast(StringType()),4),2),
    lit('-'),
    right(col("min(lsdte)").cast(StringType()),2),
    lit("00:00:00")
    ).cast(TimestampType()))))

Here is the input:

+------+-----+-------------+----------+
|lsbrnm|lsdlp|        lsdlr|min(lsdte)|
+------+-----+-------------+----------+
|  0135|  HP2|1061129000003|   1120929|
|  2266|  EF4| 160301212861|   1180224|
|  2266|  EF4| 170901365509|   1190225|
|  2266|  EF4| 180201399924|   1190225|
|  2266|  EF4| 180401421697|   1190201|
|  2266|  EF4| 131201027045|   1171130|
|  2266|  EF4| 140901072492|   1170301|
|  2266|  EF4|   1812M03566|   1190225|
+------+-----+-------------+----------+
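
For context on the error: math.trunc works on plain Python numbers, while col("min(lsdte)") is a lazily evaluated Column expression, so a column-level function such as floor from pyspark.sql.functions is the usual substitute. Below is a minimal sketch of that idea, assuming the values are encoded as CYYMMDD (a leading 1 meaning the 2000s and 0 the 1900s, which is what the attempted code suggests). The helper names cyymmdd_to_date and days_before_cutoff_column are hypothetical, and the plain-Python function is only there to make the arithmetic easy to check without a Spark session.

```python
import datetime


def cyymmdd_to_date(value):
    """Decode an integer such as 1120929 (assumed C YY MM DD) into a date."""
    # Integer division truncates for non-negative values, like math.trunc(value / 1000000).
    century = 20 if value // 1000000 == 1 else 19
    yy = value // 10000 % 100
    mm = value // 100 % 100
    dd = value % 100
    return datetime.date(century * 100 + yy, mm, dd)


def days_before_cutoff_column(col_name="min(lsdte)", cutoff="2019-02-28"):
    """Build a Column with the number of days between the decoded date and the cutoff.

    Needs an active Spark session when called; pyspark is imported here so that
    the plain-Python helper above stays usable without Spark installed.
    """
    from pyspark.sql import functions as F

    c = F.col(col_name)
    # floor() runs on the column; math.trunc() cannot be applied to a Column.
    century = F.when(F.floor(c / 1000000) == 1, F.lit("20")).otherwise(F.lit("19"))
    # Keep the last six digits (YYMMDD), zero-padded so that year 00-09 survives.
    yymmdd = F.lpad((c % 1000000).cast("string"), 6, "0")
    decoded = F.to_date(F.concat(century, yymmdd), "yyyyMMdd")
    return F.datediff(F.to_date(F.lit(cutoff)), decoded)


# Plain-Python check against the first sample row.
first = cyymmdd_to_date(1120929)
print(first, (datetime.date(2019, 2, 28) - first).days)
```

With a DataFrame shaped like the input above (called agg_df here purely for illustration), the column could be attached with something like agg_df.withColumn("days", days_before_cutoff_column()). Note that floor and truncation only differ for negative numbers, which do not occur in this data.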
user1584253
  • can you not map to a UDF? – Chris Apr 10 '19 at 14:35
  • Please give us a [reproducible](https://stackoverflow.com/questions/48427185/how-to-make-good-reproducible-apache-spark-examples) example of your data and your expected output. – cronoik Apr 10 '19 at 14:37
