
I'm confused about why a type that implements Comparable isn't "implicitly comparable", and also why certain forms of sortWith won't compile at all:

// The iter is a collection of records, each event_time is a java.sql.Timestamp

// A. Works but won't sort eq millis
val records = iter.toArray.sortWith(_.event_time.getTime < _.event_time.getTime)

// B. Doesn't compile, implements comparable yet not implicitly comparable?
val records = iter.toArray.sortBy(r => (r.event_time))

// C. Doesn't compile, complains about the 2nd _ though it's like the first line
val records = iter.toArray.sortWith(_.event_time.before(_.event_time))

// D. Doesn't compile, also complains though it's more like the first line
val records = iter.toArray.sortWith(_.event_time.compareTo(_.event_time) < 0)

// E. Works, sorts by epoch millis, then nanos after the milli.
val records = iter.toArray.sortBy(r =>
    (r.event_time.getTime, r.event_time.getNanos))
// But this leaves questions about the above uses of sortWith which don't work

Given that the first line works with two underscores, I don't understand why the compiler complains about the other uses of sortWith. Is there some syntax that can use the before method? And can line B somehow make use of the fact that Timestamp is Comparable?

Maybe I should note I'm using Scala 2.11.12 with JDK 8, because it's for a specific version of Spark on a deployment of AWS EMR.

I can see that line B compiles and works with Scala 2.13.6. With 2.11.12, however, I get:

No implicit Ordering defined for java.sql.Timestamp.
not enough arguments for method sortBy: (implicit ord: scala.math.Ordering[java.sql.Timestamp])List[Playground.Foo].
Unspecified value parameter ord.

That is the error captured from 2.11.12; I'm unclear how to satisfy what it's requesting.

Then the error line C gives me is:

missing parameter type for expanded function ((x$5) => x$5.event_time.before(((x$6) => x$6.event_time)))

While line D is similar with:

missing parameter type for expanded function ((x$7) => x$7.event_time.compareTo(((x$8) => x$8.event_time)).$less(0))

As noted by an answer below, C and D can be accomplished with another syntax:

val records = iter.toArray.sortWith{case(a,b)=>
              a.event_time.before(b.event_time)} // C
val records = iter.toArray.sortWith{case(a,b)=>
              a.event_time.compareTo(b.event_time) < 0} // D

That fixes every variant shown except line B in 2.11.12 (see https://scastie.scala-lang.org/xAwV3FheQWu8n7OoaZaQGA).

dlamblin

2 Answers

// Works but won't sort eq millis
val records = iter.toArray.sortWith(_.event_time.getTime < _.event_time.getTime)

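This is because Timestamp.getTime returns whole milliseconds only. The sub-millisecond part of the nanos is dropped, so two timestamps within the same millisecond compare as equal.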

// Doesn't compile, implements comparable yet not implicitly comparable?
val records = iter.toArray.sortBy(r => (r.event_time))

Not sure why you think it does not work. Works fine for me ... (BTW, you have too many ugly parentheses there. This is better written as sortBy(_.event_time).)

// Doesn't compile, complains about the 2nd _ though it's like the first line
val records = iter.toArray.sortWith(_.event_time.before(_.event_time))

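Yes, these underscore placeholders only work at the immediate scope level: each one expands into a function over the smallest enclosing expression, and a method argument is such an expression. The expression above therefore translates to something like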

   .sortWith(x => x.event_time.before(y => y.event_time))

which is, of course, invalid. See here for the description of how underscores are interpreted within context. When in doubt, just declare the variable explicitly:

.sortWith { case(a,b) => a.event_time.before(b.event_time) }

// Doesn't compile, also complains though it's more like the first line
val records = iter.toArray.sortWith(_.event_time.compareTo(_.event_time) < 0)

Same reason and same fix as above.
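To make the placeholder scoping concrete, here is a minimal sketch in script/worksheet style (wrap it in an object if you compile it with scalac). The names Rec, early, late, byMillis and byBefore are hypothetical and only stand in for the question's records.

```scala
import java.sql.Timestamp

// Hypothetical record type standing in for the rows in the question.
case class Rec(event_time: Timestamp)

val early = Rec(new Timestamp(1000L))
val late  = Rec(new Timestamp(2000L))

// Both placeholders are operands of the same `<`, so the whole
// comparison becomes one two-argument function: (a, b) => a... < b...
val byMillis: (Rec, Rec) => Boolean =
  _.event_time.getTime < _.event_time.getTime

// In `_.event_time.before(_.event_time)` the second placeholder sits
// inside the argument of before(...), so it would expand to a separate
// one-argument function there. Naming the parameters avoids that.
val byBefore: (Rec, Rec) => Boolean =
  (a, b) => a.event_time.before(b.event_time)

val sortedRecs = List(late, early).sortWith(byBefore)
```

Both function values can be passed to sortWith; only the second form is needed when a placeholder would otherwise end up inside a method argument.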

Dima
  • I guess I should have said that, because I'm working with a specific EMR and Spark build, I'm using Scala 2.11 with JDK 8; maybe a later update resolved some of these. Thanks for the case(a,b) suggestion. – dlamblin Jun 13 '21 at 00:55

Providing an implementation of compare by extending Ordered for the record type is one workaround (in Scala 2.11.12) for the issue with line B:

import java.sql.Timestamp

case class Record(event_time: Timestamp) extends Ordered[Record] {
  def compare(that: Record) = this.event_time.compareTo(that.event_time)
}

def correct(records: List[Record]) = (
  if ((records.head compare records.last) < 0) " C" else " Inc"
) + "orrectly ordered"

val now1 = new Timestamp(System.currentTimeMillis())
now1.setNanos(999000001)
val now2 = new Timestamp(now1.getTime())
now2.setNanos(999999999)
val records_in:List[Record] = new Record(now2) :: new Record(now1) :: Nil

// A. Works but won't sort eq millis
val records_outA = records_in.sortWith(_.event_time.getTime < _.event_time.getTime)
println("A: " + records_outA + correct(records_outA))

// B. Doesn't compile, implements comparable yet not implicitly comparable?
//val records_outB = records_in.sortBy(r => (r.event_time))
// B. Works by relying on the Record class extending Ordered
val records_outB = records_in.sorted
println("B: " + records_outB + correct(records_outB))

// C. Doesn't compile, complains about the 2nd _ though it's like the first line
//val records_outC = records_in.sortWith(_.event_time.before(_.event_time))
// C. Works by changing argument to use explicitly named parameters
val records_outC = records_in.sortWith((a,b)=>a.event_time.before(b.event_time))
println("C: " + records_outC + correct(records_outC))

// D. Doesn't compile, also complains though it's more like the first line
//val records_outD = records_in.sortWith(_.event_time.compareTo(_.event_time) < 0)
// D. Works by also changing the argument to a form without placeholder parameters
val records_outD = records_in.sortWith{case(a,b)=>a.event_time.compareTo(b.event_time) < 0}
println("D: " + records_outD + correct(records_outD))

// E. Works, sorts by epoch millis, then nanos after the milli.
val records_outE = records_in.sortBy(r =>
    (r.event_time.getTime, r.event_time.getNanos))
println("E: " + records_outE + correct(records_outE))

The above produces:

A: List(Record(2021-06-14 23:19:12.999999999), Record(2021-06-14 23:19:12.999000001)) Incorrectly ordered
B: List(Record(2021-06-14 23:19:12.999000001), Record(2021-06-14 23:19:12.999999999)) Correctly ordered
C: List(Record(2021-06-14 23:19:12.999000001), Record(2021-06-14 23:19:12.999999999)) Correctly ordered
D: List(Record(2021-06-14 23:19:12.999000001), Record(2021-06-14 23:19:12.999999999)) Correctly ordered
E: List(Record(2021-06-14 23:19:12.999000001), Record(2021-06-14 23:19:12.999999999)) Correctly ordered
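Another option that keeps line B's sortBy form is to put an implicit Ordering[Timestamp] in scope, which is what the 2.11.12 error message asks for. The likely cause of that error is that java.sql.Timestamp inherits Comparable[java.util.Date] from java.util.Date rather than implementing Comparable[Timestamp], so 2.11 cannot derive the Ordering by itself. The following is a sketch in script/worksheet style, not verified against the specific EMR build; timestampOrdering and Row are hypothetical names.

```scala
import java.sql.Timestamp

// Assumption: this implicit is visible wherever sortBy is called.
// An anonymous class is used because Scala 2.11 cannot turn a lambda
// into an Ordering.
implicit val timestampOrdering: Ordering[Timestamp] = new Ordering[Timestamp] {
  // Timestamp.compareTo(Timestamp) also takes the nanos into account.
  def compare(a: Timestamp, b: Timestamp): Int = a.compareTo(b)
}

// Hypothetical record type standing in for the rows in the question.
case class Row(event_time: Timestamp)

// Two timestamps in the same millisecond that differ only in nanos.
val t1 = new Timestamp(0L)
t1.setNanos(999000001)
val t2 = new Timestamp(0L)
t2.setNanos(999999999)

// With the implicit in scope, line B's form resolves its Ordering.
val rowsSorted = List(Row(t2), Row(t1)).sortBy(_.event_time)
```

This leaves the record class untouched, which may be preferable when the class cannot be changed to extend Ordered.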
dlamblin