2

For the following table structure:

id externalId name version
1  10         n1   1
2  65         n2   2
3  10         n3   2
4  77         n4   1 

I'm trying to get all the entries (all columns) that has a max version grouped by externalId. The expected result should be:

id externalId name version
2  65         n2   2
3  10         n3   2
4  77         n4   1 

For this purpose, I have the following slick query defined:

val resulting = myTableEntries
 .groupBy(x => x.externalID)
 .map { 
   case (id, group) => 
     (id, group.map(_.version).max) 
 }

How can I get all the columns out instead of just the id and the version?

Effectively what I need is a Slick version of the following SQL:

select myTable.id, myTable.name, myTable.externalId, myTable.version
    from MyTable myTable
    where version = 
        (select max(revision) from MyTable myTable1 where myTable.id=myTable1.id)
joesan
  • 13,963
  • 27
  • 95
  • 232

2 Answers2

2

Based on this example and this answer, I think you want the following:

val maxVersionByExternalID = myTableEntries
 .groupBy(x => x.externalID)
 .map { 
   case (externalID, group) => 
     (externalID, group.map(_.version).max)
//We have the pair of ids and max versions _not yet run_ so we can join
val resulting = (myTableEntries join maxVersionByExternalID on (
    (entries, maxPair) => //join condition
       entries.externalID === maxPair._1 && entries.version === maxPair._2))
  .map{case (entries, maxPair) => entries}//keep only the original
  .run //materialize at the end
Community
  • 1
  • 1
Gábor Bakos
  • 8,982
  • 52
  • 35
  • 52
  • What is that resulting doing?? – joesan May 29 '15 at 15:19
  • I am not sure I understand your question. That is intended to be the same as you have written in the original question, that would be the query result you were looking for. The only difference, here I have `.run` it too. Maybe your intention was a further join? – Gábor Bakos May 29 '15 at 15:21
  • I'm using Slick 2.1.0 and my IDE complains that entries and maxPair won't resolve – joesan May 29 '15 at 15:24
  • 1
    Sorry, I missed two parentheses around the join. Hope this way it will work. – Gábor Bakos May 29 '15 at 15:32
  • It is certainly a non-trivial tool. Was the answer work for your problem? Did I misunderstand your question? Or is there still some problem? (I have not tried to compile.) – Gábor Bakos May 29 '15 at 16:24
  • I will try that out and perhaps read a bit more about Joins! But whatever it might be, you have definitely helped me! I thank you very very much! – joesan May 29 '15 at 16:32
0

I have not worked on Slick, but looking at the scala code and your requirement, the below mentioned question comes to my mind.

The other way to solve your problem is using Rank function in SQL, give rank to every record within group, in your case externalID and rank over version in your case.

Once you are done with that you can select all the records with rank as 1

Please refer the below question on how to solve such cases.

Spark two level aggregation

Community
  • 1
  • 1
Akash
  • 355
  • 4
  • 11