0

I am fetching few values from DB and want to create a nested map data structure out of this. The tabular data looks like this

+---------+--------------+----------------+------------------+----------------+-----------------------+
| Cube_ID | Dimension_ID | Dimension_Name | Partition_Column | Display_name   | Dimension_Description |
+---------+--------------+----------------+------------------+----------------+-----------------------+
|       1 |            1 | Reporting_Date | Y                | Reporting_Date | Reporting_Date        |
|       1 |            2 | Platform       | N                | Platform       | Platform              |
|       1 |            3 | Country        | N                | Country        | Country               |
|       1 |            4 | OS_Version     | N                | OS_Version     | OS_Version            |
|       1 |            5 | Device_Version | N                | Device_Version | Device_Version        |
+---------+--------------+----------------+------------------+----------------+-----------------------+

I want to create a nested structure something like this

{
    CubeID = "1": {
        Dimension ID = "1": [
            {
                "Name": "Reporting_Date",
                "Partition_Column": "Y"
                "Display": "Reporting_Date"
            }
        ]
        Dimension ID = "2": [
            {
                "Name": "Platform",
                "Column": "N"
                "Display": "Platform"
            }
        ]
    },
    CubeID = "2": {
        Dimension ID = "1": [
            {
                "Name": "Reporting_Date",
                "Partition_Column": "Y"
                "Display": "Reporting_Date"
            }
        ]
        Dimension ID = "2": [
            {
                "Name": "Platform",
                "Column": "N"
                "Display": "Platform"
            }
        ]
    }
}        

I have the result set from DB using the following. I am able to populate individual columns, but not sure how to create a map for later computation

while (rs.next()) {
      val Dimension_ID = rs.getInt("Dimension_ID")
      val Dimension_Name = rs.getString("Dimension_Name")
      val Partition_Column = rs.getString("Partition_Column")
      val Display_name = rs.getString("Display_name")
      val Dimension_Description = rs.getString("Dimension_Description")
}

I believe I should write a case class for this, but I am not sure how to create a case class and load values to the case class.

Thanks for the help. I can provide any other info needed. Let me know

CSUNNY
  • 414
  • 1
  • 7
  • 23
  • What is the type of all the variables on your last snippet? What do you mean you are not sure how to create the case class? - PS: It looks you are using a plain **JDBC**, have you taken a look to scala frameworks like [**Doobie**](https://github.com/tpolecat/doobie) or [**Slick**](http://slick.lightbend.com/)? – Luis Miguel Mejía Suárez Jul 29 '19 at 01:58
  • Thanks for the reply.Most variables in the table are string. I was not sure which data structure would be the best fit. I prefer not using lists as the columns will be unnamed. I have not tried these frameworks yet. – CSUNNY Jul 29 '19 at 04:25
  • If you want named things the best option are nested case classes. Using List for grouped values. – Luis Miguel Mejía Suárez Jul 29 '19 at 04:40

1 Answers1

2

Background

you can define data class something as below,

case class Dimension(
    dimensionId: Long,
    name: String,
    partitionColumn: String,
    display: String
)

case class Record(
    cubeId: Int,
    dimension: Dimension
)

case class Data(records: List[Record])

And this is how you can construct data,

val data =
  Data(
    List(
      Record(
        cubeId = 1,
        dimension = Dimension(
          dimensionId = 1,
          name = "Reporting_Date",
          partitionColumn = "Y",
          display = "Reporting_Date"
        )
      ),
      Record(
        cubeId = 2,
        dimension = Dimension(
          dimensionId = 1,
          name = "Platform",
          partitionColumn = "N",
          display = "Platform"
        )
      )
    )
  )

Now to your question, since you are using JDBC you have to construct list of records in a mutable way or use scala Iterator. I will write below mutable way to construct above data class but you can explore more.

import scala.collection.mutable.ListBuffer
var mutableData = new ListBuffer[Record]()

while (rs.next()) {
  mutableData += Record(
    cubeId = rs.getIn("Cube_ID"),
    dimension = Dimension(
      dimensionId = rs.getInt("Dimension_ID"),
      name = rs.getString("Dimension_Name"),
      partitionColumn = rs.getString("Partition_Column"),
      display = rs.getString("Dimension_Description")
    )
  )
}

val data = Data(records = mutableData.toList)

Also read - Any better way to convert SQL ResultSet to Scala List

prayagupa
  • 30,204
  • 14
  • 155
  • 192
  • Thanks for the reply. I will try this and get back to you in case of any doubts – CSUNNY Jul 29 '19 at 04:28
  • Thanks. I could use this. But, I am now stuck and have an embarrassing question. I am able to create the record group, but not sure how to construct the data group. – CSUNNY Jul 29 '19 at 17:28
  • I am getting an error like : Type mismatch: expected data actual ListBuffer[Record] – CSUNNY Jul 29 '19 at 17:57
  • I made some changes and it is working. Case class Data(records: ListBuffer[Record]) and val data: DimensionData= DimensionData(mutableData). Let me know if you see any issues here – CSUNNY Jul 29 '19 at 18:11
  • ahh, thats because you have to convert `mutable.ListBuffer` to `List`, so if you simply do `mutableData.toList` should work. immutable is preferred instead of mutable. So your code will look like `val data: DimensionData= DimensionData(mutableData.toList)` – prayagupa Jul 29 '19 at 19:01