I want to read a rather large CSV file and process it interactively (slice, dice, summarize, etc.) for data exploration.
My idea is to load the file into a database (H2) and use SQL to process it:
1. Read the file: I use the Ostermiller CSV parser.
2. Determine the type of each column: I randomly sample 50 rows and derive the type (int, long, double, date, string) of each column.
3. I want to use Squeryl to process the data. To do so I need to create a case class dynamically. That's the bottleneck so far!
4. Upload the file to H2 and run arbitrary SQL commands against it.
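
For reference, the type-detection step can be sketched roughly as below. This is a minimal sketch, not my actual code: the names `ColType`, `TypeInference`, `cellType`, `widen`, `inferColumns` and `createTable` are made up for illustration, dates are assumed to be in ISO format (`yyyy-MM-dd`), empty cells are ignored, and every row is assumed to have the same number of fields. The last helper shows how the inferred types could be turned into an H2 `CREATE TABLE` statement; column names are assumed to be valid SQL identifiers and are not quoted.

```scala
import scala.util.{Random, Try}
import java.time.LocalDate

// The five column types from the list above (names are illustrative).
sealed trait ColType
case object IntCol extends ColType
case object LongCol extends ColType
case object DoubleCol extends ColType
case object DateCol extends ColType
case object StringCol extends ColType

object TypeInference {
  // Narrowest type that can represent one cell; tried from most to least specific.
  // Assumption: dates are ISO-8601 (yyyy-MM-dd).
  def cellType(s: String): ColType = {
    val v = s.trim
    if (Try(v.toInt).isSuccess) IntCol
    else if (Try(v.toLong).isSuccess) LongCol
    else if (Try(v.toDouble).isSuccess) DoubleCol
    else if (Try(LocalDate.parse(v)).isSuccess) DateCol
    else StringCol
  }

  // Combine two observed types: Int < Long < Double; any other mix becomes String.
  def widen(a: ColType, b: ColType): ColType = (a, b) match {
    case (x, y) if x == y => x
    case (IntCol, LongCol) | (LongCol, IntCol) => LongCol
    case (IntCol | LongCol, DoubleCol) | (DoubleCol, IntCol | LongCol) => DoubleCol
    case _ => StringCol
  }

  // Sample up to `sampleSize` rows at random and infer one type per column.
  // Assumption: all rows have the same number of fields (transpose fails otherwise).
  def inferColumns(rows: Seq[Seq[String]], sampleSize: Int = 50, seed: Long = 42L): Seq[ColType] = {
    val sample = new Random(seed).shuffle(rows).take(sampleSize)
    sample.transpose.map { column =>
      val types = column.filter(_.trim.nonEmpty).map(cellType)
      if (types.isEmpty) StringCol else types.reduce(widen)
    }
  }

  // Map an inferred type to an H2 column type.
  def h2Type(t: ColType): String = t match {
    case IntCol    => "INTEGER"
    case LongCol   => "BIGINT"
    case DoubleCol => "DOUBLE PRECISION"
    case DateCol   => "DATE"
    case StringCol => "VARCHAR"
  }

  // Build a CREATE TABLE statement; names are not quoted or validated here.
  def createTable(name: String, columns: Seq[(String, ColType)]): String =
    columns.map { case (n, t) => s"$n ${h2Type(t)}" }
      .mkString(s"CREATE TABLE $name (", ", ", ")")
}
```

With a sample of only 50 rows, a value that appears later in the file can still fail to fit the inferred type, so the load step has to be ready to fall back to a wider type.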
My questions:
- Is there a better general approach to doing this interactively in Scala?
- Is there a way to solve the 3rd point? In other words, given a list of types (one per column in the CSV file), is it possible to dynamically create a case class corresponding to the table in Squeryl? As I understand it, this could be done with macros, but I do not have enough experience with them to do so.