
We have a number of Scala classes returning Map[String, String] (key, value) results for storage in a NoSQL database. Some of the results are actually Map[String, List] or Map[String, ArrayBuffer], so we are calling .toString on those values to convert them. This gives us output that looks like:

"ArrayBuffer(1,2,3,4)"

or

"List(1,2,4)"

Rather than include the object type, we'd like to have these written out as straight CSV, quotes and escaped characters as necessary. Is there a good CSV library that works well with Scala?
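For reference, the conversion we do today is essentially this (simplified; the field names are made up):

import scala.collection.mutable.ArrayBuffer

val result: Map[String, Any] = Map(
  "id"   -> "42",
  "tags" -> ArrayBuffer(1, 2, 3, 4)
)

// calling .toString on the non-String values produces the output shown above
val stored: Map[String, String] = result.map { case (k, v) => k -> v.toString }
// stored("tags") == "ArrayBuffer(1, 2, 3, 4)" rather than plain CSV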

Joshua

3 Answers


If you just want to serialize a single list or array, and don't need quotes to be escaped correctly, mkString is enough:

scala> List(1,2,3,4).mkString(",")
res39: String = 1,2,3,4

If you are looking to serialize slightly more complex data structures, product-collections will serialize a collection of tuples or case classes (any Product) to CSV and correctly escape the quotes.

scala> List ((1,"Jan"),
     | (2,"Feb"),
     | (3,"Mar","Extra column")).csvIterator.mkString("\n")
res41: String =
1,"Jan"
2,"Feb"
3,"Mar","Extra column"

product-collections will also write directly to a java.io.Writer. It has a collection type, CollSeq, specialized on homogeneous tuples, which would not allow the "extra column" above.
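If you'd rather manage the output yourself, you can also drain csvIterator into any java.io.Writer by hand. A minimal sketch, assuming the product-collections implicits that provide csvIterator are in scope, writing to a hypothetical months.csv:

import java.io.{BufferedWriter, FileWriter}

val rows = List((1, "Jan"), (2, "Feb"), (3, "Mar"))
val out = new BufferedWriter(new FileWriter("months.csv"))
try {
  // csvIterator yields one correctly escaped CSV line per element
  rows.csvIterator.foreach { line => out.write(line); out.newLine() }
} finally out.close()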

And to convert your original data into a format that can be handled by product-collections:

scala> CollSeq(scala.collection.mutable.ArrayBuffer("a","b","quoted \"stuff\""):_*)
res52: com.github.marklister.collections.immutable.CollSeq1[String] =
CollSeq((a),
        (b),
        (quoted "stuff"))

scala> res52.csvIterator.mkString("\n")
res53: String =
"a"
"b"
"quoted ""stuff"""
Mark Lister

Here's a similar question, which ought to cover the CSV part. I have looked over the Scala questions about CSV, and there doesn't seem to be any suggestion of something that will create a CSV rather than just parse one. So I'd look into the Java libraries.
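For example, opencsv is a plain Java library that quotes fields and escapes embedded quotes, and it is easy to call from Scala. A minimal sketch (the output path is made up):

import java.io.FileWriter
import com.opencsv.CSVWriter
import scala.collection.mutable.ArrayBuffer

val writer = new CSVWriter(new FileWriter("out.csv"))
try {
  // writeNext takes one row as an Array[String] and handles quoting/escaping
  writer.writeNext(ArrayBuffer("a", "b", "quoted \"stuff\"").toArray)
} finally writer.close()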

Daniel C. Sobral

You can do something like this if you also want headers in the CSV file and don't mind extending the trait with your own row-mapping function. The trait also covers the opposite direction, loading the CSV back, which likewise requires a mapping function you supply.

trait CsvSerialization[T] {

  import java.io.File
  import helpers.FileIO._

  def fileName: String                             // name of the CSV file
  def basePath: String                             // directory the file lives in
  def csvColumnHeaders: Array[String]              // header row, one entry per column
  def itemToRowMapper(item: T): List[String]       // serialize one item to a row
  def rowToItemMapper(row: Map[String, String]): T // rebuild an item from a header-keyed row

  val filePath = basePath + File.separator + fileName

  def cached = new File(filePath).exists // true once the CSV has been written

  def toCsv(collection: Iterable[T]) = {
    makePathRecursive(basePath) // you can skip this in your code
    val writer = com.github.tototoshi.csv.CSVWriter.open(new File(filePath), append = false)
    writer.writeRow(csvColumnHeaders.toList)
    writer.writeAll(collection.map { collectionItem =>
      val csvRowSerialized = itemToRowMapper(collectionItem)
      require(
        csvRowSerialized.length == csvColumnHeaders.length,
        s"csv row mapping function returned ${csvRowSerialized.size} items whereas column headers have ${csvColumnHeaders.size} items")
      csvRowSerialized
    }.toSeq)
    writer.close()
  }

  def fromCsv: List[T] = {
    val reader = com.github.tototoshi.csv.CSVReader.open(new File(filePath))
    val collection = reader.allWithHeaders map { csvRow =>
      require(
        csvRow.size == csvColumnHeaders.length,
        s"csv row contained ${csvRow.size} items whereas column headers have ${csvColumnHeaders.size} items")
      rowToItemMapper(csvRow)
    }
    reader.close()
    collection
  }

}

For this code you need to include the underlying CSV library, scala-csv, in your project: https://github.com/tototoshi/scala-csv
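As a usage sketch, a concrete extension might look like the following (Person, the paths, and the column names are all hypothetical, and this assumes you have dealt with the makePathRecursive helper as noted in the code):

case class Person(name: String, age: Int)

object PersonCsv extends CsvSerialization[Person] {
  // defs (not vals) so the trait's filePath sees initialized values
  def fileName = "people.csv"
  def basePath = "/tmp"
  def csvColumnHeaders = Array("name", "age")
  def itemToRowMapper(p: Person) = List(p.name, p.age.toString)
  def rowToItemMapper(row: Map[String, String]) = Person(row("name"), row("age").toInt)
}

PersonCsv.toCsv(List(Person("Ann", 34), Person("Bob", 27)))
val people = PersonCsv.fromCsv // List(Person(Ann,34), Person(Bob,27))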

Note that extending traits can get thorny in Scala; you may choose a different wrapper for this piece.

matanster