I want to create a Map[Int,Set[String]] in Scala by reading input from a CSV file.

My file.csv is:

sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
rainy,cool,normal,TRUE,no
overcast,cool,normal,TRUE,yes

I want the output to be:

var Attributes: Map[Int,Set[String]] = Map()

Attributes += (0 -> Set("sunny","overcast","rainy"))
Attributes += (1 -> Set("hot","mild","cool"))
Attributes += (2 -> Set("high","normal"))
Attributes += (3 -> Set("false","true"))
Attributes += (4 -> Set("yes","no"))

Here 0, 1, 2, 3, 4 are the column numbers, and each Set contains the distinct values in that column.

I want to add each (Int -> Set[String]) entry to my variable Attributes, i.e., printing Attributes.size should display 5 in this case.

rosy
  • Look at [http://stackoverflow.com/questions/1284423/read-entire-file-in-scala](http://stackoverflow.com/questions/1284423/read-entire-file-in-scala) to see how to read the lines into memory. To avoid performance issues consider using `Streams`. `zipWithIndex` will give you the line number with each line. Iterate over the lines and create your `Map`. Hope this helps! – benji Dec 05 '14 at 04:01

1 Answer

Use one of the existing answers to read in the CSV file. You'll have a two-dimensional array or vector of strings. Then build your map:

// row vectors
val rows = io.Source.fromFile("file.csv").getLines.map(_.split(",")).toVector
// column vectors
val cols = rows.transpose
// convert each vector to a set
val sets = cols.map(_.toSet)
// convert vector of sets to map
val attr = sets.zipWithIndex.map(_.swap).toMap

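The last line is a bit ugly because there is no direct method for turning a sequence into a map keyed by index; zipWithIndex yields (value, index) pairs, so they have to be swapped before calling .toMap. You could also write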

val attr = Vector.tabulate(sets.size)(i => (i, sets(i))).toMap

Or you could do the last two steps in one go:

val attr = cols.zipWithIndex.map { case (xs, i) => 
  (i, xs.toSet) 
} (collection.breakOut): Map[Int,Set[String]]
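
Note that collection.breakOut was removed in Scala 2.13, so the last variant only compiles on 2.12 and earlier. As a rough sketch of the same steps without it, assuming every line has the same number of fields (the helper name `attributes` and the in-memory `sample` are made up for illustration; with a real file you would pass `io.Source.fromFile("file.csv").getLines()` instead):

```scala
// Hypothetical helper: maps each column index to the distinct values in that column.
// Assumes all lines have the same number of comma-separated fields.
def attributes(lines: Iterator[String]): Map[Int, Set[String]] = {
  // row vectors
  val rows = lines.map(_.split(",").toVector).toVector
  // column vectors, each paired with its index and collapsed to a set
  rows.transpose.zipWithIndex.map { case (xs, i) => i -> xs.toSet }.toMap
}

// Sample data copied from the question, kept in memory so the sketch runs on its own.
val sample = Vector(
  "sunny,hot,high,FALSE,no",
  "sunny,hot,high,TRUE,no",
  "overcast,hot,high,FALSE,yes",
  "rainy,mild,high,FALSE,yes",
  "rainy,cool,normal,FALSE,yes",
  "rainy,cool,normal,TRUE,no",
  "overcast,cool,normal,TRUE,yes"
)

val attr = attributes(sample.iterator)
```

The values keep the case they have in the file, so column 3 comes out as Set("FALSE", "TRUE"); map the fields through `_.toLowerCase` if the lowercase form from the question is wanted.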
0__
  • If I use another dataset, it shows an error: Exception in thread "main" java.lang.IllegalArgumentException: transpose requires all collections have the same size – rosy Dec 06 '14 at 05:04
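
That exception means the rows do not all have the same number of fields. One possible cause (an assumption, since the other dataset is not shown) is a blank line or a row with missing trailing values. A minimal sketch that avoids transpose altogether and folds each field into the map by its column index; the name `attributesRagged` is made up for illustration:

```scala
// Hypothetical helper, not from the original answer: tolerates rows of different lengths.
def attributesRagged(lines: Iterator[String]): Map[Int, Set[String]] =
  lines
    .filter(_.trim.nonEmpty) // skip blank lines, which would otherwise add "" to column 0
    .foldLeft(Map.empty[Int, Set[String]]) { (acc, line) =>
      // add every field of this line to the set stored under its column index
      line.split(",").zipWithIndex.foldLeft(acc) { case (m, (value, i)) =>
        m.updated(i, m.getOrElse(i, Set.empty[String]) + value)
      }
    }
```

With this approach a short row simply contributes nothing to the columns it lacks. Keep in mind that `split(",")` drops trailing empty fields, so a line ending in a comma is treated as a shorter row.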