
I am reading a file in Scala, and the file has some missing values. Wherever a value is missing I need to print "Missing" instead. I used Option/Some with match/case to handle this, but I end up with an IndexOutOfBoundsException. I also tried wrapping the code in try/catch, with no luck. Any help would be appreciated.

    package HW9

    object WeatherStub {
      //Assigning file name        
      val fileName = "weather.csv"

      def main(args: Array[String]): Unit = {
        //Calling the read Weather method
        readWeather(fileName)

      }

    //Method to handle missing and exception
      def readWeather(fn: String): Unit = {
        var weatherMuteMap = scala.collection.mutable.Map[String, String]()
        def IsEmptyOrNull(s:String): Option[String] =   {try {Some(s.toString)} catch {case _ => None}}

    //Reading files
        for(line <- io.Source.fromFile(fn).getLines()){
          val list1 = line.split(",").map(_.trim).toList

    //Handling missing values
          val TotalPrecp = IsEmptyOrNull(list1(1).toString) match { case Some(i) => i case _ => "Missing"}
          val LowPrecp = IsEmptyOrNull(list1(2).toString) match { case Some(i) => i   case _ => "Missing" }
          val HighPrecp = IsEmptyOrNull(list1(3).toString) match { case Some(i) => i   case _ => "Missing" }

//Concatenating values to a map
          weatherMuteMap(list1(0)) = TotalPrecp + LowPrecp + HighPrecp
//Print    
          println(weatherMuteMap)

        }
      }

      }

Sample data from the file:

    2016-01-01,0,-13.28,-1.11
    2016-01-02,0,-10,0
    2016-01-03,0,-10,0
    2016-01-04,0,-12.78,-2.22
    2016-01-06,0,-6.11,0.61
    2016-01-07,0.05,-0.61,1
    2016-01-08,0.1,,1
    2016-01-09,0.13,-5.61,0
    2016-01-21,0,,
    2016-01-22,0,,
    2016-01-23,,-9.39,-6.11
    2016-02-19,0,,
    2016-02-20,0,0,0
    2016-02-21,,,
    2016-02-22,0,-0.61,0.61
    2016-02-23,,,

Error:

    Exception in thread "main" java.lang.IndexOutOfBoundsException: 2
        at scala.collection.LinearSeqOptimized$class.apply(LinearSeqOptimized.scala:65)
        at scala.collection.immutable.List.apply(List.scala:84)
        at HW9.WeatherStub$$anonfun$readWeather$1.apply(WeatherStub.scala:19)
        at HW9.WeatherStub$$anonfun$readWeather$1.apply(WeatherStub.scala:16)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at HW9.WeatherStub$.readWeather(WeatherStub.scala:16)
        at HW9.WeatherStub$.main(WeatherStub.scala:8)
        at HW9.WeatherStub.main(WeatherStub.scala)

    Process finished with exit code 1
1 Answer


If you try one of the rows from your data in the REPL:

    scala> val str = "2016-02-23,,,"
    scala> str.split(",").map(_.trim).toList
    res0: List[String] = List(2016-02-23)

You can see that you only get the first value in the list. By default, split drops trailing empty strings, so for rows like this the later indices do not exist, hence the IndexOutOfBoundsException. Passing a negative limit turns this behaviour off:

    scala> str.split(",", -1).map(_.trim).toList
    res1: List[String] = List(2016-02-23, "", "", "")

Have a look at this thread that explains why: Java String split removed empty values
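To see the difference side by side, here is a minimal sketch. The object name SplitDemo and the helper fields are made up for illustration and are not part of your code; only the two split overloads come from the standard library.

```scala
object SplitDemo {
  // Hypothetical helper (not in the original code): split one CSV row
  // and keep trailing empty fields by passing a negative limit.
  def fields(line: String): List[String] =
    line.split(",", -1).map(_.trim).toList

  def main(args: Array[String]): Unit = {
    val row = "2016-02-23,,,"

    // Default split: trailing empty strings are dropped.
    println(row.split(",").length)

    // Negative limit: all four columns are kept, three of them empty.
    println(fields(row).length)

    // Empty fields in the middle of a row are kept either way.
    println(fields("2016-01-08,0.1,,1"))
  }
}
```

With the negative limit every row of the sample data yields four columns, so indices 1 to 3 are always valid.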

EDIT:

Two things in your code don't make sense. First, IsEmptyOrNull takes a String, so Some(s.toString) inside the try block never throws for the values that split produces, and the function therefore never returns None. Second, line.split always returns an Array[String], so the elements are already Strings and the extra .toString calls are redundant. Think about how empty and missing values are actually represented in the data (here, as empty strings), and replace IsEmptyOrNull with something like this:

    def IsEmptyOrNull(s: String): Option[String] = {
      s match {
        case "" => None
        case _ => Some(s)
      }
    }
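Putting both fixes together, a rough sketch of how one row could be processed might look like the following. The names MissingDemo, toOption and formatRow are hypothetical, and joining the columns with commas is an assumption about the desired output rather than something your code does.

```scala
object MissingDemo {
  // Hypothetical variant of IsEmptyOrNull: null, empty, and
  // whitespace-only strings all become None.
  def toOption(s: String): Option[String] =
    Option(s).map(_.trim).filter(_.nonEmpty)

  // Hypothetical helper: rebuild one row with "Missing" in place of
  // empty columns. Assumes four columns per row (date plus three values).
  def formatRow(line: String): String = {
    // Negative limit keeps trailing empty fields; padTo guards against
    // rows that still have fewer than four columns.
    val cols = line.split(",", -1).toList.padTo(4, "")
    val values = cols.slice(1, 4).map(c => toOption(c).getOrElse("Missing"))
    (cols.head :: values).mkString(",")
  }

  def main(args: Array[String]): Unit = {
    println(formatRow("2016-01-08,0.1,,1"))
    println(formatRow("2016-02-23,,,"))
  }
}
```

Using getOrElse on the Option is equivalent to the match with a Some case and a default case, just shorter.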
  • That is a good catch, thanks! I was able to fix the out-of-bounds exception, but the value "Missing" is still not printed at all. Not sure why. – Issaq Apr 09 '17 at 23:24