How do I handle geolocated data with the k-means clustering algorithm? Could somebody please share their input? Thanks in advance.
Entries in the Project_2_Dataset.txt file look like this:
=========================================================
33.68947543 -117.5433083
37.43210889 -121.4850296
39.43789083 -120.9389785
39.36351868 -119.4003347
33.19135811 -116.4482426
33.83435437 -117.3300009
Please review my code here:
===========================
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.clustering.KMeans
val data = sc.textFile("Project_2_Dataset.txt")
val parsedData = data.map( line => Vectors.dense(line.split(',').map(_.toDouble)))
val kmmodel = KMeans.train(parsedData, 3, 5) // 3 clusters, at most 5 iterations
When I run this, I get the following error:
17/06/17 13:12:20 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 2)
java.lang.NumberFormatException: For input string: "33.68947543 -117.5433083"
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
at java.lang.Double.parseDouble(Double.java:538)
at scala.collection.immutable.StringLike$class.toDouble(StringLike.scala:232)
Thanks,
Amit K
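
The NumberFormatException shows the whole line "33.68947543 -117.5433083" being passed to toDouble, which suggests that split(',') found no comma to split on: the sample entries are separated by a space. Below is a minimal sketch of a whitespace-based parse, assuming every line holds exactly two whitespace-separated numbers. parseLine is a hypothetical helper name used for illustration, not something from the original code.

```scala
// Hypothetical helper (an assumption, not part of the original post):
// split each record on runs of whitespace instead of on commas.
def parseLine(line: String): Array[Double] =
  line.trim.split("\\s+").map(_.toDouble)

// Two records copied from the sample file shown in the question.
val sample = Seq(
  "33.68947543 -117.5433083",
  "37.43210889 -121.4850296"
)

// Each record becomes an Array(latitude, longitude).
val parsed = sample.map(parseLine)
parsed.foreach(p => println(p.mkString(", ")))
```

In the Spark shell, the same helper could presumably be used in the existing map step as data.map(line => Vectors.dense(parseLine(line))). Blank lines in the file would still fail to parse under this sketch, so they may need to be filtered out first.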