I am learning Spark Datasets and looking into how to convert an RDD to a Dataset.
For this, I found the following code:
val spark = SparkSession
  .builder
  .appName("SparkSQL")
  .master("local[*]")
  .getOrCreate()
val lines = spark.sparkContext.textFile("../myfile.csv")
val structuredData = lines.map(mapperToConvertToStructureData)
import spark.implicits._
val someDataset = structuredData.toDS
Here, to convert the RDD to a Dataset, we need import spark.implicits._ just before the conversion.
Why is this import written just before the conversion? Can it be placed at the top of the file, like a regular import?
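To make the question concrete, here is a minimal self-contained sketch of the same pattern. It is not my real code: the Person case class, the RddToDatasetExample object, the toPeopleDataset helper, and the in-memory sample rows are all hypothetical stand-ins for my mapper and CSV file. The one thing I did notice while writing it is that the import names the spark value itself rather than a package.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical record type, standing in for whatever my mapper returns.
// Declared at the top level so Spark can derive an encoder for it.
case class Person(name: String, age: Int)

object RddToDatasetExample {

  // Hypothetical helper: builds a small RDD and converts it to a Dataset.
  def toPeopleDataset(spark: SparkSession): Dataset[Person] = {
    // In-memory sample rows instead of reading ../myfile.csv
    val structuredData = spark.sparkContext.parallelize(
      Seq(Person("Ann", 30), Person("Bob", 25))
    )

    // The import refers to the `spark` value in scope here, not to a package.
    import spark.implicits._
    structuredData.toDS()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("SparkSQL")
      .master("local[*]")
      .getOrCreate()

    val someDataset = toPeopleDataset(spark)
    someDataset.show()

    spark.stop()
  }
}
```

This assumes Spark SQL is on the classpath and runs in local mode, so no cluster is needed.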