I have a question about reading a text file into a dataset with PySpark. My text file contains:
Sunny,Hot,High,Weak,No
Sunny,Hot,High,Strong,No
and I wrote this code to load it:
from pyspark import SparkConf, SparkContext
import operator
import math
conf = SparkConf().setMaster("local[*]").setAppName("Lab 6")
sc = SparkContext(conf=conf)
rawData = sc.textFile("txtfile.data")
data = rawData.flatMap(lambda line: line.split(","))
I expected one tuple per line, like this:
[(Sunny, Hot, High, Weak, No), (Sunny, Hot, High, Strong, No)]
Instead, it gave me a single flat list:
['Sunny', 'Hot', 'High', 'Weak', 'No', 'Sunny', 'Hot', 'High', 'Strong', 'No']
Can anyone show me how to fix this?
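
For what it's worth, the flattening appears to come from flatMap itself: it applies the function to each line and then concatenates all the resulting lists into one. A minimal sketch of the difference in plain Python, with no Spark needed to run it, is below. The names lines, flat and rows are only illustrative, and the PySpark line in the final comment is an untested assumption about what the fix would look like with the rawData RDD above.

```python
from itertools import chain

# The two sample lines from the text file
lines = ["Sunny,Hot,High,Weak,No", "Sunny,Hot,High,Strong,No"]

# flatMap-like behaviour: split every line, then flatten all pieces into one list
flat = list(chain.from_iterable(line.split(",") for line in lines))

# map-like behaviour: exactly one output element (here a tuple) per input line
rows = [tuple(line.split(",")) for line in lines]

print(flat)
print(rows)

# Assumed PySpark equivalent (not run here):
# data = rawData.map(lambda line: tuple(line.split(",")))
```

If that assumption holds, swapping flatMap for map should keep each line as its own record instead of merging all the fields together.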