We have a CSV file called survey.csv and we need to load it into an RDD.
We tried this:
rdd_test = survey_results.csv.map(lambda x: (x, 1))
It doesn't work. Can anyone help?
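SparkContext.textFile creates an RDD from a text file, with one string element per line of the file. Your attempt fails because survey_results.csv is just a Python name, not an RDD, so the file has to be read through the SparkContext first: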
from pyspark import SparkContext
# create the Spark context
sc = SparkContext()
# read the CSV file into an RDD; each element is one raw line (a string)
lines = sc.textFile("./survey.csv")
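Once the lines are in an RDD, the map from the question can be applied to them. Here is a minimal sketch, assuming sc is the SparkContext created above and that each record sits on a single line. The helper names parse_line and load_pairs are made up for illustration; they are not part of PySpark.

```python
import csv


def parse_line(line):
    """Split one CSV line into a list of fields, respecting quoted commas."""
    return next(csv.reader([line]))


def load_pairs(sc, path):
    """Read a CSV file into an RDD of (fields, 1) pairs.

    `sc` is assumed to be an existing SparkContext (hypothetical helper).
    """
    lines = sc.textFile(path)       # RDD of raw lines (strings)
    rows = lines.map(parse_line)    # RDD of lists of fields
    # tuples are hashable, so the pairs can later be used as keys
    return rows.map(lambda fields: (tuple(fields), 1))
```

It could then be called as rdd_test = load_pairs(sc, "./survey.csv"), and rdd_test.take(5) should show the first few pairs. If the file has a header row, it would still be in the RDD and would need to be filtered out separately.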