
We have a CSV file called survey.csv and we need to load it into an RDD.

We tried this:

rdd_test = survey_results.csv.map(lambda x: (x, 1)) 

It doesn't work. Can anyone help?

Adriaan
Kirsten
  • Welcome to Stack Overflow! Please take the [tour] and read up on [ask], as well as [mcve]. [edit]ing the question with a sample of your CSV file (only a few rows and columns please) and elaborating on what doesn't work (is there an error, wrong/no data, something else?) would help us help you. – Adriaan May 19 '22 at 12:27

1 Answer


SparkContext.textFile reads a text file into an RDD of strings, one element per line of the file:

from pyspark import SparkContext

# create the Spark context
sc = SparkContext()

# read the input text file into an RDD (one string per line, header row included)
lines = sc.textFile("./survey.csv")

Source

Helpful SO post
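Since textFile only gives you raw lines, you will probably want to split each line into fields next. Here is a minimal sketch of one way to do that, assuming the first line of the file is a header and that no quoted field contains a line break. The helper names parse_line and load_rows are made up for illustration; sc is the SparkContext created above.

```python
import csv


def parse_line(line):
    """Split one CSV line into a list of fields, honoring quoted commas."""
    return next(csv.reader([line]))


def load_rows(sc, path):
    """Return an RDD of field lists for the CSV at `path`, skipping the header.

    Assumes `sc` is an existing SparkContext and the first line is a header.
    Note: any data line identical to the header is dropped as well.
    """
    lines = sc.textFile(path)
    header = lines.first()
    return lines.filter(lambda line: line != header).map(parse_line)
```

With that in place, rows = load_rows(sc, "./survey.csv") should give an RDD of lists, and rows.take(5) lets you check the first few. Using the csv module instead of line.split(",") keeps values like "Yes, often" in one field.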

EoinS