I am building an RDD from a text file. Some of the lines do not conform to the format I am expecting, in which case I use the marker -1.
def myParser(line):
    try:
        # do something
    except:
        return (-1, -1), -1

lines = sc.textFile('path_to_file')
pairs = lines.map(myParser)
Is it possible to remove the lines with the -1 marker from the RDD? If not, what would be a workaround?
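
To make the goal concrete, here is a minimal sketch of the kind of filtering I am after. It assumes the marker is exactly the tuple `((-1, -1), -1)` returned by `myParser`. The names `BAD_RECORD`, `is_valid`, and `sample` are made up for illustration, and Python's built-in `filter` stands in for Spark so the snippet runs without a cluster. On the actual RDD, the equivalent would presumably be `pairs.filter(is_valid)`.

```python
# Sentinel that myParser returns for malformed lines (taken from the code above).
BAD_RECORD = ((-1, -1), -1)


def is_valid(record):
    """Return True for any record that is not the -1 marker."""
    return record != BAD_RECORD


# Hypothetical parsed records standing in for the contents of `pairs`.
sample = [((1, 2), 3), ((-1, -1), -1), ((4, 5), 6)]

# Plain-Python stand-in; with Spark this would be: pairs.filter(is_valid)
cleaned = list(filter(is_valid, sample))
print(cleaned)
```

This keeps the parser unchanged and drops the marked records in a separate step.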