Pandas API function: https://spark.apache.org/docs/latest/api/python/reference/pyspark.pandas/api/pyspark.pandas.DataFrame.to_string.html
Another answer, though it doesn't work for me: "pyspark : Convert DataFrame to RDD[string]"
Following the advice in that post, I tried:
data.rdd.map(lambda row: [str(c) for c in row])
Then I get this error:
TypeError: 'PipelinedRDD' object is not iterable
I would like the output to be rows of strings, similar to what to_string() above produces. Is this possible?
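
For what it's worth, here is a minimal sketch of one possible approach. It assumes the TypeError comes from looping over the RDD itself (map() is lazy and returns a PipelinedRDD, which cannot be iterated directly) rather than over the result of collect(). The helper name rows_to_strings and the sample rows are hypothetical; plain tuples stand in for pyspark Row objects, which are tuples as well.

```python
def rows_to_strings(rows, sep=" "):
    """Join the values of each row into one string per row."""
    return [sep.join(str(c) for c in row) for row in rows]

# Hypothetical stand-in for data.collect(); Row objects behave like tuples.
sample_rows = [(1, "a", 2.5), (2, "b", None)]
lines = rows_to_strings(sample_rows)
print("\n".join(lines))

# With the real DataFrame from the question (assumed to be named `data`),
# the same idea would be applied on the cluster and then collected:
#   lines = data.rdd.map(lambda row: " ".join(str(c) for c in row)).collect()
#   for line in lines:
#       print(line)
```

Note that collect() brings every row back to the driver, so this only suits data small enough to fit in driver memory; take(n) may be a safer choice for a quick look at a large DataFrame.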