I want to convert a "pyspark.sql.dataframe.DataFrame" to a pandas DataFrame. The last line raises "ConnectionRefusedError: [WinError 10061] Connection failed because the destination computer refused the connection". How can I fix it?
from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession, Row
import pandas as pd
import numpy as np
import os
import sys
# spark setting
# local
conf = SparkConf().set("spark.driver.host", "127.0.0.1")
sc = SparkContext(conf=conf)
# session
spark = SparkSession.builder.master("local[1]").appName("test_name").getOrCreate()
# file
path = "./data/fhvhv_tripdata_2022-10.parquet"
# add the option if the file has a header
data = spark.read.option("header", True).parquet(path)
# the error occurs here
pd_df = data.toPandas()
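One thing that may be worth checking is sketched below. It assumes that the refused connection comes from Spark failing to start or reach its Python worker process, which is only one possible cause of this error on Windows and is not confirmed for this setup. `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON` are environment variables that PySpark reads; the helper names `pin_pyspark_python` and `build_session` are hypothetical and used only for illustration.

```python
import os
import sys


def pin_pyspark_python(python_path=None):
    """Point Spark's driver and workers at a single Python interpreter.

    Hypothetical helper: it only sets the PYSPARK_PYTHON and
    PYSPARK_DRIVER_PYTHON environment variables.
    """
    # Default to the interpreter that is running this script.
    python_path = python_path or sys.executable
    os.environ["PYSPARK_PYTHON"] = python_path
    os.environ["PYSPARK_DRIVER_PYTHON"] = python_path
    return python_path


def build_session():
    """Create the local session after pinning the interpreter (hypothetical helper)."""
    # Imported lazily so pin_pyspark_python() can be used on its own.
    from pyspark.sql import SparkSession

    # Set the variables before the session (and its JVM) is created.
    pin_pyspark_python()
    return (
        SparkSession.builder
        .master("local[1]")
        .appName("test_name")
        .config("spark.driver.host", "127.0.0.1")
        .getOrCreate()
    )
```

With this in place, the conversion would be attempted as `build_session().read.parquet(path).toPandas()`. If the error persists, the cause is likely something other than the interpreter setting.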