I am new to Apache Spark. I created a schema and a DataFrame, and Spark shows me the result, but the output format is messy and the lines are hard to read. I would like to display the result in pandas format instead. I have attached a screenshot of my DataFrame output, but I don't know how to show it in pandas format.
Here's my code:
from pyspark.sql.types import StructType, StructField, StringType, FloatType
from IPython.display import display
import pandas as pd
import gzip
schema = StructType([
    StructField("crimeid", StringType(), True),
    StructField("Month", StringType(), True),
    StructField("Reported_by", StringType(), True),
    StructField("Falls_within", StringType(), True),
    StructField("Longitude", FloatType(), True),
    StructField("Latitue", FloatType(), True),
    StructField("Location", StringType(), True),
    StructField("LSOA_code", StringType(), True),
    StructField("LSOA_name", StringType(), True),
    StructField("Crime_type", StringType(), True),
    StructField("Outcome_type", StringType(), True),
])
df = spark.read.csv("crimes.gz",header=False,schema=schema)
df.printSchema()
PATH = "crimes.gz"
csvfile = spark.read.format("csv")\
    .option("header", "false")\
    .schema(schema)\
    .load(PATH)
# show() prints the rows and returns None, so its result is not assigned
csvfile.show()
It shows the result like below,
but I want this data displayed in pandas form.
Thanks
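
One approach that may help here is a minimal sketch that converts a small slice of the Spark DataFrame to pandas with `limit()` and `toPandas()`, both standard PySpark DataFrame methods. The helper name `preview_as_pandas` and the default of 20 rows are assumptions made for illustration, not part of the code above.

```python
def preview_as_pandas(spark_df, n=20):
    """Return the first n rows of a Spark DataFrame as a pandas DataFrame.

    limit(n) keeps the conversion small: toPandas() collects every
    remaining row onto the driver, so an unbounded call on a large
    dataset can exhaust driver memory.
    """
    return spark_df.limit(n).toPandas()


# Usage in a notebook, assuming `csvfile` is the DataFrame loaded above:
#   preview_as_pandas(csvfile)              # last expression in a cell renders as a table
#   display(preview_as_pandas(csvfile, 5))  # explicit display of 5 rows
```

Limiting before converting is a deliberate choice: the pandas result lives entirely in driver memory, so it is usually safer to convert only the rows needed for inspection.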