Error while fetching columns from a join in PySpark

Asked Feb 06 '21 at 09:56

Active Feb 06 '21 at 19:03

Viewed 70 times

I have two CSV files that I want to load into DataFrames in PySpark. Joining the two files works without any issues, but I get an error when retrieving the results. Please help me with this.

deliveries.csv has 21 columns and matches.csv has 18 columns.

My code looks like this:

df1 = spark.read.csv(r"C:\deliveries.csv", header=True, inferSchema=True)
df2 = spark.read.csv(r"C:\matches.csv", header=True, inferSchema=True)
df = df1.join(df2, df1.match_id == df2.id, how='inner')
df.show(10)

I am getting the following error: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.

So I tried to increase the limit with the following command in pyspark:

spark.conf.set("spark.sql.debug.maxToStringFields", 1000)

I am still facing the same issue. Any help is appreciated.

edited Feb 06 '21 at 19:03

Nikunj Kakadiya

asked Feb 06 '21 at 09:56

Chanukya

@blackbishop I tried, but it is still not working for me – Chanukya Feb 06 '21 at 10:30
That's not really an error. It's just a warning, and you can probably ignore it – mck Feb 06 '21 at 10:32
@mck I also increased spark.driver.memory to 12g, but the result should still be displayed, right? – Chanukya Feb 06 '21 at 10:33
@Chanukya Unless you're using Spark 3, you should set `spark.debug.maxToStringFields`, not `spark.sql.debug.maxToStringFields` (the former is the one mentioned in the warning message you get). – blackbishop Feb 06 '21 at 10:48
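
To illustrate the last comment: the warning quoted in the question names `spark.debug.maxToStringFields` and SparkEnv.conf, which suggests a Spark 2.x installation. On 2.x the value is read from the startup configuration, so a runtime `spark.conf.set` call would not be expected to change it. A minimal sketch of choosing the key by version and passing it when the session is created follows. The helper names `max_to_string_fields_key` and `build_session`, the app name, and the limit of 1000 are illustrative assumptions, not part of the original post.

```python
def max_to_string_fields_key(spark_version: str) -> str:
    """Return the config key that controls plan-string truncation.

    Assumption: Spark 3.0+ reads spark.sql.debug.maxToStringFields,
    while Spark 2.x reads spark.debug.maxToStringFields.
    """
    major = int(spark_version.split(".")[0])
    if major >= 3:
        return "spark.sql.debug.maxToStringFields"
    return "spark.debug.maxToStringFields"


def build_session(limit: int = 1000):
    """Create a SparkSession with the limit supplied at startup."""
    # Imported lazily so the helper above can be used without Spark installed.
    import pyspark
    from pyspark.sql import SparkSession

    key = max_to_string_fields_key(pyspark.__version__)
    return (
        SparkSession.builder
        .appName("join-example")  # hypothetical app name
        .config(key, str(limit))  # set before the session exists
        .getOrCreate()
    )
```

In the interactive pyspark shell a session already exists, so `getOrCreate()` would return that one; there the setting would have to be passed at launch, for example `pyspark --conf spark.debug.maxToStringFields=1000` on Spark 2.x. As mck notes, the message is only a warning, so `df.show(10)` should print its rows either way.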

0 Answers