I see couple of posts post1 and post2 which are relevant to my question. However while following post1 solution I am running into below error.
joinedDF = df.join(df_agg, "company")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/spark/python/pyspark/sql/dataframe.py", line 1050, in join
jdf = self._jdf.join(other._jdf, on, how)
AttributeError: 'NoneType' object has no attribute '_jdf'
Entire code snippet
df = spark.read.format("csv").option("header", "true").load("/home/ec2-user/techcrunch/TechCrunchcontinentalUSA.csv")
df_agg = df.groupby("company").agg(func.sum("raisedAmt").alias("TotalRaised")).orderBy("TotalRaised", ascending = False).show()
joinedDF = df.join(df_agg, "company")