Is there a PySpark function similar to df.combine_first from pandas?

Asked May 10 '19 at 14:01

I have two columns in a pandas DataFrame, and I create a third column from them with pandas.DataFrame.combine_first (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.combine_first.html).

Now I am rewriting my code in PySpark. Does PySpark have a method or function that gives a similar result?

FLYNN

Not sure what your requirement is. If you are looking at adding a new column, something like df.withColumn("col3", udf1(params)) should do – ranjith May 10 '19 at 14:04
You can use [`pyspark.sql.functions.coalesce()`](http://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.functions.coalesce). For more details, provide a [reproducible example](https://stackoverflow.com/questions/48427185/how-to-make-good-reproducible-apache-spark-examples). – pault May 10 '19 at 14:08

0 Answers