
In R, the is.na() function returns a data frame of the same shape, where NA (null) values are TRUE and non-NA values are FALSE:

col1  col2
NA    1
1     NA
NA    NA
1     1

is.na() -->

col1   col2
TRUE   FALSE
FALSE  TRUE
TRUE   TRUE
FALSE  FALSE

I'm wondering if there is an equivalent PySpark function that returns the full DataFrame populated with True/False values. I don't want to use PySpark's filter/where, since that would not return the full dataset.
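To make the expected behaviour concrete, here is the example data in PySpark, with the kind of call I'm hoping exists sketched in a comment (`null_map` is just a name I made up):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# The example data from above (None is Spark's null)
df = spark.createDataFrame(
    [(None, 1), (1, None), (None, None), (1, 1)],
    ["col1", "col2"],
)

# Hypothetical: a null_map(df)-style call that would return the same
# rows and columns, with True where a value is null and False where
# it is not, without filtering any rows out
# bool_df = null_map(df)
```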

Thanks in advance!

PS: If my formatting is off, please let me know; this is my first Stack Overflow post, so I'm not 100% sure how the formatting works.

kxxlxn
  • Possibly relevant: https://stackoverflow.com/q/37262762/3358272 and https://sparkbyexamples.com/pyspark/pyspark-filter-rows-with-null-values/ – r2evans Jun 29 '21 at 13:12
  • Thanks @r2evans, that's not quite what I was looking for, since both of those filter records out of the dataset. Instead I found a solution: `df.withColumn('col1', when(df['col1'].isNull(), lit(True)).otherwise(lit(False)))`, repeated for col2 (with `when` and `lit` imported from `pyspark.sql.functions`). This returns the whole dataset with every null value replaced by True and every non-null value by False. – kxxlxn Jun 30 '21 at 15:29
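
For reference, a runnable version of the approach from the comment above, generalized over both columns (a sketch, assuming a local SparkSession and the example data from the question):

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import when, lit

spark = SparkSession.builder.getOrCreate()

# The example frame from the question (None is Spark's null)
df = spark.createDataFrame(
    [(None, 1), (1, None), (None, None), (1, 1)],
    ["col1", "col2"],
)

# Apply the comment's when/isNull/otherwise pattern to every column
for c in df.columns:
    df = df.withColumn(c, when(df[c].isNull(), lit(True)).otherwise(lit(False)))

df.show()
# +-----+-----+
# | col1| col2|
# +-----+-----+
# | true|false|
# |false| true|
# | true| true|
# |false|false|
# +-----+-----+
```

Since `Column.isNull()` already yields a boolean column, the `when`/`otherwise` wrapper is optional; `df.select([df[c].isNull().alias(c) for c in df.columns])` produces the same True/False frame in one step.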
