I'm new to PySpark, and I see there are two ways to select columns: ".select()" and ".withColumn()".
From what I've heard, ".withColumn()" is worse for performance, but other than that I'm confused as to why there are two ways to do the same thing.
So when am I supposed to use ".select()" instead of ".withColumn()"?
I've googled this question but I haven't found a clear explanation.