My dataframe has columns whose names start with 20; these columns were generated dynamically.

I want to dynamically rename the columns starting with 20 by appending _p, so that they become 2019_p, 2020_p, 2021_p.

How do I achieve this?

sparc

1 Answer

This should work:

from pyspark.sql.functions import col
df.select(*[col(c).alias(f"{c}_p") if c.startswith("20") else col(c) for c in df.columns])
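
As a side note that is not part of the original answer, the naming rule can be pulled out into a small helper and checked without a Spark session. This is only a sketch: the function name `suffix_year_columns` and its `prefix`/`suffix` parameters are made up for illustration, and the commented-out last line assumes `df` is an existing PySpark DataFrame.

```python
def suffix_year_columns(columns, prefix="20", suffix="_p"):
    """Return the column names, appending `suffix` to those starting with `prefix`."""
    return [f"{c}{suffix}" if c.startswith(prefix) else c for c in columns]


# Pure-Python check of the naming rule (no Spark needed):
print(suffix_year_columns(["id", "2019", "2020", "2021"]))
# → ['id', '2019_p', '2020_p', '2021_p']

# With a PySpark DataFrame `df`, the new names could then be applied positionally:
# df = df.toDF(*suffix_year_columns(df.columns))
```

Note that in a Python f-string the placeholder is written `{c}`; a leading `$` (as in Scala string interpolation) would end up as a literal character in the column name.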
Robert Kossendey
  • Hi @Robert Kossendey, thanks for the answer. I have two dataframes with the same columns, one with names ending in _p and one without. For the final result, I want to select either the column without _p or the one with _p, based on a condition in a when statement. How do I achieve this? – sparc Oct 31 '22 at 21:52
  • This is a different question, @sparc. Maybe you can create a new one; that helps people who have the same question as you. – Robert Kossendey Oct 31 '22 at 22:00
  • Hi @Robert Kossendey, I have raised another question. https://stackoverflow.com/questions/74269611/select-columns-based-on-a-condition-pyspark Please can you answer. Thank you. – sparc Oct 31 '22 at 22:11