I have a DataFrame with three columns:

+-----+---+-------+
| name| id|Subject|
+-----+---+-------+
|  one|  1|Science|
|  two|  2|  Maths|
|three|  3|Science|
| four|  4| random|
+-----+---+-------+

I need to replace every value in the first column with the name of the third column, so the resulting table looks like this:

+-------+---+-------+
|   name| id|Subject|
+-------+---+-------+
|Subject|  1|Science|
|Subject|  2|  Maths|
|Subject|  3|Science|
|Subject|  4| random|
+-------+---+-------+

Can someone help me achieve this in PySpark?

Prathik Kini
sri
  • Import `lit` from `pyspark.sql.functions` and then do `df = df.withColumn("name", lit(df.columns[2]))` – pault Jun 27 '19 at 02:11
  • Possible duplicate of [How to add a constant column in a Spark DataFrame?](https://stackoverflow.com/questions/32788322/how-to-add-a-constant-column-in-a-spark-dataframe) – pault Jun 27 '19 at 02:11

0 Answers