Right now I have a PySpark data frame like this:

x_data  y_data
2.5      2.5
2.5      2.5
2.5      2.5
2.5      2.5

I want to add a new column, Name, whose value is "Smith" in every row.

**How do I create a data frame like this using pyspark?**


x_data  y_data    Name
2.5      2.8      Smith
7.5      5.1      Smith
1.5      1.5      Smith
8.5      6.5      Smith

1 Answer

You can use `withColumn` together with `F.lit` to add a new column that holds the same literal value in every row:

import pyspark.sql.functions as F

df2 = df.withColumn('Name', F.lit('Smith'))