Add a column to a Spark Dataset from two other columns

Question

I have a Dataset<Row> in Spark like this:

+----+-------+
| age|   name|
+----+-------+
|  15|Michael|
|  30|   Andy|
|  19| Justin|
+----+-------+

Now I want to add a column whose value is the string value of age followed by the string value of name, like this:

+----+-------+---------+
| age|   name|   cbdkey|
+----+-------+---------+
|  15|Michael|15Michael|
|  30|   Andy|   30Andy|
|  19| Justin| 19Justin|
+----+-------+---------+

I use:

df.withColumn("cbdkey",col("age").+(col("name"))).show()

But every value in the new cbdkey column is null. How should I do this? Thanks in advance.

score 2 · Answer 1 · answered Nov 24 '17 at 06:54

You can use the concat function:

import org.apache.spark.sql.functions.{col, concat, concat_ws}
df.withColumn("cbdkey", concat(col("age"), col("name"))).show
+---+-------+---------+
|age|   name|   cbdkey|
+---+-------+---------+
| 15|Michael|15Michael|
| 30|   Andy|   30Andy|
| 19| Justin| 19Justin|
+---+-------+---------+
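
As a hedged illustration of why the original attempt produced nulls: `+` on columns is numeric addition, so Spark tries to cast name to a number, the cast fails, and the sum is null. The sketch below rebuilds the sample data and compares both approaches. The session setup, the app name, and the explicit `spark.sql.ansi.enabled` setting are assumptions added for illustration; with ANSI mode enabled the `+` version would be expected to raise an error rather than return null.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat}

// Local session for experimentation; in spark-shell a session named `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("concat-sketch")                  // arbitrary, hypothetical name
  .config("spark.sql.ansi.enabled", "false") // assumption: non-ANSI mode, matching the nulls in the question
  .getOrCreate()
import spark.implicits._

// Sample data mirroring the question.
val df = Seq((15, "Michael"), (30, "Andy"), (19, "Justin")).toDF("age", "name")

// `+` is numeric addition: name is cast to a number, the cast fails, and the sum is null.
val withPlus = df.withColumn("cbdkey", col("age") + col("name"))

// concat works on strings; the cast to string makes the intended conversion explicit.
val withConcat = df.withColumn("cbdkey", concat(col("age").cast("string"), col("name")))

withPlus.show()
withConcat.show()
```

The explicit `cast("string")` is optional here, since `concat` accepted the integer column directly in the answer above, but it documents the intent.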

If you need to specify a custom separator, use concat_ws:

df.withColumn("cbdkey", concat_ws(",", col("age"), col("name"))).show
+---+-------+----------+
|age|   name|    cbdkey|
+---+-------+----------+
| 15|Michael|15,Michael|
| 30|   Andy|   30,Andy|
| 19| Justin| 19,Justin|
+---+-------+----------+
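
One difference between the two functions that may matter in practice is null handling: `concat` returns null if any input is null, while `concat_ws` skips null inputs. A minimal sketch, assuming a local session and hypothetical data in which one row has no name:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat, concat_ws}

// Local session for experimentation; the app name is arbitrary.
val spark = SparkSession.builder().master("local[*]").appName("concat-nulls").getOrCreate()
import spark.implicits._

// Hypothetical data: the second row has a null name.
val people = Seq[(Int, Option[String])](
  (15, Some("Michael")),
  (30, None)
).toDF("age", "name")

val result = people
  .withColumn("plain", concat(col("age"), col("name")))          // null when any input is null
  .withColumn("joined", concat_ws(",", col("age"), col("name"))) // null inputs are skipped

result.show()
```

If missing values should appear as an empty string in the `concat` result instead, wrapping each input in `coalesce(..., lit(""))` is one common option.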

Thank you, your answer works. – zpwpal Nov 24 '17 at 07:01

Prasad Khode · Answer 2 · 2017-11-27T06:26:31.510

2

Another way is to write a UDF (user-defined function) and call it on the DataFrame:

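import org.apache.spark.sql.functions.{col, udf}

// Build the key by rendering the integer age and the name into one string
val concatUDF = udf {
  (age: Int, name: String) => s"$age$name"
}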

df.withColumn("cbdkey", concatUDF(col("age"), col("name"))).show()

Output:

+---+-------+---------+
|age|   name|   cbdkey|
+---+-------+---------+
| 15|Michael|15Michael|
| 30|   Andy|   30Andy|
| 19| Justin| 19Justin|
+---+-------+---------+
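
A caveat worth sketching for the UDF route is null handling. With a primitive parameter such as Int, Spark returns null from the UDF when that input is null, and a null name would be rendered as the text "null" by plain string concatenation. The variant below is only one possible approach, using boxed `java.lang.Integer` so the function can check for null itself; the name `concatNullSafe` and the sample rows are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Local session for experimentation; the app name is arbitrary.
val spark = SparkSession.builder().master("local[*]").appName("udf-null-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data with one missing age and one missing name.
val people = Seq[(Option[Int], Option[String])](
  (Some(15), Some("Michael")),
  (None, Some("Andy")),
  (Some(19), None)
).toDF("age", "name")

// Boxed java.lang.Integer lets the function see a null age itself;
// missing parts are dropped instead of being rendered as "null".
val concatNullSafe = udf { (age: java.lang.Integer, name: String) =>
  Seq(Option(age).map(_.toString), Option(name)).flatten.mkString
}

val keyed = people.withColumn("cbdkey", concatNullSafe(col("age"), col("name")))
keyed.show()
```

Because a UDF is opaque to Spark's optimizer, the built-in functions are generally preferable when they cover the case.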

edited Nov 27 '17 at 06:26

answered Nov 24 '17 at 07:00


1

A UDF is not required here; Spark SQL supports `concat` and `concat_ws`. – philantrovert Nov 24 '17 at 07:21
