How to concatenate columns using a variable inside a function in PySpark

Question

I need to append one column to another and have written a function to perform the same:

def concat_content(input_df, left_column, right_columns):
    for col_to_change in right_columns:
        print(col_to_change)
        input_df = input_df.withColumn(F.col(col_to_change), F.concat(F.col(left_column), F.lit(" | "),F.coalesce(F.col(col_to_change), F.lit("None"))))

    return input_df

new_final = concat_content(final, "name_txt", ["group_txt", "sub_group_txt"])

but I am getting this error:

TypeError: Column is not iterable

What can I try to resolve this?

`withColumn` needs a string, the column name, as its first argument. So just don't wrap your first argument, `col_to_change`, in `F.col`, and you should be fine. — Rayan Ral, May 20 '20 at 17:18
Does this answer your question? [Concatenate columns in Apache Spark DataFrame](https://stackoverflow.com/questions/31450846/concatenate-columns-in-apache-spark-dataframe) — Ani Menon, May 20 '20 at 18:11

score 1 · Accepted Answer · answered May 20 '20 at 18:41

Try this

from pyspark.sql import functions as F

def concat_content(input_df, left_column, right_columns):
    for col_to_change in right_columns:
        print(col_to_change)
        # Pass the column name as a plain string, not F.col(col_to_change)
        input_df = input_df.withColumn(
            col_to_change,
            F.concat(F.col(left_column), F.lit(" | "), F.coalesce(F.col(col_to_change), F.lit("None"))),
        )

    return input_df

new_final = concat_content(final, "name_txt", ["group_txt", "sub_group_txt"])

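`withColumn` takes a string (the column name) as its first argument, not a `Column`. Passing `F.col(col_to_change)` there is what raises `TypeError: Column is not iterable`.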
