I want to sum all columns whose names contain a specific string ('scr_') into a new column. I looked at "How can I sum multiple columns in a spark dataframe in pyspark?", but it is not working for me. What is going wrong?

from pyspark.sql.types import *
import pyspark.sql.functions as F
from pyspark.sql import Window 
from functools import reduce
from operator import add

def script():
  points = [s for s in df.columns if 'scr_' in s]
  df = df.withColumn('TOTAL', F.sum([df[col] for col in points]))

This gives me: TypeError: 'Column' object is not callable
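
For reference, here is a minimal sketch of the row-wise sum I am trying to get, assuming `F.sum` is the wrong tool because it aggregates one column over rows instead of adding several columns together. The helper name `add_total` and its parameters `marker` and `total_col` are made up for illustration. It only relies on `df.columns`, `df[...]` and `df.withColumn`, and uses `reduce(add, ...)` to build the expression `col1 + col2 + ...`:

```python
from functools import reduce
from operator import add


def add_total(df, marker="scr_", total_col="TOTAL"):
    """Return df with a new column holding the row-wise sum of matching columns.

    add_total, marker and total_col are hypothetical names for this sketch.
    """
    # Pick every column whose name contains the marker string.
    points = [c for c in df.columns if marker in c]
    if not points:
        raise ValueError(f"no column name contains {marker!r}")
    # reduce(add, ...) chains the Column objects with "+",
    # which yields one expression evaluated per row.
    total = reduce(add, [df[c] for c in points])
    return df.withColumn(total_col, total)
```

With a real DataFrame this would be called as `df = add_total(df)`. Note that in Spark a null in any of the summed columns makes the total null for that row, so wrapping each column in `F.coalesce(df[c], F.lit(0))` may be needed.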

Thank you!

Niels
