
I have a DataFrame with 4 numerical columns, and I need to compute the row-wise mean of those columns and store it in a new column, in PySpark.

In pandas I would do: `df["mean"] = df.loc[:, d_cols].apply(np.mean, axis=1)`

How do I do the same thing in PySpark?

pauli
  • Hi there, welcome to Stack Overflow. To help others answer your question, please consider editing it to add a minimum reproducible example. See http://stackoverflow.com/help/mcve – HAVB Mar 22 '18 at 23:12
  • duplicate of [this question](https://stackoverflow.com/questions/32670958/spark-dataframe-computing-row-wise-mean-or-any-aggregate-operation) – pauli Mar 23 '18 at 08:45
  • Possible duplicate of [Spark DataFrame: Computing row-wise mean (or any aggregate operation)](https://stackoverflow.com/questions/32670958/spark-dataframe-computing-row-wise-mean-or-any-aggregate-operation) – pault Mar 23 '18 at 14:14

0 Answers