I am analysing a k-means clustering algorithm in PySpark and I have a question about the syntax. This is the relevant part of the code:
from pyspark.ml.clustering import KMeans, KMeansModel
import numpy as np
# Configure k-means with 5 clusters and a fixed seed
kmeans_modeling = KMeans(k=5, seed=0)
# Fit on the "parameters" column of the DataFrame `data` (defined elsewhere)
model = kmeans_modeling.fit(data.select("parameters"))
What does seed = 0 mean? Surely we cannot initialize all the clusters at the same point given by the seed, or we wouldn't obtain distinct clusters, right?
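
To make the question concrete, here is a minimal sketch of how a seed usually behaves, assuming `seed` plays its conventional role as the seed of a random number generator rather than a starting coordinate. It uses only Python's standard `random` module, not PySpark, and `pick_initial_centers` and the toy `points` list are hypothetical names invented for illustration. PySpark's actual initialization (its default mode is `k-means||`) is more elaborate than this.

```python
import random

def pick_initial_centers(points, k, seed):
    """Hypothetical helper: pick k distinct points as initial centers."""
    # The seed fixes the sequence of random draws; it is not a location.
    rng = random.Random(seed)
    # sample() draws k different elements, so the centers are distinct.
    return rng.sample(points, k)

# Toy 2-D data set with 20 distinct points (made up for this example).
points = [(float(i), float(i * i % 7)) for i in range(20)]

centers_a = pick_initial_centers(points, k=5, seed=0)
centers_b = pick_initial_centers(points, k=5, seed=0)

print(centers_a == centers_b)  # the same seed reproduces the same initial centers
print(len(set(centers_a)))     # number of distinct initial centers
```

Under that assumption, seed = 0 would only make the random initialization reproducible from run to run; the five initial centers would still be different points.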