I am trying to cluster my data using hierarchical clustering and plot a dendrogram. My dataset has 400000 rows and 90 columns. I split the data with test_size = 0.2 and apply feature scaling before drawing the dendrogram.

Can someone help me with the error? Thanks.

from sklearn.model_selection import train_test_split

X = customer.iloc[:, [2,3]].values
y = customer.iloc[:,0]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
X_train = sc_X.fit_transform(X_train)
X_test = sc_X.transform(X_test)
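sc_y = StandardScaler()
# StandardScaler expects a 2D array, so reshape the 1D target into a single column
y_train = sc_y.fit_transform(y_train.to_numpy().reshape(-1, 1))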

import matplotlib.pyplot as plt
import scipy.cluster.hierarchy as sch
dendrogram = sch.dendrogram(sch.linkage(X_test, method = 'ward'))
plt.title('Dendrogram')
plt.xlabel('Customers')
plt.ylabel('Euclidean distances')
plt.show()

I got this error message:

File "C:\Users\anaconda3\lib\site-packages\scipy\cluster\hierarchy.py", line 3433, in _append_singleton_leaf_node
    ivl.append(str(int(i)))
RecursionError: maximum recursion depth exceeded while getting the str of an object

Dora11111

1 Answer

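This error comes from one of several CPython implementation limitations: the interpreter caps the recursion depth (1000 by default) to protect the C stack, and scipy builds the dendrogram by walking the linkage tree recursively, so a large, deep tree can exceed that cap. (Another such limitation is the GIL, which prevents threads from executing Python bytecode in parallel.)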

In short, to fix it you can raise the recursion limit before calling dendrogram:

import sys

sys.setrecursionlimit(100000)

For more information about this limitation, see: https://stackoverflow.com/a/13592002/427887
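To make the mechanism concrete, here is a minimal sketch that uses only the standard library. The helpers `build_chain` and `depth` are hypothetical names made up for this illustration; they are not part of scipy. The nested tuples stand in for a very unbalanced dendrogram, which is roughly the shape that makes a recursive traversal go deep. The sketch assumes the interpreter starts with the default limit of 1000.

```python
import sys

def build_chain(n):
    """Build a maximally unbalanced binary tree of nested tuples, iteratively."""
    tree = 0  # a leaf
    for i in range(1, n + 1):
        tree = (tree, i)  # each step adds one level on the left branch
    return tree

def depth(node):
    """Recursively compute the depth of the tree (leaves are plain ints)."""
    if not isinstance(node, tuple):
        return 0
    return 1 + max(depth(node[0]), depth(node[1]))

tree = build_chain(5000)
old_limit = sys.getrecursionlimit()

# With the default limit this traversal is expected to fail.
try:
    depth(tree)
    failed_with_default = False
except RecursionError:
    failed_with_default = True

# Raise the limit only around the deep call, then restore it.
sys.setrecursionlimit(20000)
try:
    result = depth(tree)
finally:
    sys.setrecursionlimit(old_limit)

print(failed_with_default, result)
```

Restoring the old limit afterwards is a design choice, not a requirement. Keep in mind that the limit exists to protect the C stack, so setting it far higher than needed can crash the interpreter instead of raising a clean RecursionError.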

bioinfornatics