I am trying to run agglomerative clustering in a Jupyter notebook.
The shape of my dataset is (406829, 8).
I tried the following code:
import pandas as pd
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
import os
from sklearn.preprocessing import StandardScaler, LabelEncoder
import scipy.cluster.hierarchy as shc
from sklearn.cluster import AgglomerativeClustering
# Apply the agglomerative clustering with ward linkage
aggloclust = AgglomerativeClustering(affinity='euclidean',linkage='ward', memory=None, n_clusters=5).fit(data)
print(aggloclust)
# Agglomerative clustering labels
labels = aggloclust.labels_
# Show the clusters on the graph, using the first two columns of data
x = np.asarray(data)
plt.scatter(x[:, 0], x[:, 1], c=labels)
plt.show()
Then I ran into the following error:
MemoryError: Unable to allocate 617. GiB for an array with shape (82754714206,) and data type float64
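As a sanity check on where that number comes from, here is a small sketch of the arithmetic. It assumes (my guess, not something the traceback states) that the array is the condensed pairwise-distance matrix, which holds one float64 value per unordered pair of rows. The variable names are just for illustration.

```python
# Sanity check of the size reported in the MemoryError.
# Assumption: the array is a condensed pairwise-distance matrix,
# i.e. one float64 per unordered pair of rows in the dataset.
n_samples = 406829                           # rows in my dataset
n_pairs = n_samples * (n_samples - 1) // 2   # number of unordered pairs
bytes_needed = n_pairs * 8                   # float64 takes 8 bytes
gib_needed = bytes_needed / 2**30            # bytes -> GiB

print(n_pairs)              # 82754714206, the shape in the error
print(round(gib_needed))    # 617
```

If that assumption is right, the required memory grows with the square of the number of rows, so the allocation is far beyond my 16 GB of RAM regardless of notebook settings.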
I am working on a Windows machine with 16 GB of RAM, using Python 3.8.5. Can anyone tell me how I can resolve this issue?
I googled this error and found a suggested solution: create the Jupyter config file and update max_buffer_size in it. I found it here: How to increase Jupyter notebook Memory limit?
I tried the solution from that link, but it did not work. Please help me.