
I am trying to implement a K-Nearest Neighbours classification model on a dataset of shape (60000, 32, 32) on my system (16 GB RAM, i5 8th-gen processor, 256 GB hard disk). Although I have normalized the data, predictions are still taking an enormous amount of time because of the size of the data. Is there any way to utilize multiple cores of my system, or to increase the RAM allocated to Jupyter Notebook, so as to save on computation time and speed up the calculations?

Sarvagya Dubey
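
A minimal sketch of the setup described in the question, assuming scikit-learn's `KNeighborsClassifier`; the random arrays and the names `X_train`/`y_train` are placeholders standing in for the real data:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train = rng.random((60000, 32, 32), dtype=np.float32)  # stand-in for the image data
y_train = rng.integers(0, 10, size=60000)                # stand-in for the labels

# scikit-learn estimators expect 2-D input, so each 32x32 image is flattened
# into a 1024-dimensional vector; values here are already in [0, 1].
X_train = X_train.reshape(X_train.shape[0], -1)

knn = KNeighborsClassifier(n_neighbors=5)  # single-core, default settings
knn.fit(X_train, y_train)                  # fitting KNN is cheap: it mostly stores the data

X_test = rng.random((100, 32, 32), dtype=np.float32).reshape(100, -1)
predictions = knn.predict(X_test)          # the neighbour search at predict time is the slow part
```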
  • Not all parts of `scikit-learn` support parallel processing, but if you use something which has an `n_jobs` parameter, that is what you can try setting to 6. – tevemadar Jan 13 '20 at 18:06
  • There is a way of increasing the Jupyter Notebook memory limit; check out this question: [Jupyter notebook memory limit](https://stackoverflow.com/questions/57948003/jupyter-notebook-memory-limit) – Manuel Jan 13 '20 at 18:12
  • To speed up computation you could also set the `algorithm` parameter of KNN to `'ball_tree'` or `'kd_tree'`, so you get faster nearest-neighbour queries. You could also try some feature selection or feature engineering, as 60000 1024-dimensional samples is pretty big and the data will probably have a lower-dimensional representation of quite good quality. Are the samples 32x32 images? What is the size of the raw data? – dhasson Jan 14 '20 at 13:43 (a combined sketch of these suggestions follows the comments)
  • I did set the algorithm to `kd_tree` owing to the huge dimensionality, but it still got stuck. Yes, indeed it's the MNIST 32x32 dataset. – Sarvagya Dubey Jan 14 '20 at 15:19
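
A hedged sketch combining the suggestions from the comments: a tree-based `algorithm`, parallel neighbour search via `n_jobs`, and PCA to shrink the 1024 features before the search. The 50-component PCA and `n_jobs=-1` are illustrative choices rather than values from the thread, and random arrays again stand in for the real data:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X_train = rng.random((60000, 1024), dtype=np.float32)  # flattened 32x32 images (placeholder)
y_train = rng.integers(0, 10, size=60000)              # placeholder labels

model = make_pipeline(
    PCA(n_components=50),              # illustrative: reduce 1024 features to 50
    KNeighborsClassifier(
        n_neighbors=5,
        algorithm="kd_tree",           # or "ball_tree", as suggested above
        n_jobs=-1,                     # use all available cores (or e.g. 6)
    ),
)
model.fit(X_train, y_train)

X_test = rng.random((100, 1024), dtype=np.float32)
predictions = model.predict(X_test)
```

Tree-based searches tend to pay off only after the dimensionality has been reduced, so the PCA step and the `algorithm` choice work together here; raising the Jupyter memory limit (the linked question) is a separate, configuration-level change.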

0 Answers