Questions tagged [k-fold]

A technique in cross-validation where the data is partitioned into k subsets (or "folds"). The model is trained on k-1 folds and evaluated on the remaining fold, and the process is repeated k times, leaving out a different fold for evaluation each time.
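For illustration, a minimal sketch of the scheme using scikit-learn's KFold (the toy data here is made up):

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)  # 10 samples, 2 features
kf = KFold(n_splits=5)

# Each iteration holds out a different fold of 2 samples for evaluation.
for fold, (train_idx, test_idx) in enumerate(kf.split(X)):
    print(f"fold {fold}: train={train_idx}, test={test_idx}")
```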

284 questions
27
votes
3 answers

StratifiedKFold vs KFold in scikit-learn

I use this code to test KFold and StratifiedKFold. import numpy as np from sklearn.model_selection import KFold,StratifiedKFold X = np.array([ [1,2,3,4], [11,12,13,14], [21,22,23,24], [31,32,33,34], [41,42,43,44], …
user9270170
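A minimal sketch of the difference the question is probing, with made-up labels: KFold splits by position and ignores y, while StratifiedKFold preserves the class ratio of y in every fold.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

X = np.arange(12).reshape(6, 2)
y = np.array([0, 0, 0, 1, 1, 1])

# KFold ignores the labels, so a test fold can be all one class.
for _, test_idx in KFold(n_splits=3).split(X):
    print("KFold test labels:", y[test_idx])

# StratifiedKFold puts one sample of each class in every test fold here.
for _, test_idx in StratifiedKFold(n_splits=3).split(X, y):
    print("StratifiedKFold test labels:", y[test_idx])
```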
25
votes
3 answers

Separate pandas dataframe using sklearn's KFold

I had obtained the index of training set and testing set with code below. df = pandas.read_pickle(filepath + filename) kf = KFold(n_splits = n_splits, shuffle = shuffle, random_state = randomState) result = next(kf.split(df), None) #train can be…
Mervyn Lee
  • 1,957
  • 4
  • 28
  • 54
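A sketch of the pattern the question describes, on a toy dataframe: kf.split yields positional index arrays, so the frame is sliced with .iloc.

```python
import pandas as pd
from sklearn.model_selection import KFold

df = pd.DataFrame({"a": range(10), "b": range(10, 20)})
kf = KFold(n_splits=5, shuffle=True, random_state=42)

# split() yields positional indices, so slice the frame with .iloc.
train_idx, test_idx = next(kf.split(df))
train_df, test_df = df.iloc[train_idx], df.iloc[test_idx]
```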
9
votes
1 answer

MemoryError: Unable to allocate 30.4 GiB for an array with shape (725000, 277, 76) and data type float64

It gives that memory error, but memory capacity is never reached. I have 60 GB of RAM on the SSH server, and processing the full dataset consumes 30. I am trying to train an autoencoder with k-fold. Without k-fold the training works fine. The raw dataset…
iftekm
  • 93
  • 1
  • 1
  • 3
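Two common mitigations for this kind of allocation failure, sketched with a small illustrative shape (not the poster's data): store features as float32 to halve the footprint, or back the array with a disk memmap so it is never fully resident in RAM.

```python
import numpy as np

# float32 halves the memory footprint relative to float64.
data = np.random.rand(100, 277, 76).astype(np.float32)

# A disk-backed memmap avoids one huge in-memory allocation; the OS pages
# slices in and out on demand.
mm = np.memmap("features.dat", dtype=np.float32, mode="w+",
               shape=(100, 277, 76))
mm[:] = data
mm.flush()
```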
8
votes
2 answers

Forcing sklearn cross val score to use stratified k fold?

Based on Sklearn Docs: Is it possible to force the use of StratifiedKFold? How can I know which KFold has been used?
Marine Galantin
  • 1,634
  • 1
  • 17
  • 28
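A sketch of the usual answer: with an integer cv and a classifier, cross_val_score already uses stratified folds, and passing a splitter object pins the strategy explicitly.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=100, random_state=0)

# Passing a splitter object (rather than an int) forces StratifiedKFold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
print(cross_val_score(LogisticRegression(), X, y, cv=cv))
```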
8
votes
4 answers

Cross validation for MNIST dataset with pytorch and sklearn

I am new to PyTorch and am trying to implement a feed-forward neural network to classify the MNIST dataset. I have some problems when trying to use cross-validation. My data has the following shapes: x_train: torch.Size([45000, 784]) and y_train:…
Kimmen
  • 183
  • 1
  • 1
  • 8
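One common pattern, sketched with random stand-in tensors (the question's real shape is 45000 x 784): let sklearn's KFold generate the index arrays and index the PyTorch tensors with them.

```python
import torch
from sklearn.model_selection import KFold

x_train = torch.randn(100, 784)         # stand-in for the MNIST features
y_train = torch.randint(0, 10, (100,))  # stand-in for the labels

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (tr_idx, va_idx) in enumerate(kf.split(x_train)):
    # PyTorch tensors accept the NumPy index arrays sklearn produces.
    x_tr, y_tr = x_train[tr_idx], y_train[tr_idx]
    x_va, y_va = x_train[va_idx], y_train[va_idx]
    # ... build a fresh model each fold, train on (x_tr, y_tr), validate on (x_va, y_va)
```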
8
votes
1 answer

How to do groupKfold validation and have balanced data?

I'm splitting some data into train and test sets according to group values. How can I do this in order to have balanced data? To solve a binary classification task I have 100 samples, each one with a unique ID, a subject, and a label (1 or 0). In…
Albe
  • 109
  • 4
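A sketch of one answer, assuming scikit-learn 1.0+ where StratifiedGroupKFold is available: it keeps each group in a single fold while approximately preserving the label ratio (the data here is made up).

```python
import numpy as np
from sklearn.model_selection import StratifiedGroupKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = rng.integers(0, 2, size=100)         # binary labels
groups = rng.integers(0, 20, size=100)   # subject IDs

sgkf = StratifiedGroupKFold(n_splits=5)
for train_idx, test_idx in sgkf.split(X, y, groups):
    # No subject appears on both the train and test side of a split.
    assert set(groups[train_idx]).isdisjoint(groups[test_idx])
```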
7
votes
1 answer

How to measure xgboost regressor accuracy using accuracy_score (or other suggested function)

I'm writing code to solve a simple problem: predicting the probability of an item being missing from an inventory. I'm using the XGBoost prediction model to do this. I have the data split into two .csv files, one with the Train Data and the other with the Test…
Pedro Nader
  • 75
  • 1
  • 1
  • 5
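The usual resolution, sketched on synthetic data (assuming the xgboost package is installed): accuracy_score is a classification metric, so a regressor such as XGBRegressor is scored with continuous-error metrics instead.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

X, y = make_regression(n_samples=200, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = XGBRegressor(n_estimators=50).fit(X_tr, y_tr)
pred = model.predict(X_te)

# Regression outputs are continuous, so score with error metrics,
# not accuracy_score.
print("MSE:", mean_squared_error(y_te, pred))
print("R2: ", r2_score(y_te, pred))
```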
6
votes
1 answer

Getting TypeError: Singleton array array(None, dtype=object) cannot be considered a valid collection

I am using different cross-validation methods. I first used the k-fold method in my code and it worked perfectly well, but when I use the RepeatedStratifiedKFold method it gives me this error: TypeError: Singleton array array(None, dtype=object) cannot be…
Rao Kiran
  • 61
  • 1
  • 4
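This TypeError is usually raised when X or y is still None by the time it reaches the splitter. A minimal working sketch of RepeatedStratifiedKFold on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Real arrays, not None: the "Singleton array array(None, ...)" error
# typically means a None slipped in as X or y.
X, y = make_classification(n_samples=100, random_state=0)

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
print(cross_val_score(LogisticRegression(), X, y, cv=cv).mean())
```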
5
votes
1 answer

Should I put shuffle=True or False in sklearn KFold cross validation?

I'm studying some cross-validation scores on my dataset using cross_val_score and KFold. In particular, my code looks like this: cross_val_score(estimator=model, X=X, y=y, scoring='r2', cv=KFold(shuffle=True)) My question is if it's a common…
James Arten
  • 523
  • 5
  • 16
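A sketch of the common recommendation: shuffle when the rows may be ordered (by class, time, or collection batch), and fix random_state so the shuffled splits stay reproducible.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

X, y = load_diabetes(return_X_y=True)

# shuffle=True guards against ordered rows landing in contiguous folds;
# random_state makes the shuffled splits reproducible across runs.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
print(cross_val_score(Ridge(), X, y, scoring="r2", cv=cv))
```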
5
votes
0 answers

StandardScaler to whole training dataset or to individual folds for Cross Validation

I'm currently using cross_val_score and KFold to assess the impact of using StandardScaler at different points within data pre-processing, specifically whether scaling the entire training dataset prior to performing cross validation introduces data…
AlexTerry
  • 61
  • 3
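The standard leak-free pattern, sketched on a built-in dataset: put the scaler inside a Pipeline so it is refit on each training fold only, and the held-out fold never influences the scaling statistics.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# The scaler is fit inside each training fold, so no information from the
# held-out fold leaks into the scaling statistics.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000))
cv = KFold(n_splits=5, shuffle=True, random_state=0)
print(cross_val_score(pipe, X, y, cv=cv))
```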
4
votes
1 answer

Huggingface Trainer(): K-Fold Cross Validation

I am following this tutorial from TowardsDataScience for text classification using the Huggingface Trainer. To get a more robust model I want to do k-fold cross-validation, but I am not sure how to do this with the Huggingface Trainer. Is there a built-in…
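There is no built-in k-fold mode in Trainer; the usual pattern is an outer loop that builds a fresh Trainer per fold. A hedged sketch where `tokenized_ds` (a datasets.Dataset), `model_init`, and `training_args` are placeholders for the poster's own objects:

```python
from sklearn.model_selection import KFold
from transformers import Trainer

kf = KFold(n_splits=5, shuffle=True, random_state=0)
indices = list(range(len(tokenized_ds)))  # tokenized_ds: placeholder Dataset

for fold, (train_idx, val_idx) in enumerate(kf.split(indices)):
    trainer = Trainer(
        model_init=model_init,   # placeholder: returns fresh model weights
        args=training_args,      # placeholder TrainingArguments
        train_dataset=tokenized_ds.select(train_idx),
        eval_dataset=tokenized_ds.select(val_idx),
    )
    trainer.train()
    print(f"fold {fold}:", trainer.evaluate())
```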
4
votes
1 answer

How to implement K-Fold Cross validation using Image data generator and using Flow from dataframe (using CSV file)

Please show or explain a dummy example code snippet demonstrating K-Fold Cross Validation with Flow_from_Dataframe, Training_Generator, and Valid_Generator objects for Keras. This is the current code I have (no k-fold only simple fitting…
Bhuvan S
  • 213
  • 1
  • 4
  • 10
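A hedged sketch of one way to combine the two, where `df` (with "filename" and "class" columns) and `img_dir` are placeholders rather than the poster's actual variables: split the dataframe with KFold and build a fresh pair of generators per fold.

```python
from sklearn.model_selection import KFold
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1.0 / 255)
kf = KFold(n_splits=5, shuffle=True, random_state=0)

for fold, (tr_idx, va_idx) in enumerate(kf.split(df)):
    # A fresh generator per fold, each reading its own slice of the frame.
    train_gen = datagen.flow_from_dataframe(
        df.iloc[tr_idx], directory=img_dir,
        x_col="filename", y_col="class", class_mode="categorical")
    valid_gen = datagen.flow_from_dataframe(
        df.iloc[va_idx], directory=img_dir,
        x_col="filename", y_col="class", class_mode="categorical")
    # ... build a fresh model, then model.fit(train_gen, validation_data=valid_gen)
```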
4
votes
1 answer

Does GridSearchCV return the best_estimator_ after fitting?

Let's say we tune an SVM with GridSearch like this: algorithm = SVM() parameters = {'kernel': ['rbf', 'sigmoid'], 'C': [0.1, 1, 10]} grid= GridSearchCV(algorithm, parameters) grid.fit(X, y) You then wish to use the best fit parameters/estimator in…
Bram Vanroy
  • 27,032
  • 24
  • 137
  • 239
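A sketch of the documented behavior: with the default refit=True, GridSearchCV refits best_estimator_ on the full data with the winning parameters, so it is ready for prediction.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, random_state=0)

parameters = {"kernel": ["rbf", "sigmoid"], "C": [0.1, 1, 10]}
grid = GridSearchCV(SVC(), parameters)  # refit=True by default
grid.fit(X, y)

# best_estimator_ is already refit on all of (X, y) with the best parameters.
print(grid.best_params_)
print(grid.best_estimator_.predict(X[:5]))
```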
3
votes
2 answers

Application and Deployment of K-Fold Cross-Validation

K-Fold Cross Validation is a technique applied for splitting up the data into K number of Folds for testing and training. The goal is to estimate the generalizability of a machine learning model. The model is trained K times, once on each train fold…
notMyName
  • 690
  • 2
  • 6
  • 17
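A sketch of the usual two-step answer on a built-in dataset: k-fold CV is only used to estimate generalization performance, and the deployed model is then refit once on all available data.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(random_state=0)

# Step 1: k-fold CV estimates how well this model class generalizes.
print("estimated accuracy:", cross_val_score(model, X, y, cv=5).mean())

# Step 2: for deployment, fit one final model on all available data.
model.fit(X, y)
```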
3
votes
2 answers

difference between cross_val_score and KFold

I am learning machine learning and I have a doubt. Can anyone tell me the difference between from sklearn.model_selection import cross_val_score and from sklearn.model_selection import KFold? I think both are used for k-fold cross…
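A sketch of the distinction on a built-in dataset: KFold is a splitter that only generates train/test indices, while cross_val_score runs the whole fit-and-score loop (and can take a KFold object as its cv argument).

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)

# KFold only generates index splits ...
kf = KFold(n_splits=5, shuffle=True, random_state=0)

# ... while cross_val_score fits and scores the model on every split.
print(cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=kf))
```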