
I am running cross-validation on a dataset and getting

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() 

at the beginning of the second iteration of the for loop in kFCVEvaluate. What am I doing wrong?

I looked at other posts about this error. They are about using the and/or operators, but I don't use any logical operators.

from random import randrange

from sklearn.tree import DecisionTreeRegressor
model = DecisionTreeRegressor(random_state=42, max_depth=5)

def split_dataset(dataset):
    numFolds = 10
    dataSplit = list()
    dataCopy = list(dataset)
    foldSize = int(len(dataset) / numFolds)
    for _ in range(numFolds):
        fold = list()
        while len(fold) < foldSize:
            index = randrange(len(dataCopy))
            fold.append(dataCopy.pop(index))
        dataSplit.append(fold)
    return dataSplit

def kFCVEvaluate(dataset):

    folds = split_dataset(dataset)
    scores = list()
    for fold in folds:
        trainSet = list(folds)
        trainSet.remove(fold)
        trainSet = sum(trainSet, [])
        testSet = list()
        for row in fold:
            rowCopy = list(row)
            testSet.append(rowCopy)
            
        trainLabels = [row[-1] for row in trainSet]
        trainSet = [train[:-1] for train in trainSet]
        model.fit(trainSet,trainLabels)
        
        actual = [row[-1] for row in testSet]
        testSet = [test[:-1] for test in testSet]
        
        predicted = model.predict(testSet)
        
        accuracy = actual-predicted
        scores.append(accuracy)
        print(scores)
kFCVEvaluate(data)
imdatyaa
  • I looked at other posts about this error. They are about using the and/or operators, but I don't use any logical operators. – imdatyaa Nov 20 '21 at 17:41
  • where's the traceback? – hpaulj Nov 20 '21 at 20:22
  • Having already asked for the traceback, I probably shouldn't be doing this. But if I had to guess, `trainSet.remove(fold)` is the problem line. List `remove` selects the item to remove by value (actually by identity first, then by `==`). So it iterates through the list until it finds an item that `==` the target. But if the list contains arrays, you'll get the `arr1==arr2` test, which returns a boolean array, whereas `remove` can only accept a simple True/False. – hpaulj Nov 20 '21 at 22:17
  • The duplicate talks about pandas Series, while your error is for an array. In any case, the issue is using a multi-element boolean in a Python context that expects a scalar boolean. `if`, `and`, `or` are common contexts. I don't see any of those here (except the `while`). The `any/all` fix only applies to a subset of the cases. But the first task is to identify which operation raises the error. The actual fix will vary. – hpaulj Nov 20 '21 at 22:21

1 Answer


Testing my remove hypothesis:

In [214]: alist = [np.array([1,2,3]), np.ones(3), np.array([4,5])]
In [215]: alist.remove(np.array([1,2,3]))
Traceback (most recent call last):
  File "<ipython-input-215-0b2a68765241>", line 1, in <module>
    alist.remove(np.array([1,2,3]))
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
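
To make the failure explicit, here is a minimal sketch of the comparison that list.remove ends up performing. The names target, item and comparison are only illustrative; the point is that `==` on arrays is elementwise, so the result cannot be collapsed to a single True/False.

```python
import numpy as np

target = np.array([1, 2, 3])   # the value we ask remove() to find
item = np.ones(3)              # a list element that is a different object

# remove() tries identity first, then falls back to ==.
# For arrays, == is elementwise and returns a boolean array.
comparison = item == target
print(comparison)

# remove() then needs one truth value from that array, which is ambiguous.
try:
    bool(comparison)
except ValueError as exc:
    print("ValueError:", exc)

# A whole-array comparison that does return a single bool:
print(np.array_equal(item, target))
```

This is only an illustration of the mechanism, not a claim about where exactly the question's code fails without the traceback.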

remove does work if the list contains plain lists or tuples:

In [216]: alist = [[1,2,3], [1,1,1], [4,5]]
In [217]: alist.remove([1,1,1])
In [218]: alist
Out[218]: [[1, 2, 3], [4, 5]]
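
This would also explain why the error only shows up on the second pass: on the first pass the fold to remove is the first item of the list, so the identity check succeeds and `==` is never evaluated. On later passes remove has to compare the fold against a different fold first, and (assuming the rows are NumPy arrays) that comparison hits the array `==`. One way around it, as a rough sketch, is to leave a fold out by position rather than by value. The helper name train_test_folds and the toy data are hypothetical, not taken from the question.

```python
import numpy as np

def train_test_folds(folds):
    """Yield (train_rows, test_rows) pairs, leaving one fold out each time."""
    for i, test_fold in enumerate(folds):
        # Pick the other folds by index, so no == comparison on arrays is needed.
        train_rows = [row for j, f in enumerate(folds) if j != i for row in f]
        yield train_rows, list(test_fold)

# Hypothetical toy data: 3 folds of 2 rows each, last column is the label.
folds = [
    [np.array([i, i + 1.0, i * 2.0]) for i in range(k, k + 2)]
    for k in (0, 2, 4)
]

for train_rows, test_rows in train_test_folds(folds):
    print(len(train_rows), len(test_rows))
```

In kFCVEvaluate the same idea would replace the `list(folds)` / `remove(fold)` / `sum(trainSet, [])` lines; the rest of the loop could stay as it is.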
hpaulj