I am trying to do a 10-fold cross validation for a sentiment classifier. For this purpose, I have created a list of 10 lists, where each list corresponds to one of the folds and contains movie reviews.

I am trying to write a loop in which every iteration uses 9 folds to train the classifier and the remaining fold to test it. However, I am having difficulty subsetting the list of lists into two variables (one for the held-out fold and one for the 9 remaining folds) that I can pass to my train and test functions.

I created this example as a more-readable version of my code:

list1 = [{"ID":1, "sentiment":"positive", "content": "further lists within lists"}]
list2 = [{"ID":2, "sentiment":"positive", "content": "further lists within lists"}]
list3 = [{"ID":3, "sentiment":"positive", "content": "further lists within lists"}]
list4 = [{"ID":4, "sentiment":"positive", "content": "further lists within lists"}]
list5 = [{"ID":5, "sentiment":"positive", "content": "further lists within lists"}]

list_of_lists = [list1, list2, list3, list4, list5]

for list_ in list_of_lists:
  remaining_lists = list_of_lists[~list_]
  train_classifier(remaining_lists)
  test_classifier(list_)

The error I get is "bad operand type for unary ~: 'list'". I have seen the answers to a related question at Index all *except* one item in python, but I could not implement the solutions suggested in a loop.

Mr_Bull3t

3 Answers

Got it!

  1. What you are doing:

    list_of_list = [[10], [20], [30], [40]]

    for sub_lst in list_of_list:
        print(list_of_list[~sub_lst])

  2. What you are getting: TypeError: bad operand type for unary ~: 'list'

  3. Reason: list_of_list[index_here] accepts only an integer index (or a slice). The index can be a plain integer from 0 to length-1, or an integer inverted with ~. Note that ~i equals -(i+1), so it counts from the end of the list and still selects a single element, not "everything except i". You cannot put a list in the index's place. When you iterate over list_of_list, sub_lst is each inner list in turn, not its position, so ~sub_lst applies ~ to a list and fails before any indexing happens.

    # check the type as follows
    >>> type(list_of_list[0])
    <class 'list'>

  4. How to achieve the desired outcome:

    list_of_list = [[10], [20], [30], [40]]

    # enumerate() gives the position of each sub-list as well as the sub-list itself
    for index, sub_lst in enumerate(list_of_list):
        # everything before the current index plus everything after it
        remaining_lst = list_of_list[:index] + list_of_list[index+1:]

        print("Sub list: ", sub_lst)
        print("Remaining list: ", remaining_lst)
        print()

  5. Results:
Sub list:  [10]
Remaining list:  [[20], [30], [40]]

Sub list:  [20]
Remaining list:  [[10], [30], [40]]

Sub list:  [30]
Remaining list:  [[10], [20], [40]]

Sub list:  [40]
Remaining list:  [[10], [20], [30]]
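
To make the reasoning about ~ concrete, here is a minimal sketch of what the operator does on integers versus lists. The names folds and all_except are made up for this illustration and do not come from the question.

```python
folds = [[10], [20], [30], [40]]

# On an integer, ~i is -(i + 1), so it indexes from the end of the list.
assert ~0 == -1
print(folds[~0])  # [40]
print(folds[~1])  # [30]

# On a list, ~ is not defined at all, which is where the TypeError comes from.
try:
    ~folds[0]
except TypeError as exc:
    print("TypeError:", exc)


def all_except(items, index):
    """Return a new list holding every element of items except the one at index."""
    return items[:index] + items[index + 1:]


print(all_except(folds, 1))  # [[10], [30], [40]]
```

So ~ never means "all except": on an integer it picks one element counted from the end, and only slicing around the index builds the complement.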

I hope this helps!

Himanshu Patel

Loop over the indices of the outer list (list_of_lists) and exclude the list at that index on each iteration of the loop.

Example:

for i in range(len(list_of_lists)):
    remaining_lists = list_of_lists[0:i] + list_of_lists[i+1:]
    train_classifier(remaining_lists)
    excluded_list = list_of_lists[i]
    test_classifier(excluded_list)

Resulting IDs of lists per iteration:

Remaining    | Excluded
-------------+---------
[2, 3, 4, 5] | 1
[1, 3, 4, 5] | 2
[1, 2, 4, 5] | 3
[1, 2, 3, 5] | 4
[1, 2, 3, 4] | 5
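
The loop above passes a list of folds to the training function. If the classifier instead expects one flat list of reviews (an assumption, since the question does not show train_classifier), the remaining folds can be flattened first. The sketch below wraps the same slicing in a generator; leave_one_fold_out is a hypothetical helper name, and the sample data is a shortened version of the question's.

```python
from itertools import chain


def leave_one_fold_out(folds):
    """Yield (train, test) pairs, holding out one fold per iteration."""
    for i, test_fold in enumerate(folds):
        remaining = folds[:i] + folds[i + 1:]
        # Flatten the remaining folds into a single list of reviews.
        train = list(chain.from_iterable(remaining))
        yield train, test_fold


# Five single-review folds, mirroring list1..list5 from the question.
list_of_lists = [[{"ID": n, "sentiment": "positive"}] for n in range(1, 6)]

splits = list(leave_one_fold_out(list_of_lists))
for train, test in splits:
    print([review["ID"] for review in train], "|", [review["ID"] for review in test])
```

Each iteration would then call train_classifier(train) and test_classifier(test) in place of the print.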
Henry Woody

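You can do this as follows:

for list_ in list_of_lists:
  # Shallow copy, so list_of_lists itself is left unchanged.
  newlist = list_of_lists.copy()
  # Drop the held-out fold; remove() deletes the first element equal to list_.
  newlist.remove(list_)
  train_classifier(newlist)
  test_classifier(list_)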
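
Because remove() finds the fold by comparing it for equality against each element, a possible variation is to pop the fold by position instead, which also returns the held-out fold in the same step. This is only a sketch; split_fold is a hypothetical helper, not something from the question.

```python
def split_fold(folds, index):
    """Return (remaining_folds, held_out_fold) without modifying folds."""
    remaining = folds.copy()          # shallow copy of the outer list
    held_out = remaining.pop(index)   # removes by position and returns the fold
    return remaining, held_out


folds = [["review a"], ["review b"], ["review c"]]

for i in range(len(folds)):
    remaining, held_out = split_fold(folds, i)
    print(remaining, "|", held_out)
```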
jottbe