I use a function that takes a raw dataset and returns train data and test data. (It doesn't only split the data; it also does some slicing, shuffling, and processing.)

def create_dataset():
    ...
    ...

    train_data = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    train_data = train_data.cache().shuffle(buffer_size).batch(batch_size).repeat()
    test_data = tf.data.Dataset.from_tensor_slices((x_test, y_test))
    test_data = test_data.batch(batch_size).repeat()

    return train_data, test_data

My goal is to build a list of the (train, test) tuples returned from the function. What I tried looks roughly like this:

td = []
vd = []
for k in range(0,5):
    td[k],vd[k] = create_dataset()

    
datasets = [(td[0],vd[0]),(td[1],vd[1]),(td[2],vd[2]),(td[3],vd[3]),(td[4],vd[4])]

But it seems I cannot store the data like this. How would I create a list of tuples of my (train_data, test_data)? Thanks in advance.

RajatRaja
  • Please give more details: what is the error or problem you have? Can you include the error trace, if any? If your lists are empty, you have to use append(...) to add a value to the list, or initialize it at the right size first. You cannot assign to td[k] if the k-th element does not already exist. – Malo Jan 02 '22 at 13:46
  • You can use `zip` https://stackoverflow.com/questions/2407398/how-to-merge-lists-into-a-list-of-tuples – Digvijay S Jan 02 '22 at 13:49
  • Does this answer your question? [How to merge lists into a list of tuples?](https://stackoverflow.com/questions/2407398/how-to-merge-lists-into-a-list-of-tuples) – Digvijay S Jan 02 '22 at 13:49
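
To illustrate the `zip` suggestion from the comments: once two lists of equal length exist, `zip` pairs their elements by position. This is only a minimal sketch; the string values are hypothetical placeholders standing in for the real train and test datasets.

```python
# Hypothetical placeholder values standing in for the real train/test datasets.
td = ["train_0", "train_1", "train_2"]
vd = ["test_0", "test_1", "test_2"]

# zip pairs up elements by position; list() materializes the pairs as tuples.
datasets = list(zip(td, vd))
print(datasets)
```

Note that this only helps after `td` and `vd` have been filled; it does not fix the index assignment in the question's loop.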

2 Answers

I don't know if I'm missing something here, but this should work for your goal:

datasets = []
for _ in range(5):
    x, y = create_dataset()
    datasets.append((x, y))
Rabinzel

If your lists are empty, you have to use append(...) to add a value to the list, or initialize it at the right size first. You cannot read or assign td[k] and vd[k] if the k-th elements do not exist yet; Python raises an IndexError.
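
A small sketch of that behavior, using a plain string as a hypothetical stand-in for a dataset: assigning to an index of an empty list fails, while `append` grows the list so the index becomes valid.

```python
td = []

try:
    td[0] = "first"        # index 0 does not exist yet
except IndexError as err:
    message = str(err)
    print(message)         # list assignment index out of range

td.append("first")         # append grows the list by one element
td[0] = "replaced"         # index 0 exists now, so assignment works
print(td)
```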

One solution is to initialize both lists with five placeholder elements (empty tuples here), so that assigning to td[k] and vd[k] works:

td = [(), (), (), (), ()]
vd = [(), (), (), (), ()]
for k in range(0, 5):
    td[k], vd[k] = create_dataset()

datasets = [(td[0], vd[0]), (td[1], vd[1]), (td[2], vd[2]), (td[3], vd[3]), (td[4], vd[4])]

Another solution is to build the datasets list directly: start from an empty list and append the result of each create_dataset() call:

datasets = []
for k in range(0,5):
    datasets.append(create_dataset())
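
The same loop can also be written as a list comprehension. The sketch below is self-contained only because it defines a hypothetical stand-in for `create_dataset()` that returns plain lists; the real function from the question returns two `tf.data.Dataset` objects, and the pattern is the same.

```python
def create_dataset():
    # Hypothetical stand-in: the real function returns two tf.data.Dataset objects.
    train_data = [1, 2, 3]
    test_data = [4, 5]
    return train_data, test_data

# Each call returns a (train_data, test_data) tuple, which is stored as-is.
datasets = [create_dataset() for _ in range(5)]

# Each element unpacks back into its train and test parts.
for train_data, test_data in datasets:
    print(len(train_data), len(test_data))
```

Because the function already returns a tuple, no extra wrapping is needed before storing the result.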
Malo