Questions tagged [dataloader]

DataLoader is a generic utility to be used as part of your application's data fetching layer to provide a consistent API over various backends and reduce requests to those backends via batching and caching.

GitHub: dataloader
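
The batching and caching idea in the tag description can be illustrated with a small, self-contained sketch. This is not the API of the dataloader library itself: the class `BatchingLoader`, its methods `load`, `dispatch`, and `get`, and the backend function `fetch_users` are all hypothetical names chosen for illustration. It only shows how several key lookups might be collapsed into one backend call and then served from a cache.

```python
from typing import Callable, Dict, Hashable, List


class BatchingLoader:
    """Toy loader (hypothetical): queue keys, fetch them in one batch, cache results."""

    def __init__(self, batch_fn: Callable[[List[Hashable]], List[object]]):
        # batch_fn is an assumed backend call: it takes a list of keys and
        # returns one value per key, in the same order.
        self._batch_fn = batch_fn
        self._cache: Dict[Hashable, object] = {}
        self._queue: List[Hashable] = []

    def load(self, key: Hashable) -> None:
        """Register interest in a key; the fetch is deferred until dispatch()."""
        if key not in self._cache and key not in self._queue:
            self._queue.append(key)

    def dispatch(self) -> None:
        """Fetch all queued keys with a single backend call and cache them."""
        if not self._queue:
            return
        keys, self._queue = self._queue, []
        values = self._batch_fn(keys)
        if len(values) != len(keys):
            raise ValueError("batch_fn must return one value per key")
        self._cache.update(zip(keys, values))

    def get(self, key: Hashable) -> object:
        """Return the value for a key, fetching it only if it is not cached."""
        self.load(key)
        self.dispatch()
        return self._cache[key]


# Record every backend call so the effect of batching is visible.
calls: List[List[int]] = []


def fetch_users(ids: List[int]) -> List[dict]:
    """Stand-in for a database or service request (hypothetical)."""
    calls.append(list(ids))
    return [{"id": i, "name": f"user-{i}"} for i in ids]


loader = BatchingLoader(fetch_users)
for user_id in (1, 2, 1, 3):   # the duplicate key 1 is queued only once
    loader.load(user_id)
loader.dispatch()              # one backend call for keys [1, 2, 3]
first = loader.get(1)          # served from the cache, no new backend call
```

Here the four `load` calls result in a single call to `fetch_users`, and later lookups of the same keys never reach the backend again. A real implementation would typically schedule the dispatch automatically rather than require an explicit call.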

430 questions
31
votes
1 answer

How to use 'collate_fn' with dataloaders?

I am trying to train a pretrained roberta model using 3 inputs, 3 input_masks and a label as tensors of my training dataset. I do this using the following code: from torch.utils.data import TensorDataset, DataLoader, RandomSampler,…
Sam V
  • 479
  • 1
  • 4
  • 11
27
votes
1 answer

What do next() and iter() do in PyTorch's DataLoader()

I have the following code: import torch import numpy as np import pandas as pd from torch.utils.data import TensorDataset, DataLoader # Load dataset df = pd.read_csv(r'../iris.csv') # Extract features and target data =…
Leockl
  • 1,906
  • 5
  • 18
  • 51
18
votes
3 answers

Load data into GPU directly using PyTorch

In the training loop, I load a batch of data into CPU and then transfer it to GPU: import torch.utils as utils train_loader = utils.data.DataLoader(train_dataset, batch_size=128, shuffle=True, num_workers=4, pin_memory=True) for inputs, labels in…
Khiem Le
  • 183
  • 1
  • 1
  • 4
16
votes
2 answers

pytorch DataLoader extremely slow first epoch

When I create a PyTorch DataLoader and start iterating -- I get an extremely slow first epoch (x10--x30 slower than all next epochs). Moreover, this problem occurs only with the train dataset from the Google landmark recognition 2020 from Kaggle. I…
Slavka
  • 1,070
  • 4
  • 13
  • 28
16
votes
3 answers

How to get entire dataset from dataloader in PyTorch

How to load entire dataset from the DataLoader? I am getting only one batch of dataset. This is my code dataloader = torch.utils.data.DataLoader(dataset=dataset, batch_size=64) images, labels = next(iter(dataloader))
Aakanksha W.S
  • 161
  • 1
  • 1
  • 6
15
votes
1 answer

How does the __getitem__'s idx work within PyTorch's DataLoader?

I'm currently trying to use PyTorch's DataLoader to process data to feed into my deep learning model, but am facing some difficulty. The data that I need is of shape (minibatch_size=32, rows=100, columns=41). The __getitem__ code that I have within…
Sean
  • 2,890
  • 8
  • 36
  • 78
13
votes
1 answer

How do they know the mean and std, the input values of transforms.Normalize

The question is about the data loading tutorial from the PyTorch website. I don't know how they wrote the values of mean_pix and std_pix in transforms.Normalize without calculation. I'm unable to find any explanation relevant to this question…
haofeng
  • 592
  • 1
  • 5
  • 21
13
votes
1 answer

Should GraphQL DataLoader wrap request to database or wrap requests to service methods?

I have a very common GraphQL schema like this (pseudocode): Post { commentsPage(skip: Int, limit: Int) { total: Int items: [Comment] } } So to avoid the n+1 problem when requesting multiple Post objects I decided to use Facebook's…
WelcomeTo
  • 19,843
  • 53
  • 170
  • 286
13
votes
5 answers

How to show a progress Dialog before data loading in flutter?

In my Flutter project, I am doing API calls to fetch data by GET request. After parsing the JSON object from the response, I just show the value in the Text widget. While the data takes time to load, in the meantime my Text widgets show null. For…
S. M. Asif
  • 3,437
  • 10
  • 34
  • 58
11
votes
2 answers

load pytorch dataloader into GPU

Is there a way to load a pytorch DataLoader (torch.utils.data.DataLoader) entirely into my GPU? Now, I load every batch separately into my GPU. CTX = torch.device('cuda') train_loader = torch.utils.data.DataLoader( train_dataset, …
Jonas De Schouwer
  • 755
  • 1
  • 9
  • 15
11
votes
2 answers

Number of instances per class in pytorch dataset

I'm trying to make a simple image classifier using PyTorch. This is how I load the data into a dataset and dataLoader: batch_size = 64 validation_split = 0.2 data_dir = PROJECT_PATH+"/categorized_products" transform =…
Amin Bashiri
  • 198
  • 1
  • 2
  • 16
11
votes
6 answers

pytorch collate_fn reject sample and yield another

I have built a Dataset, where I'm doing various checks on the images I'm loading. I'm then passing this DataSet to a DataLoader. In my DataSet class I'm returning the sample as None if a picture fails my checks and I have a custom collate_fn…
Brian Formento
  • 731
  • 2
  • 9
  • 24
10
votes
2 answers

How to get the total number of batch iterations from pytorch dataloader?

I have a question: how do I get the total number of batch iterations from a pytorch dataloader? The following is common code for training: for i, batch in enumerate(dataloader): Then, is there any method to get the total number of iterations for the…
Hyunseung Kim
  • 493
  • 1
  • 6
  • 17
10
votes
1 answer

Best way to handle one-to-many with type-graphql typeorm and dataloader

I'm trying to figure out the best way to handle a one-to-many relationship using type-graphql and typeorm with a postgresql db (using apollo server with express). I have a user table which has a one-to-many relation with a courses table. The way I…
Joel Jacobsen
  • 313
  • 1
  • 5
  • 15
10
votes
1 answer

PyTorch: Speed up data loading

I am using densenet121 to do cat/dog detection from Kaggle dataset. I enabled cuda and it appears that training is very fast. However, the data loading (or perhaps processing) appears to be very slow. Are there some ways to speed it up? I tried to…
gruszczy
  • 40,948
  • 31
  • 128
  • 181