A dataset is a collection of data, generally represented in tabular form, with columns signifying different variables and rows signifying different members of the set. If you are looking for a freely available dataset for any purpose, please consider asking your question on https://opendata.stackexchange.com.
Questions tagged [dataset]
11414 questions
575
votes
5 answers
A simple explanation of Naive Bayes Classification
I am finding it hard to understand the process of Naive Bayes, and I was wondering if someone could explain it with a simple step by step process in English. I understand it takes comparisons by times occurred as a probability, but I have no idea…

Aeonitis
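The counting-and-probability process this question alludes to can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the fruit samples are invented toy data, the `train` and `predict` names are hypothetical, every feature is treated as categorical, and add-one (Laplace) smoothing is just one common choice.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """Count how often each label and each (label, feature value) pair occurs."""
    label_counts = Counter(label for _, label in samples)
    # feature_counts[label][position][value] = number of times seen
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)  # distinct values observed at each feature position
    for features, label in samples:
        for position, value in enumerate(features):
            feature_counts[label][position][value] += 1
            values[position].add(value)
    return label_counts, feature_counts, values

def predict(model, features):
    """Return the label with the highest (log) posterior score."""
    label_counts, feature_counts, values = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior: how common the label is
        for position, value in enumerate(features):
            count = feature_counts[label][position][value]
            # Add-one smoothing keeps unseen values from zeroing the product.
            score += math.log((count + 1) / (n + len(values[position])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy data: (shape, colour) -> fruit
samples = [
    (("long", "yellow"), "banana"),
    (("long", "yellow"), "banana"),
    (("round", "orange"), "orange"),
    (("round", "orange"), "orange"),
    (("round", "yellow"), "orange"),
]
model = train(samples)
print(predict(model, ("long", "yellow")))
print(predict(model, ("round", "orange")))
```

The "naive" part is the assumption that features are independent given the label, which is why the per-feature probabilities are simply multiplied (added, in log space).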
216
votes
12 answers
Should I Dispose() DataSet and DataTable?
DataSet and DataTable both implement IDisposable, so, by conventional best practices, I should call their Dispose() methods.
However, from what I've read so far, DataSet and DataTable don't actually have any unmanaged resources, so Dispose() doesn't…

mbeckish
168
votes
28 answers
How to convert a Scikit-learn dataset to a Pandas dataset
How do I convert data from a Scikit-learn Bunch object to a Pandas DataFrame?
from sklearn.datasets import load_iris
import pandas as pd
data = load_iris()
print(type(data))
data1 = pd. # Is there a Pandas method to accomplish this?

SANBI samples
142
votes
7 answers
Datatable vs Dataset
I currently use a DataTable to get results from a database which I can use in my code.
However, many examples on the web show using a DataSet instead and accessing the table(s) through the collections method.
Is there any advantage, performance-wise…

GateKiller
141
votes
5 answers
Sample datasets in Pandas
When using R it's handy to load "practice" datasets using
data(iris)
or
data(mtcars)
Is there something similar for Pandas? I know I can load using any other method, just curious if there's anything builtin.

canyon289
117
votes
11 answers
Sort columns of a dataframe by column name
This is possibly a simple question, but I do not know how to order columns alphabetically.
test = data.frame(C = c(0, 2, 4, 7, 8), A = c(4, 2, 4, 7, 8), B = c(1, 3, 8, 3, 2))
# C A B
# 1 0 4 1
# 2 2 2 3
# 3 4 4 8
# 4 7 7 3
# 5 8 8 2
I like to…

John Clark
108
votes
3 answers
What is the difference between "LINQ to Entities", "LINQ to SQL" and "LINQ to Dataset"
I've been working for quite a while now with LINQ. However, it remains a bit of a mystery what the real differences are between the mentioned flavours of LINQ.
The successful answer will contain a short differentiation between them. What is the…

Marcel
99
votes
6 answers
How to delete the first row of a dataframe in R?
I have a dataset with 11 columns of over 1,000 rows each. The columns were labeled V1, V2, V11, etc.
I replaced the names with something more useful to me using the "c" command.
I didn't realize that row 1 also contained labels for each column…

akz
96
votes
4 answers
What do batch, repeat, and shuffle do with TensorFlow Dataset?
I'm currently learning TensorFlow, but I am confused by the code snippet below:
dataset = dataset.shuffle(buffer_size = 10 * batch_size)
dataset = dataset.repeat(num_epochs).batch(batch_size)
return…

blue
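As a rough aid for the question above, the three operations can be imitated with plain Python generators. This is only a sketch of the semantics as commonly described for `tf.data` (a bounded shuffle buffer, whole-dataset repetition, fixed-size grouping with a possibly smaller final batch); it does not use or reproduce TensorFlow, and the helper names `shuffle`, `repeat`, `batch`, and `source` are hypothetical.

```python
import random
from itertools import islice

def shuffle(make_iter, buffer_size, rng):
    """Buffer-based shuffle: pick randomly from a buffer of buffer_size items."""
    def factory():
        it = make_iter()
        buffer = list(islice(it, buffer_size))  # fill the buffer first
        for item in it:
            idx = rng.randrange(len(buffer))
            yield buffer[idx]       # emit a random buffered element
            buffer[idx] = item      # refill its slot with the next input
        rng.shuffle(buffer)         # input exhausted: drain what is left
        yield from buffer
    return factory

def repeat(make_iter, count):
    """Iterate over the whole upstream dataset count times (epochs)."""
    def factory():
        for _ in range(count):
            yield from make_iter()
    return factory

def batch(make_iter, batch_size):
    """Group consecutive elements; the last batch may be smaller."""
    def factory():
        it = make_iter()
        while True:
            chunk = list(islice(it, batch_size))
            if not chunk:
                return
            yield chunk
    return factory

# Same order as in the question: shuffle, then repeat, then batch.
source = lambda: iter(range(5))
rng = random.Random(0)
pipeline = batch(repeat(shuffle(source, buffer_size=3, rng=rng), count=2), batch_size=4)
batches = list(pipeline())
print(batches)
```

Because batching happens after repeating in this ordering, a batch can straddle an epoch boundary; with a buffer smaller than the dataset, the shuffle is only approximate.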
95
votes
2 answers
How to check if two data frames are equal
Say I have large datasets in R and I just want to know whether two of them are the same. I use this often when I'm experimenting with different algorithms to achieve the same result. For example, say we have the following datasets:
df1 <-…

Waldir Leoncio
91
votes
6 answers
Data Augmentation in PyTorch
I am a little bit confused about the data augmentation performed in PyTorch. Now, as far as I know, when we are performing data augmentation, we are KEEPING our original dataset, and then adding other versions of it (flipping, cropping, etc.). But…

Fawaz
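One way to picture the distinction this question raises is a dataset wrapper that applies a random transform each time an item is fetched, instead of storing extra copies. The sketch below is plain Python and does not use PyTorch; `AugmentedDataset` and `make_random_flip` are hypothetical names, and the "images" are just short lists. It assumes the on-the-fly style that, to my understanding, transform pipelines typically follow.

```python
import random

class AugmentedDataset:
    """Wraps a list of items and transforms each one at access time."""

    def __init__(self, items, transform):
        self.items = items
        self.transform = transform

    def __len__(self):
        # The length is that of the original data; no copies are added.
        return len(self.items)

    def __getitem__(self, index):
        # The transform runs on every access, so repeated reads of the
        # same index may return different variants.
        return self.transform(self.items[index])

def make_random_flip(seed):
    """Return a transform that reverses a row with probability 0.5."""
    rng = random.Random(seed)

    def flip(row):
        return row[::-1] if rng.random() < 0.5 else row

    return flip

images = [[1, 2, 3], [4, 5, 6]]
dataset = AugmentedDataset(images, make_random_flip(seed=0))
print(len(dataset))
print([dataset[0] for _ in range(4)])
```

Under this style the stored data stays unchanged, and the variety comes from visiting the same items across many epochs.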
91
votes
7 answers
How can I filter a DataTable?
I use a DataTable with information about users, and I want to search for a user or a list of users in this DataTable. I tried it, but it doesn't work :(
Here is my C# code:
public DataTable GetEntriesBySearch(string username,string location,DataTable table)
…

Tarasov
90
votes
4 answers
How to view a DataTable while debugging
I'm just getting started using ADO.NET and DataSets and DataTables. One problem I'm having is it seems pretty hard to tell what values are in the data table when trying to debug.
What are some of the easiest ways of quickly seeing what values have…

Eric Anastas
81
votes
3 answers
Pillow in Python won't let me open image ("exceeds limit")
Just having some problems running a simulation on some weather data in Python. The data was supplied in a .tif format, so I used the following code to try to open the image to extract the data into a numpy array.
from PIL import Image
im =…

Tom Heeley
76
votes
5 answers
Select method in List Collection
I have an ASP.NET application, and I currently use datasets for data manipulation. I recently started to convert this dataset to a List collection, but in some places it doesn't work. One is that in my old version I am using datarow[] drow = …

MAC