Questions tagged [resampling]

Resampling is any of a variety of methods for estimating the precision of sample statistics by jackknifing or bootstrapping. It is also used to validate models by using random subsets (bootstrapping, cross-validation).

From Wikipedia:

In statistics, resampling is any of a variety of methods for doing one of the following:

Estimating the precision of sample statistics (medians, variances, percentiles) by using subsets of available data (jackknifing) or drawing randomly with replacement from a set of data points (bootstrapping)

Exchanging labels on data points when performing significance tests (permutation tests, also called exact tests, randomization tests, or re-randomization tests)

Validating models by using random subsets (bootstrapping, cross validation)

Common resampling techniques include bootstrapping, jackknifing and permutation tests.
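
As a rough illustration of the bootstrapping idea described above (drawing randomly with replacement to estimate the precision of a sample statistic), here is a minimal sketch using only the Python standard library. The function name `bootstrap_std_error`, its parameters, and the sample values are made up for this example.

```python
import random
import statistics


def bootstrap_std_error(data, statistic=statistics.median, n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by bootstrapping (illustrative helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw n points with replacement from the original sample.
        resample = rng.choices(data, k=n)
        estimates.append(statistic(resample))
    # The spread of the resampled statistics approximates the standard error.
    return statistics.stdev(estimates)


# Hypothetical sample, for illustration only.
sample = [17, 33, 52, 60, 55, 41, 48, 39, 62, 58]
se = bootstrap_std_error(sample)
print(f"bootstrap standard error of the median: {se:.2f}")
```

More resamples give a more stable estimate at the cost of run time; 2000 is an arbitrary choice here.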

983 questions
219
votes
7 answers

Pandas every nth row

Dataframe.resample() works only with timeseries data. I cannot find a way of getting every nth row from non-timeseries data. What is the best method?
mikael
  • 2,097
  • 3
  • 18
  • 24
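
For the question above, one commonly used approach is positional slicing with a step, which does not depend on the index type. A minimal sketch, with a made-up frame and step size:

```python
import pandas as pd

df = pd.DataFrame({"value": range(10)})  # made-up data for illustration
n = 3  # hypothetical step

# Positional slice with a step: keeps the rows at positions 0, n, 2n, ...
every_nth = df.iloc[::n]
print(every_nth)
```

The original index labels are kept; `reset_index(drop=True)` can be chained if a fresh 0..k index is wanted.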
56
votes
2 answers

Pandas Resampling error: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex

When using pandas' resample function on a DataFrame in order to convert tick data to OHLCV, a resampling error is encountered. How should we solve the error? # Resample data into 30min bins bars = data.Price.resample('30min', how='ohlc') volumes =…
Nyxynyx
  • 61,411
  • 155
  • 482
  • 830
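
The error in the title usually means the frame is indexed by something other than timestamps. A hedged sketch of one way around it, assuming the times live in a column (named `Timestamp` here, which is an assumption, as are the `Volume` column and the data). Note that recent pandas versions no longer accept the `how=` argument; the aggregation is chained as a method instead.

```python
import pandas as pd

# Made-up tick data whose timestamps are stored as strings in a regular column.
data = pd.DataFrame({
    "Timestamp": ["2013-11-01 09:00:05", "2013-11-01 09:10:00",
                  "2013-11-01 09:29:59", "2013-11-01 09:31:00"],
    "Price": [10.0, 12.0, 11.0, 13.0],
    "Volume": [100, 200, 50, 75],
})

# resample() needs a DatetimeIndex, so parse the strings and make them the index.
data["Timestamp"] = pd.to_datetime(data["Timestamp"])
data = data.set_index("Timestamp")

# Aggregate into 30-minute bars.
bars = data["Price"].resample("30min").ohlc()
volumes = data["Volume"].resample("30min").sum()
print(bars)
print(volumes)
```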
51
votes
8 answers

Percentiles of Live Data Capture

I am looking for an algorithm that determines percentiles for live data capture. For example, consider the development of a server application. The server might have response times as follows: 17 ms 33 ms 52 ms 60 ms 55 ms etc. It is useful to…
Jason Kresowaty
  • 16,105
  • 9
  • 57
  • 84
47
votes
5 answers

Pandas OHLC aggregation on OHLC data

I understand that OHLC re-sampling of time series data in Pandas, using one column of data, will work perfectly, for example on the following dataframe: >>df ctime openbid 1443654000 1.11700 1443654060 1.11700 ... df['ctime'] =…
user3439187
  • 613
  • 1
  • 7
  • 10
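
When the input already has open/high/low/close columns, one possible approach is to aggregate each column with its own rule instead of calling `ohlc()` on a single column. A minimal sketch, where the column names and the data are assumptions:

```python
import pandas as pd

# Made-up 1-minute bars; column names are assumed for this example.
index = pd.date_range("2015-10-01 00:00", periods=4, freq="1min")
bars_1min = pd.DataFrame({
    "open": [1.0, 2.0, 3.0, 4.0],
    "high": [1.5, 2.5, 3.5, 4.5],
    "low": [0.5, 1.5, 2.5, 3.5],
    "close": [1.2, 2.2, 3.2, 4.2],
}, index=index)

# Each output bar takes the first open, highest high, lowest low and last close.
rules = {"open": "first", "high": "max", "low": "min", "close": "last"}
bars_2min = bars_1min.resample("2min").agg(rules)
print(bars_2min)
```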
30
votes
5 answers

Resample a numpy array

It's easy to resample an array like a = numpy.array([1,2,3,4,5,6,7,8,9,10]) with an integer resampling factor. For instance, with a factor 2 : b = a[::2] # [1 3 5 7 9] But with a non-integer resampling factor, it doesn't work so easily : c =…
Basj
  • 41,386
  • 99
  • 383
  • 673
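
For a non-integer factor, one option (not necessarily what the answers recommend) is linear interpolation at fractional positions with `numpy.interp`. The helper `resample_linear` below is a hypothetical name for this sketch:

```python
import numpy as np


def resample_linear(a, factor):
    """Sample `a` at positions 0, factor, 2*factor, ... using linear interpolation."""
    a = np.asarray(a, dtype=float)
    # Number of sample positions that fit inside [0, len(a) - 1].
    n_out = int(np.floor((len(a) - 1) / factor)) + 1
    positions = np.arange(n_out) * factor
    return np.interp(positions, np.arange(len(a)), a)


a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
c = resample_linear(a, 1.5)
print(c)
```

Linear interpolation applies no anti-aliasing filter, so for signals such as audio a dedicated resampling routine may be more appropriate.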
26
votes
3 answers

Downsample a 1D numpy array

I have a 1-d numpy array which I would like to downsample. Any of the following methods are acceptable if the downsampling raster doesn't perfectly fit the data: overlap downsample intervals convert whatever number of values remains at the end to a…
TheChymera
  • 17,004
  • 14
  • 56
  • 86
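
One way to downsample when the factor does not divide the length evenly is to pad with NaN and average each block while ignoring the padding, so the final output value is the mean of whatever is left over. This is a sketch of that single strategy; `downsample_mean` is a made-up helper name.

```python
import numpy as np


def downsample_mean(a, factor):
    """Average consecutive blocks of `factor` values; the last block may be shorter."""
    a = np.asarray(a, dtype=float)
    pad = (-len(a)) % factor  # values needed to complete the final block
    padded = np.concatenate([a, np.full(pad, np.nan)])
    # Each row is one block; nanmean ignores the NaN padding.
    return np.nanmean(padded.reshape(-1, factor), axis=1)


result = downsample_mean(np.arange(1, 11), 3)
print(result)
```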
23
votes
4 answers

How can I divide single values of a dataframe by monthly averages?

I have the following 15-minute data as a dataframe for 3 years, with the first two columns being the index. 2014-01-01 00:15:00 1269.6 2014-01-01 00:30:00 1161.6 2014-01-01 00:45:00 1466.4 2014-01-01 01:00:00 1365.6 …
Markus W
  • 1,451
  • 5
  • 19
  • 32
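
A possible approach to dividing each value by its month's average is to group by calendar month and broadcast the mean back with `transform`. A minimal sketch on made-up data, assuming the series has a DatetimeIndex:

```python
import pandas as pd

# Made-up values with a DatetimeIndex; the real data is 15-minute readings.
index = pd.to_datetime([
    "2014-01-01 00:15", "2014-01-01 00:30",
    "2014-02-01 00:15", "2014-02-01 00:30",
])
s = pd.Series([10.0, 30.0, 5.0, 15.0], index=index)

# transform("mean") returns the monthly mean aligned to every original row.
monthly_mean = s.groupby(s.index.to_period("M")).transform("mean")
ratio = s / monthly_mean
print(ratio)
```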
22
votes
1 answer

Resampling irregularly spaced data to a regular grid in Python

I need to resample 2D-data to a regular grid. This is what my code looks like: import matplotlib.mlab as ml import numpy as np y = np.zeros((512,115)) x = np.zeros((512,115)) # Just random data for this test: data = np.random.randn(512,115) #…
Dzz
  • 543
  • 2
  • 8
  • 18
22
votes
3 answers

Pandas' equivalent of resample for integer index

I'm looking for a pandas equivalent of the resample method for a dataframe whose index isn't a DatetimeIndex but an array of integers, or maybe even floats. I know that for some cases (this one, for example) the resample method can be substituted easily…
TomCho
  • 3,204
  • 6
  • 32
  • 83
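
For a numeric index, one workaround is to compute a bin label per row with integer division and group on it. The sketch below uses an assumed bin width and made-up data; unlike `resample`, it does not emit rows for empty bins.

```python
import pandas as pd

# Made-up frame with an integer index.
df = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]},
    index=[0, 1, 2, 10, 11, 25],
)
bin_width = 10  # hypothetical bin size

# Label each row with the left edge of its bin, then aggregate per bin.
bin_start = (df.index // bin_width) * bin_width
binned = df.groupby(bin_start)["value"].mean()
print(binned)
```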
22
votes
6 answers

How do you do bicubic (or other non-linear) interpolation of re-sampled audio data?

I'm writing some code that plays back WAV files at different speeds, so that the wave is either slower and lower-pitched, or faster and higher-pitched. I'm currently using simple linear interpolation, like so: int newlength =…
MusiGenesis
  • 74,184
  • 40
  • 190
  • 334
20
votes
13 answers

Module PIL has not attribute "Resampling"

I have run the same code (with the packages I needed) before and it worked; I'm not sure what's happening now. It shows the error AttributeError: module 'PIL.Image' has no attribute 'Resampling'. It's probably a small issue, but I can't figure it out, I am…
ash1
  • 393
  • 1
  • 2
  • 10
20
votes
4 answers

logarithmically spaced integers

Say I have a 10,000-point vector from which I want to take a slice of only 100 logarithmically spaced points. I want a function to give me integer values for the indices. Here's a simple solution that simply uses around + logspace, then getting rid of…
andy
  • 303
  • 1
  • 2
  • 7
19
votes
3 answers

Where can I find a good read about bicubic interpolation and Lanczos resampling?

I want to implement the two above mentioned image resampling algorithms (bicubic and Lanczos) in C++. I know that there are dozens of existing implementations out there, but I still want to make my own. I want to make it partly because I want to…
Vilx-
  • 104,512
  • 87
  • 279
  • 422
18
votes
2 answers

Resampling time series or dataframes with Javascript / Node.js

I need to resample time series in Node.js, so I would like to know whether there is a tool in JavaScript that works similarly to pandas in Python. Let's say I have data which looks similar to this example: [{ "time": "28-09-2018 21:29:04", …
sunwarr10r
  • 4,420
  • 8
  • 54
  • 109
15
votes
1 answer

Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'

I'm trying to resample this Timestamp column of this Dataframe: Transit.head(): Timestamp Plate Gate 0 2013-11-01 21:02:17 4f5716dcd615f21f658229a8570483a8 65 1 2013-11-01 16:12:39…
Dimi
  • 531
  • 3
  • 8
  • 20