
I am trying to model and fit to noisy data over a long time series and I want to see what happens to my fit if I remove a substantial amount of my data.

I have a long time series of data and I am only interested in every nth item. However, I still want to plot this list against time, with all of the unwanted elements removed.

For example, for n=4, the list

a = [1,2,3,4,5,6,7,8,9,10...]

Should become

a_new = [1,0,0,0,5,0,0,0,9,0...]

I don't mind whether the nth item sits at the start or the end of each group of n; my series is effectively arbitrary and so long that it won't matter what I delete. For example, 'a_new' could also be:

a_new = [0,0,0,4,0,0,0,8,0,0...]

Ideally the solution wouldn't depend on the length of the list, but I can have that length as a variable.

Edit 1:

I actually wanted empty elements, not zeros (if that's possible?), so:

a_new = [1,,,,5,,,,9...]

Edit 2:

I needed to remove the corresponding elements from my time series too so that when everything is plotted, each data element has the same index as the time series element.

Thanks!

rh1990
  • As Moses suggested, list comprehension is the way to go if you're using `list`s. However, if you're doing analysis of time-series and data in general, `numpy.ndarray`s might be better suited for the job: http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html – Aleksander Lidtke Sep 13 '16 at 09:59
  • I think we can not create list like `a_new = [1,,,,5,,,,9] ` it gives error : `SyntaxError: invalid syntax` – Kalpesh Dusane Sep 13 '16 at 10:25

4 Answers


Use a list comprehension with a ternary conditional that takes each element modulo n:

>>> a = [1,2,3,4,5,6,7,8,9,10]
>>> n = 4
>>> [i if i % n == 0 else 0 for i in a]
[0, 0, 0, 4, 0, 0, 0, 8, 0, 0]

If the data are not consecutive integers starting at 1 (the likely case for real data), use enumerate so that the modulo is taken on the index rather than on the element:

>>> [v if i % n == 0 else 0 for i, v in enumerate(a)]
[1, 0, 0, 0, 5, 0, 0, 0, 9, 0]

The starting point can also be easily changed when using enumerate:

>>> [v if i % n == 0 else 0 for i, v in enumerate(a, 1)] # start indexing from 1
[0, 0, 0, 4, 0, 0, 0, 8, 0, 0]
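
The same comprehension can use a placeholder other than zero, which is closer to the "empty elements" asked for in Edit 1. The following is only a sketch: the helper name `keep_every_nth` and its `start` and `fill` parameters are made up for illustration, and the choice of NaN as the default placeholder is an assumption rather than something the question specifies.

```python
import math

def keep_every_nth(values, n, start=0, fill=float("nan")):
    """Keep the items at index start, start + n, ... and replace the rest.

    `fill` is the placeholder for unwanted items (NaN by default, an
    assumption made here; None would also work for plain lists).
    """
    return [v if (i - start) % n == 0 else fill
            for i, v in enumerate(values)]

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

masked = keep_every_nth(a, 4)                      # keeps indices 0, 4, 8
masked_none = keep_every_nth(a, 4, fill=None)      # None instead of NaN
shifted = keep_every_nth(a, 4, start=3, fill=None) # keeps indices 3, 7
```

The list keeps its original length, so it still lines up with the time list. Plotting libraries that skip NaN (matplotlib, for instance, leaves gaps there) will then draw only the retained points, but a fitting routine may still need the NaN entries filtered out first.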

If you intend to remove your unwanted data rather than replace them, then a filter using if (instead of the ternary operator) in the list comprehension can handle this:

>>> [v for i, v in enumerate(a, 1) if i % n == 0]
[4, 8]
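
When the unwanted items are removed outright, the time list has to be thinned in the same way so that each data point keeps its matching time value (the situation described in Edit 2). A minimal sketch using extended slicing, which gives the same result as the filtering comprehension; the time values in `t` are invented for illustration:

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
t = [10 * i for i in range(len(a))]  # hypothetical time stamps: 0, 10, ..., 90
n = 4

# Keep every nth item, starting from the first element.
a_kept = a[::n]
t_kept = t[::n]

# Keep every nth item, starting from the nth element instead.
a_kept_end = a[n - 1::n]
t_kept_end = t[n - 1::n]

print(list(zip(t_kept, a_kept)))
print(list(zip(t_kept_end, a_kept_end)))
```

Because both lists are sliced with the same start and step, `a_kept[j]` and `t_kept[j]` always refer to the same original index.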
Moses Koledoye
  • This solution only works if the numbers are incrementing in `1`s. Should instead use `val if i % n == 0 else 0 for i, val in enumerate(a)`. – SCB Sep 13 '16 at 09:58
  • That's great thanks a lot! On second thoughts, is it possible to simply have empty elements instead of zero's? Obviously now when I plot it, it looks messy. I've edited the original question to reflect this. – rh1990 Sep 13 '16 at 10:10
  • Or, using boolean operators: `[(1 - (v % n and 1)) and v for v in a]`. – Laurent LAPORTE Sep 13 '16 at 10:12
  • @RichardHall How do you mean empty elements, remove them? – Moses Koledoye Sep 13 '16 at 10:13
  • Because this is a time series, I want the remaining data points to keep their place when plotted against a simple time list t = [1,2,3,4,5...t]. But the elements that you reduced to zero are still plotted and will also affect any fitting algorithm... – rh1990 Sep 13 '16 at 10:16
  • @RichardHall Does the update do what you've just described? – Moses Koledoye Sep 13 '16 at 10:19
  • Thanks, It reduced my data series by the appropriate amount, but then my time series is now 'n' times too large. I guess I can just perform the same on the time series too. – rh1990 Sep 13 '16 at 10:26
  • Yes that works, everything is sorted, thank you so much! – rh1990 Sep 13 '16 at 10:27
[0 if i%4 else num for i, num in enumerate(a)]
coder.in.me

Here's a working example to filter functions given a certain step K:

def filter_f(data, K=4):
    if K <= 0:
        return data

    N = len(data)
    f_filter = [0 if i % K else 1 for i in range(N)]
    return [a * b for a, b in zip(data, f_filter)]

f_input = list(range(10))  # list() so that Python 3 prints the values, as in the output below

for K in range(10):
    print("Original function: {0}".format(f_input))
    print("Filtered function (step={0}): {1}".format(
        K, filter_f(f_input, K)))
    print("-" * 80)

Output:

Original function: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Filtered function (step=0): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
--------------------------------------------------------------------------------
Original function: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Filtered function (step=1): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
--------------------------------------------------------------------------------
Original function: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Filtered function (step=2): [0, 0, 2, 0, 4, 0, 6, 0, 8, 0]
--------------------------------------------------------------------------------
Original function: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Filtered function (step=3): [0, 0, 0, 3, 0, 0, 6, 0, 0, 9]
--------------------------------------------------------------------------------
Original function: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Filtered function (step=4): [0, 0, 0, 0, 4, 0, 0, 0, 8, 0]
--------------------------------------------------------------------------------
Original function: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Filtered function (step=5): [0, 0, 0, 0, 0, 5, 0, 0, 0, 0]
--------------------------------------------------------------------------------
Original function: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Filtered function (step=6): [0, 0, 0, 0, 0, 0, 6, 0, 0, 0]
--------------------------------------------------------------------------------
Original function: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Filtered function (step=7): [0, 0, 0, 0, 0, 0, 0, 7, 0, 0]
--------------------------------------------------------------------------------
Original function: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Filtered function (step=8): [0, 0, 0, 0, 0, 0, 0, 0, 8, 0]
--------------------------------------------------------------------------------
Original function: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Filtered function (step=9): [0, 0, 0, 0, 0, 0, 0, 0, 0, 9]
--------------------------------------------------------------------------------
BPL

Alternatively, irrespective of the programming language, you can use the function:

$f(i) = \frac{1 - (-1)^{\lfloor (i-1)/k \rfloor + \lfloor i/k \rfloor}}{2}$

This function produces 1 at every k-th element and 0 elsewhere. For k=4, this generates

f(i)=[0,0,0,1,0,0,0,1,0,0,0,1] for i=[1,2,3,4,5,6,7,8,9,10,11,12]

The function that you want would then be i*f(i) or, for data that are not simply the indices themselves, the i-th data value multiplied by f(i).
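
As a quick check, the formula can be written directly in Python with integer floor division. This is only a sketch: the function name `f` follows the formula above, while passing `k` as a second argument and using 1-based indices via `enumerate(a, 1)` are choices made here for illustration.

```python
def f(i, k):
    """Return 1 when the 1-based index i is a multiple of k, else 0."""
    exponent = (i - 1) // k + i // k        # floor((i-1)/k) + floor(i/k)
    return (1 - (-1) ** exponent) // 2      # exponent is odd only at multiples of k

k = 4
mask = [f(i, k) for i in range(1, 13)]

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Multiply each value by the mask evaluated at its 1-based position.
a_new = [v * f(i, k) for i, v in enumerate(a, 1)]

print(mask)
print(a_new)
```

The exponent increases by one each time i - 1 or i crosses a multiple of k, so it is odd exactly when i itself is a multiple of k, which is what makes the expression evaluate to 1 there.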

Suraj Rao