The first and the last number between None's in a python array

Question

I have a numpy array that looks like this

[None, None, None, None, None, 8, 7, 2, None, None , None , None, None, 
None, None, 169, 37, 9 ,7 ,23, None , None , 111, 24, 8 7 , 9, 12 , 74, None.......]

I need always first and the last value between the None's, the result should look like this

[8,2,169,23,111,74,...]

Does anyone know how I can get these numbers easily?

What should be the result for `[None, None,1, 2, 3, 4]`? `[4,5,6,7,None]`? `[8,None,9,None,10]`? — Eric, Dec 18 '16 at 23:28
@Eric This is a good question, I had done some backtesting and my result was that this [8,None,9,None,10] will never happen there will always at least two numbers next to each other. The result of your question are [None, None,1, 2, 3, 4] => [1,4] , [4,5,6,7,None] => [4,7] , [8,1,None,9,2,None,10,2] => [8,1,9,2,10,2] — user3371372, Dec 19 '16 at 21:53
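
For reference, the behaviour the asker confirms in the comments can be sketched with `itertools.groupby`. This is only an illustration of the requirement, not code from the thread: the name `first_and_last_of_runs` is made up here, and it assumes that runs touching the start or end of the list count as well.

```python
from itertools import groupby


def first_and_last_of_runs(values):
    """Return the first and last item of every run of non-None values."""
    result = []
    # groupby splits the input into alternating runs of None / not-None items.
    for is_gap, run in groupby(values, key=lambda v: v is None):
        if is_gap:
            continue
        run = list(run)
        # A run of length one would contribute the same value twice; the
        # asker states that runs always hold at least two numbers.
        result.extend([run[0], run[-1]])
    return result


print(first_and_last_of_runs([None, None, 1, 2, 3, 4]))
print(first_and_last_of_runs([8, 1, None, 9, 2, None, 10, 2]))
```

The two calls use inputs from the comment above, so the printed lists can be compared with the results the asker gives there.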

Mike Müller · Answer 1 · 2016-12-19T07:36:14.737

A NumPy array that contains None and integers will be of type object anyway. It seems easier to use a list in the first place:

res = []
for x1, x2 in zip(L[:-1], L[1:]):
    if (x1 is not None and x2 is None):
        res.append(x1)
    elif (x1 is None and x2 is not None): 
        res.append(x2)

Now res is:

[8, 2, 169, 23, 111, 74]

To avoid wrong results when the list does not start or end with None, limit the search to the part between the first and last None:

res = []
start = L.index(None)
end = len(L) - L[::-1].index(None)

for x1, x2 in zip(L[start:end-1], L[start+1:end]):
    if (x1 is not None and x2 is None):
        res.append(x1)
    elif (x1 is None and x2 is not None): 
        res.append(x2)

If you have a NumPy array with NaN instead of None:

import numpy as np

a = np.array([np.nan, np.nan, np.nan, np.nan, np.nan, 8, 7, 2, np.nan,
              np.nan , np.nan , np.nan, np.nan, np.nan, np.nan, 169, 37, 9,
              7 ,23, np.nan , np.nan , 111, 24, 8, 7 , 9, 12 , 74, np.nan])

You can do this in a vectorized way:

b = a[np.isnan(np.roll(a, 1)) | np.isnan(np.roll(a, -1))]
res = b[~np.isnan(b)]

Now res looks like this:

array([   8.,    2.,  169.,   23.,  111.,   74.])

Again, a version with limited search between first and last NaN:

indices = np.arange(len(a))[np.isnan(a)]
short = a[indices[0]:indices[-1]]
b = short[np.isnan(np.roll(short, 1)) | np.isnan(np.roll(short, -1))]
res = b[~np.isnan(b)]
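
A possible variant that avoids the wrap-around of `np.roll` by padding the NaN mask instead. This is a sketch, not part of the answer above: the name `edge_values` is hypothetical, it relies on `np.array(..., dtype=float)` turning `None` into NaN, and it assumes numbers at the very start or end should count as edge values (which is what the asker describes in the comments on the question).

```python
import numpy as np


def edge_values(values):
    """Numbers that sit next to a NaN/None, or at either end of the input."""
    # With dtype=float, None entries become NaN.
    a = np.array(values, dtype=float)
    is_nan = np.isnan(a)
    # Pad the mask with True so both ends behave as if a NaN sat beyond them.
    padded = np.concatenate(([True], is_nan, [True]))
    # padded[:-2] is the left neighbour, padded[2:] the right neighbour.
    near_gap = padded[:-2] | padded[2:]
    return a[~is_nan & near_gap]


print(edge_values([None, None, 8, 7, 2, None, 169, 37, 23, None]))
```

Because the padding is explicit, an input such as `[1, 2, None, 3, 4]` keeps all four numbers instead of depending on what happens to be at the opposite end of the array.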

Buggy with opening/trailing numbers e.g. [1, 2, None, 3, 4] -> [2, 3], whereas desired -> [] — innisfree, Dec 18 '16 at 23:51
I like the vectorized way, very nice. Easy enough to `map` the `None`s to `np.nan`s. — Matt Messersmith, Dec 18 '16 at 23:53
@MikeMüller Thanks for the code it's brilliant but had some problems plotting nan, this way i'm using none. — user3371372, Dec 19 '16 at 20:22
You can vectorize it with `None`s as well - see my answer. There is a performance penalty compared with using `nan`, but it ought to be faster than a pure-python solution — Eric, Dec 19 '16 at 21:23

alex314159 · Answer 2 · 2016-12-19T09:49:00.077

Using the pandas package (thanks to innisfree for pointing out the bug when the series doesn't start or end with None):

import numpy
import pandas
x=numpy.array([1,3,4,None, None, None, None, None, 8, 7, 2, None, None,7,8])
z = pandas.Series(numpy.append(numpy.insert(x,0,None),None))
res = z[z.shift(1).isnull() | z.shift(-1).isnull()].dropna()

edited Dec 19 '16 at 09:49

answered Dec 18 '16 at 23:16

alex314159

Buggy with opening/trailing numbers e.g. [1, 2, None, 3, 4] -> [2, 3], whereas desired -> [] – innisfree Dec 18 '16 at 23:54
@innisfree: Nope, desired is `[1, 2, 3, 4]` - see the comments on the question – Eric Dec 19 '16 at 21:24

score 1 · Answer 3 · edited May 23 '17 at 12:25

Stealing a lot from this answer you can do this:

Convert your Nones to nan:

x = [None, None, None, None, None, 8, 7, 2, None, None , None , None, None, 
None, None, 169, 37, 9 ,7 ,23, None , None , 111, 24, 8, 7 , 9, 12 , 74, None]

x = np.array(x, dtype=float)  # the np.float alias was removed in newer NumPy releases

and then:

x = np.vstack([x[s].take([0,-1]) for s in np.ma.clump_unmasked(np.ma.masked_invalid(x))]).flatten()

This divides your array into arrays that correspond to contiguous groups of non-nan values. Then it gets the first and last elements in these arrays using .take([0,-1]). Then it stacks these arrays into one array and flattens it.
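
To make the intermediate step visible, here is a small sketch of what `np.ma.clump_unmasked` returns for a shortened, made-up input (the short list is an assumption for illustration, not the asker's data):

```python
import numpy as np

# None becomes NaN when the list is converted with dtype=float.
x = np.array([None, 8, 7, 2, None, None, 169, 37, 23, None], dtype=float)

# masked_invalid masks the NaN entries; clump_unmasked then returns one
# slice per contiguous run of unmasked (valid) values.
runs = np.ma.clump_unmasked(np.ma.masked_invalid(x))
for s in runs:
    print(s.start, s.stop, x[s])  # runs cover indices 1..3 and 6..8

# Take the first and last element of each run and join them into one array.
ends = np.concatenate([x[s][[0, -1]] for s in runs])
print(ends)
```

Each slice can be used directly as an index into `x`, which is why `.take([0, -1])` in the one-liner picks the two ends of every run.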

print(repr(x))

array([   8.,    2.,  169.,   23.,  111.,   74.])

score 1 · Answer 4 · answered Dec 19 '16 at 00:42

a = [None, None, None, None, None, 8, 7, 2, None, None , None , None, None, 
None, None, 169, 37, 9 ,7 ,23, None , None , 111, 24, 8, 7 , 9, 12 , 74, None]

a.append(None)
[a[e] for e in range(len(a)-1) if a[e]!=None and (a[e-1]==None or a[e+1]==None)]

Output:

[8, 2, 169, 23, 111, 74]

answered Dec 19 '16 at 00:42

JCA

I know this is not numpy but this code is nice, compact, and beautiful. Thanks @JCA – user3371372 Dec 19 '16 at 20:34
This throws an `IndexError` if `a[0] is None` – Eric Dec 19 '16 at 21:06

Eric · Accepted Answer · 2016-12-20T01:22:50.017

A vectorized way with numpy:

arr = np.asarray(arr)
# find all the nones - need the np.array to work around backwards-compatible misbehavior
is_none = arr == np.array(None)

# find which values are on an edge. Start and end get a free pass for being not none
is_edge = ~is_none
is_edge[1:-1] &= (is_none[2:] | is_none[:-2])

# use the boolean mask as an index
res = arr[is_edge]

is_edge could be calculated more verbosely but perhaps more clearly as:

is_edge = ~is_none & (
    np.pad(is_none[1:], (0,1), mode='constant', constant_values=True)
    |
    np.pad(is_none[:-1], (1,0), mode='constant', constant_values=True)
)
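
As a quick check, the masking idea can be wrapped in a function and run on the edge cases from the comments on the question. This is a sketch under assumptions: the name `edges` is hypothetical, and the `None` mask is built with a plain generator (a swapped-in technique) so that the example does not depend on how `==` treats `None` in a given NumPy version.

```python
import numpy as np


def edges(values):
    """Non-None items that touch a None or sit at either end of the input."""
    arr = np.asarray(values, dtype=object)
    # Assumption: build the mask explicitly instead of comparing with ==.
    is_none = np.fromiter((v is None for v in arr), dtype=bool, count=len(arr))
    # Start and end get a free pass; interior items need a None neighbour.
    is_edge = ~is_none
    is_edge[1:-1] &= is_none[2:] | is_none[:-2]
    return arr[is_edge].tolist()


print(edges([None, None, 1, 2, 3, 4]))
print(edges([4, 5, 6, 7, None]))
print(edges([8, 1, None, 9, 2, None, 10, 2]))
```

The three inputs are the ones discussed under the question, so the output can be compared directly with what the asker expects there.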

edited Dec 20 '16 at 01:22

answered Dec 19 '16 at 21:14

Eric

I'm getting an error if I use the boolean array as a mask (only integer arrays with one element can be converted to an index). This is how I fixed it: newArr = np.where(~is_edge, 0, arr), res = newArr[newArr != 0] – user3371372 Dec 19 '16 at 23:21
@user3371372: see my update. You need an `arr = np.asarray(arr)` first – Eric Dec 20 '16 at 01:23

Rockybilly · Answer 6 · 2016-12-20T03:36:33.847

Here is how I would do it with regular Python lists. I don't have a specific answer for numpy arrays.

result = []
for ind, n in enumerate(lst):
    if n is not None and (lst[ind-1] is None or lst[ind+1] is None):
        result.append(n)

Note: This may give an incorrect result, or raise an IndexError, if the first element (index 0) or the last element (index len - 1) is a number.

edited Dec 20 '16 at 03:36

answered Dec 18 '16 at 22:59

Rockybilly

I think it may be because this raises an `IndexError` if the last item in the list is not `None`. – Andrew Guy Dec 19 '16 at 00:39
You could add `if n and (0 < ind < len(lst)-1) and ...` which should fix it. Otherwise quite a simple, elegant answer. – Andrew Guy Dec 19 '16 at 00:46
This is the same as the accepted answer, but better because it uses `is` – Eric Dec 19 '16 at 21:26
@AndrewGuy The note I left actually meant for that issue, it is not hard to overcome, so I did not bother implementing it. – Rockybilly Dec 20 '16 at 03:36

innisfree · Answer 7 · 2016-12-18T23:48:23.030

You could use list comprehension:

ex = [None, None, None, None, None, 8, 7, 2, 
      None, None , None , None, None, None, None, 169, 37, 9 ,7 ,23,
      None , None , 111, 24, 87 , 9, 12 , 74, 
      None]

filt = [y for x, y, z in zip(ex, ex[1:], ex[2:]) 
        if y is not None and (x is None or z is None)]

# [8, 2, 169, 23, 111, 74]

This requires no external dependencies; however, making two extra copies for my zipped iterator could be costly if the list is particularly big. There are probably ways to overcome this with e.g. itertools.

Note that if your original list has leading or trailing numbers, the above could fail. Strip them first, e.g.

while ex[0] is not None:
    del ex[0]

while ex[-1] is not None:
    del ex[-1]
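
The itertools idea mentioned above could look roughly like this. It is only a sketch: the name `edge_values_lazy` is made up, and unlike the stripping approach it pads the input with `None` on both sides, so leading or trailing numbers are kept as edge values (an assumption based on the asker's comments on the question).

```python
from itertools import chain, tee


def edge_values_lazy(iterable):
    """Yield non-None items that have a None neighbour, without list copies."""
    # Pad both ends so the first and last items also have two neighbours.
    padded = chain([None], iterable, [None])
    prev_it, cur_it, next_it = tee(padded, 3)
    # Shift the second iterator by one item and the third by two.
    next(cur_it, None)
    next(next_it, None)
    next(next_it, None)
    for prev, cur, nxt in zip(prev_it, cur_it, next_it):
        if cur is not None and (prev is None or nxt is None):
            yield cur


ex = [None, None, 8, 7, 2, None, None, 169, 37, 23, None]
print(list(edge_values_lazy(ex)))
```

`tee` only buffers the few items that separate the three iterators, so the input is walked once and no sliced copies of the list are created.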

score 0 · Answer 8 · answered Dec 18 '16 at 23:51

The beauty of zip is that you can do this "triple iteration" kind of thing (loop through the list considering three items at a time):

result = []
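# Pad with None on both sides so the first and last items are also considered.
for previous, item, following in zip(np.hstack((None, x)), x, np.hstack((x[1:], None))):
    if item is not None and None in (previous, following):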
        result.append(item)

The other answers are also reasonable, this was an attempt at readability/understandability.

answered Dec 18 '16 at 23:51

Matt Messersmith

Buggy with opening/trailing numbers e.g. [1, 2, None, 3, 4] -> [2, 3], whereas desired -> [] – innisfree Dec 18 '16 at 23:55
@innisfree The OP didn't make it particularly clear what to do in the case you're talking about. Thankfully the question is targeted on a specific piece of data that has leading/trailing `None`s – Matt Messersmith Dec 18 '16 at 23:59
It's clear - 'I need always first and the last value between the None's'. It doesn't say, I need any value that is adjacent to a None. – innisfree Dec 19 '16 at 00:03
The op has clarified their intent, and they want `[1, 2, 3, 4]` as the output for that case – Eric Dec 19 '16 at 21:26
Mixing `np` and `zip` is kinda strange – Eric Dec 19 '16 at 21:27

MSeifert · Answer 9 · 2017-04-04T20:06:40.420

It's not exactly numpy and it involves a function from another external package, iteration_utilities.split, but if you're operating on a list it will probably be relatively fast:

>>> lst = [None, None, None, None, None, 8, 7, 2, None, None , None , None, None, 
       None, None, 169, 37, 9 ,7 ,23, None , None , 111, 24, 8, 7 , 9, 12 , 74, None]

>>> from iteration_utilities import Iterable, is_None
>>> from operator import itemgetter
>>> Iterable(lst).split(is_None           # split by None
                ).filter(None             # remove empty lists
                ).map(itemgetter(0, -1)   # get the first and last element of each split part
                ).flatten(                # flatten the result
                ).as_list()               # and convert it to a list
[8, 2, 169, 23, 111, 74]

Note that you can also do (something like) this in pure Python as well:

def first_last_between_None(iterable):
    last = None
    for item in iterable:
        # If we're in the None-part just proceed until we get a not None
        if last is None:
            if item is None:
                continue
            else:
                # Not None, reset last and yield the current value
                last = item
                yield item
        else:
            # If the next item is None we're at the end of the number-part
            # yield the last item.
            if item is None:
                yield last
            last = item
    if last is not None:
        yield last

>>> list(first_last_between_None(lst))
[8, 2, 169, 23, 111, 74]

If you want to discard the first and last values when the list doesn't start or end with None, simply take the appropriate slice:

res = list(first_last_between_None(lst))
if lst[0] is not None:
    res = res[2:]
if lst[-1] is not None:
    res = res[:-2]

Buggy with opening/trailing numbers e.g. [1, 2, None, 3, 4], desired -> [] — innisfree, Dec 18 '16 at 23:57
@innisfree Sorry if I input `[1,2,None,3,4]` it results in `[1, 2, 3, 4]`. Why do you think it should be `[]`? — MSeifert, Dec 18 '16 at 23:58
None of the numbers in the list [1, 2, None, 3, 4] lie between two None's. — innisfree, Dec 18 '16 at 23:59
@innisfree Given that the OP had a list which started and "ended" with `None` it didn't seem so important. I've included a solution for this in the answer. Thanks for bringing it to my attention. — MSeifert, Dec 19 '16 at 00:11
@tom Why? It needs to discard two elements (first, last) of each block, because these aren't wrapped by Nones. — MSeifert, Dec 19 '16 at 13:37
@tom I don't need to know! You can simply run the code and check if it's returning wrong result. It would be easier for me to understand what isn't working if you can given me an explicit input where it fails. — MSeifert, Dec 19 '16 at 16:09
