
I would like to append a value to an array whenever it directly follows a 100 in a row. This works as long as the 100 is not in the last column, because the value in the next column of that row gets appended. But if the 100 is in the last column, an IndexError is raised, since there is no column at lastcol + 1.

col1 col2 col3
nan  100  60
100  95   98
nan  nan  100

Now:

values = [60,95,IndexError]; 

Ideal:

values = [60,95,100];

My code:

import numpy as np

values = []
x1, x2 = np.where(out == 100.0)  # out is the table shown above

# np.where returns a tuple of (row indices, column indices) locating
# the 100 values in the table, e.g. ((0, 1, 2), (1, 0, 2)) for the
# table in the example above.

for i, j in zip(x1, x2):
    # Attempt to append the value directly after each 100;
    # raises IndexError when the 100 is in the last column.
    values.append(out[i][j + 1])

EDIT

col1 col2 col3 col4
nan  100  60   50
100  95   98   70
nan  nan  100  80
nan  nan  nan  100
nan  nan  100  100

Desired: for each row, get the value that follows the 100 and append it to the 'values' array. Also note that any cells before the first 100 in a row are nan.

values = [60,95,80,100,100];

The values above occur in each row after 100 (order is important).

Black
  • Can you clarify the problem description a bit? – mshsayem Feb 24 '14 at 06:42
  • @mshsayem, my code tries to add a value to an array if it comes directly after a '100.0'. However, it can't do this when the 100.0 is in the last column, as there's no cell after it. Hence the IndexError. In that case I would like to add the last value itself, i.e. the 100, to my array. – Black Feb 24 '14 at 14:22

1 Answer


I'm not sure what you expect exactly, because I don't understand how you can get values = [60,95,80,100,100];

But I guess you were close to the solution.

Actual expected value

Rule: we want the first value after the first occurrence of 100 in each row; if that first 100 is in the last column, take the 100 itself.

In [1]: A
Out[1]: 
array([[  nan,  100.,   60.,   50.],
       [ 100.,   95.,   98.,   70.],
       [  nan,   nan,  100.,   80.],
       [  nan,   nan,   nan,  100.],
       [  nan,   nan,  100.,  100.],
       [ 100.,    4.,  100.,    5.]])


In [2]: B = np.where(A==100)

Since A is 2D:

  • B[0] holds the row indices of the 100s in A
  • B[1] holds the column indices of the 100s in A
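
To make the two index arrays concrete, here is a small sketch that rebuilds the array A shown above and unpacks the result of np.where. The names rows, cols and pairs are only illustrative and do not appear in the original answer.

```python
import numpy as np

nan = np.nan
A = np.array([[nan, 100., 60., 50.],
              [100., 95., 98., 70.],
              [nan, nan, 100., 80.],
              [nan, nan, nan, 100.],
              [nan, nan, 100., 100.],
              [100., 4., 100., 5.]])

# np.where on a 2D boolean array returns one index array per axis,
# listed in row-major order (row by row, left to right).
rows, cols = np.where(A == 100)

# Pair them up to get the (row, column) position of every 100.
pairs = list(zip(rows.tolist(), cols.tolist()))
print(pairs)
```

Note that rows 4 and 5 each show up twice, because np.where reports every 100 and not just the first one in each row.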

Then:

In [3]: value = []
In [4]: for j in np.unique(B[0]):  # each row that contains a 100, in ascending order
        # column of the first 100 in row j (list.index returns the first match)
        idx = A[j].tolist().index(100)
        if idx + 1 >= len(A[j]):  # the first 100 is in the last column ...
            value.append(A[j, idx])  # ... so take the 100 itself
        else:
            value.append(A[j, idx + 1])  # otherwise take the value after the 100

In [5]: value
Out[5]: [60.0, 95.0, 80.0, 100.0, 100.0, 4.0]
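
As an alternative to the loop, the same rule can probably be expressed without Python-level iteration. This is only a sketch, not the method used above: it relies on argmax returning the position of the first True in each row of a boolean mask, and the names mask, has_100, first and next_col are made up for illustration.

```python
import numpy as np

nan = np.nan
A = np.array([[nan, 100., 60., 50.],
              [100., 95., 98., 70.],
              [nan, nan, 100., 80.],
              [nan, nan, nan, 100.],
              [nan, nan, 100., 100.],
              [100., 4., 100., 5.]])

mask = A == 100
has_100 = mask.any(axis=1)      # rows that contain at least one 100
first = mask.argmax(axis=1)     # column of the first 100 (0 for rows without one)

# Step one column to the right, but never past the last column.
next_col = np.minimum(first + 1, A.shape[1] - 1)

# Pick one value per row, then drop rows that had no 100 at all.
values = A[np.arange(A.shape[0]), next_col][has_100]
print(values.tolist())
```

The has_100 filter matters because argmax returns 0 for a row with no 100, which would otherwise silently pick the value in column 1.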

Expected value 1

With a ternary operator

If your expected value is [60., 95., 80., 100., 100., 100.] (for every 100, take the value right after it, or the 100 itself if it is in the last column)

Then:

In [1]: A
Out[1]: 
array([[  nan,  100.,   60.,   50.],
       [ 100.,   95.,   98.,   70.],
       [  nan,   nan,  100.,   80.],
       [  nan,   nan,   nan,  100.],
       [  nan,   nan,  100.,  100.]])

In [2]: B = np.where(A==100)

In [3]: A[B[0], [b + 1 if b + 1 < len(A[0]) else b for b in B[1]]]
Out[3]: array([  60.,   95.,   80.,  100.,  100.,  100.])
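
The list comprehension clamps each column index so that it never runs past the last column. A sketch of the same idea using np.minimum, with the five-row A from this section, might look like the following; the names rows, cols and next_cols are illustrative only.

```python
import numpy as np

nan = np.nan
A = np.array([[nan, 100., 60., 50.],
              [100., 95., 98., 70.],
              [nan, nan, 100., 80.],
              [nan, nan, nan, 100.],
              [nan, nan, 100., 100.]])

rows, cols = np.where(A == 100)

# Move one column to the right of every 100, capped at the last column,
# so a 100 in the last column simply selects itself.
next_cols = np.minimum(cols + 1, A.shape[1] - 1)

values = A[rows, next_cols]
print(values.tolist())
```

This keeps one result per 100 rather than one per row, which is why the last row contributes two values.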

Expected value 2

If your expected value is [60., 95., 80., 100.] (for every 100, take the value right after it; if the 100 is in the last column, take nothing), then:

value = []
for i, j in zip(B[0], B[1]):
    if j + 1 < len(A[0]):  # skip any 100 that sits in the last column
        value.append(A[i, j + 1])

value is now [60.0, 95.0, 80.0, 100.0]
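
The same filtering can also be sketched with a boolean mask instead of an explicit loop. This assumes the five-row A used in this part of the answer; the names rows, cols and keep are introduced here for illustration.

```python
import numpy as np

nan = np.nan
A = np.array([[nan, 100., 60., 50.],
              [100., 95., 98., 70.],
              [nan, nan, 100., 80.],
              [nan, nan, nan, 100.],
              [nan, nan, 100., 100.]])

rows, cols = np.where(A == 100)

# Keep only the 100s that still have a column to their right.
keep = cols + 1 < A.shape[1]

values = A[rows[keep], cols[keep] + 1]
print(values.tolist())
```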

Tell me if you don't understand something.

jrjc
  • The desired result is the value after the first occurrence of 100. I think the first solution may be the most appropriate, but the issue is that numpy.where does not give only the first occurrence of 100.0 in each row; it gives every occurrence of 100.0. – Black Feb 24 '14 at 23:37
  • @Blackholify Ahh, OK, so the first value after the first occurrence of 100, or the last column if it's 100? In your example, // nan nan nan **100** // nan nan 100 **100** // you get the two bold **100** because the first is in the last column and the second comes after the first occurrence of 100 in the last line? – jrjc Feb 25 '14 at 08:34
  • Yes, exactly that. I can currently get values like the one in the last line, but the second-to-last is an issue because the index is out of bounds. Seems like the issue has been fixed. I've upvoted your code, but just for my learning, could you please explain the idea behind your code? – Black Feb 25 '14 at 13:35