7

I have a coordinate storage list in Python, A, whose entries are [row, col, value] triples storing the non-zero values.

How can I get the list of all the row indexes? I expected A[0:][0] to work, since print A[0:] prints the whole list, but print A[0:][0] only prints A[0].

The reason I ask is that I want to efficiently calculate the number of non-zero values in each row, i.e. by iterating over range(0,n), where n is the total number of rows. This should be much cheaper than my current approach of for i in range(0,n): for j in A: ....

Something like:

c = []
# for the total number of rows
for i in range(0,n):
     # keep row i if it has exactly one entry in the coordinate storage list
     if A[0:][0].count(i) == 1: c.append(i)                
return c

Over:

c = []
# for the total number of rows 
for i in range(0,n):
    # get the index and initialize the count to 0 
    c.append([i,0])
    # for every entry in coordinate storage list 
    for j in A:
        # if the entry's row index (j[0]) equals the current row i, increment the count
        if j[0] == i:
            c[i][1] += 1
return c

EDIT:

Using Junuxx's answer, this question and this post, I came up with the following (for returning the number of singleton rows), which is much faster for my current problem size of A than my original attempt. However, it still grows with the number of rows and columns. I wonder if it's possible to avoid iterating over A and only iterate up to n?

from collections import Counter

# get total list of row indexes from coordinate storage list
row_indexes = [i[0] for i in A]
# create dictionary {index: count}
c = Counter(row_indexes)
# return only the row indexes whose count == 1
return [row for row, count in c.items() if count == 1]
Chris Seymour
  • 1
    @larsman: I assume A is a list of triples. – Junuxx Oct 26 '12 at 09:54
  • 1
    Can you write a simple, inefficient, working example of what you are trying to do? I find the wording of the question really confusing, and none of your example code-blocks seem to do the same thing..? – dbr Oct 26 '12 at 11:40
  • I am calculating all rows that contain only 1 non-zero value from a coordinate storage list. Only the 2nd code block differs slightly, as it returns the count for every row. I have updated the code with comments. – Chris Seymour Oct 26 '12 at 12:26

2 Answers

16

This should do it:

c = [x[0] for x in A]

It's a list comprehension that takes the first (sub-)element of every element of A.
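
As a minimal sketch of why the slice in the question does not do this, assuming A is a list of [row, col, value] triples (the sample data below is made up for illustration): A[0:] is just a shallow copy of the whole list, so indexing it with [0] gives the first triple, not the first column.

```python
# Hypothetical sample data: each entry is [row, col, value].
A = [[0, 0, 5.0], [0, 1, 2.0], [1, 0, 3.0], [2, 2, 7.0]]

# A[0:] copies the outer list, so [0] selects the first triple.
first_triple = A[0:][0]

# The comprehension picks element 0 of every triple instead.
row_indexes = [entry[0] for entry in A]

# zip(*A) transposes the triples; its first item is the same column.
rows_via_zip = list(list(zip(*A))[0])

print(first_triple)
print(row_indexes)
print(rows_via_zip)
```

Both the comprehension and the zip variant still visit every entry of A once; they differ only in style.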

Junuxx
  • This performs much better than my original solution. Please see my edit: is it possible not to iterate over A, though? Much appreciated! – Chris Seymour Oct 26 '12 at 10:38
  • If A is very large but the elements of A have only three members, it might be more efficient to store three lists, `rows`, `columns` and `values`. You'll be able to get all row numbers instantly, and can still access a single entry by using the same index for all three lists (they are aligned). If both A and the sublists are long, it might be better to use a true two-dimensional data structure, such as the one provided by numpy (see Jon Clements' answer), rather than nested lists. – Junuxx Oct 26 '12 at 11:30
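
The three-list layout suggested in the comment above could look roughly like this. It is only a sketch: the sample data and the names `k`, `entry` and `singleton_rows` are made up for illustration, and `Counter` is borrowed from the question's edit.

```python
from collections import Counter

# Hypothetical sample data stored as three aligned lists.
rows = [0, 0, 1, 2]
columns = [0, 1, 0, 2]
values = [5.0, 2.0, 3.0, 7.0]

# Entry k is recovered by using the same index in all three lists.
k = 2
entry = (rows[k], columns[k], values[k])

# The row indexes already exist as a list, so no extraction pass is needed.
counts = Counter(rows)
singleton_rows = sorted(r for r, c in counts.items() if c == 1)

print(entry)
print(singleton_rows)
```

Counting still touches every stored entry once, but the separate pass that builds the list of row indexes disappears.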
4

For efficiency and extended slicing, you can use numpy, which seems like a good idea given your example:

import numpy as np
yourlist = [
    [0, 0, 0],
    [0, 1, 1],
    [1, 0, 2]
]
a = np.array(yourlist)
print(a[:,0])
# [0 0 1]
bc = np.bincount(a[:,0])
# array([2, 1])
count = bc[bc==1].size
# 1
# or... (I think it's probably better...)
count = np.count_nonzero(bc == 1)
Jon Clements
  • I can't get your example to work.. `type(mylist[0][0])` returns `int`, `type(a[0][0])` returns `numpy.float64` after `a = numpy.array(mylist)` when I try `bincount(a[:,0])` I get `TypeError: array cannot be safely cast to required type` I tried `bc = numpy.bincount(numpy.arange( a[:,0],dtype=numpy.int))` and the error is `TypeError: only length-1 arrays can be converted to Python scalars` – Chris Seymour Oct 26 '12 at 13:34
  • @sudo_o Not sure what to say about that - after `np.array` (not `np.arange`) I end up with `type(a[0][0])` and everything else just works... – Jon Clements Oct 26 '12 at 20:10
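
One possible explanation for the TypeError reported above, offered as an assumption rather than a diagnosis: if the value column holds floats, `np.array` promotes every element to float64, and `np.bincount` will not accept floats as input. A sketch with made-up data that casts only the row column to integers before counting:

```python
import numpy as np

# Hypothetical data: float values force the whole array to float64.
triples = [[0, 0, 0.5], [0, 1, 1.5], [1, 0, 2.5]]
a = np.array(triples)

# Cast only the row column to integers before counting.
row_column = a[:, 0].astype(np.int64)
bc = np.bincount(row_column)
singleton_count = np.count_nonzero(bc == 1)

print(a.dtype, row_column.dtype)
print(bc, singleton_count)
```

If the values must stay floats, keeping the indexes and the values in separate arrays avoids the promotion altogether.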