
For example, I have this 2D array:

[
    [
     0.0,
     0.24320757858085434,
     0.14893361727523413,
     0.29786723455046826,
     0.18838778030301612,
     0.12160378929042717
    ],
    [
     0.23717478210768014,
     0.0,
     0.16770789675478251,
     0.20539938644228997,
     0.25981195646349819,
     0.1299059782317491
    ],
    [
     0.21681956134183847,
     0.250361664212574,
     0.0,
     0.23178986094050727,
     0.16390018248131957,
     0.13712873102376066
    ],
    [
     0.2933749527592357,
     0.20744741852633861,
     0.15681550844086434,
     0.0,
     0.18554661183269694,
     0.15681550844086434
    ],
    [
     0.20305810393286577,
     0.28716752453162431,
     0.12135042758887897,
     0.20305810393286577,
     0.0,
     0.18536584001376513
    ],
    [
     0.17877693623386351,
     0.19584032147389943,
     0.13848001934394774,
     0.23407395508684939,
     0.25282876786143976,
     0.0
    ]
]

which gives sets of probabilities. How can I find the best (highest) probability in each row? Also, is there a way to find, for example, the 2nd or 3rd best probability without changing the elements' positions?

jpp
btloseltwin
  • Possible duplicate of [How to get indices of a sorted array in Python](https://stackoverflow.com/questions/6422700/how-to-get-indices-of-a-sorted-array-in-python) – wwii Apr 15 '18 at 13:53
  • This one is better: [How to find k biggest numbers from a list of n numbers assuming n > k](https://stackoverflow.com/q/17906949/2823755) - the accepted answer (`heapq` solution) looks promising. – wwii Apr 15 '18 at 14:15
  • Did one of the below solutions help? If so, please consider accepting (green tick on left), so other users know. – jpp Apr 17 '18 at 09:31

3 Answers


You can do this easily with the third-party library NumPy. First create a NumPy array:

import numpy as np

A = np.array([[0.0, 0.24320757858085434, 0.14893361727523413, 0.29786723455046826, 0.18838778030301612, 0.12160378929042717], [0.23717478210768014, 0.0, 0.16770789675478251, 0.20539938644228997, 0.25981195646349819, 0.1299059782317491], [0.21681956134183847, 0.250361664212574, 0.0, 0.23178986094050727, 0.16390018248131957, 0.13712873102376066], [0.2933749527592357, 0.20744741852633861, 0.15681550844086434, 0.0, 0.18554661183269694, 0.15681550844086434], [0.20305810393286577, 0.28716752453162431, 0.12135042758887897, 0.20305810393286577, 0.0, 0.18536584001376513], [0.17877693623386351, 0.19584032147389943, 0.13848001934394774, 0.23407395508684939, 0.25282876786143976, 0.0]])

To return the maximum of each row:

res = A.max(axis=1)

For the second largest in each row, you can use numpy.sort. It returns a sorted copy along the given axis (A itself is left unchanged), and indexing column -2 of the result extracts the 2nd largest value of each row.

res = np.sort(A, axis=1)[:, -2]

These are both vectorised calculations. You can perform the same calculations on lists of lists, but this is inadvisable, since plain Python loops are typically slower on large arrays.
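If the position of each value matters as well, `numpy.argsort` returns the indices that would sort each row without touching the array itself. The following is a minimal sketch under assumptions: the array `P` is small stand-in data (not the question's values) and `kth_largest_per_row` is a hypothetical helper name.

```python
import numpy as np

# Small stand-in array (illustrative values, not the question's data).
P = np.array([[0.0, 0.3, 0.1, 0.6],
              [0.5, 0.0, 0.2, 0.3]])

def kth_largest_per_row(arr, k):
    """Return (values, column indices) of the k-th largest entry per row; k=1 is the max."""
    order = np.argsort(arr, axis=1)   # indices that sort each row in ascending order
    cols = order[:, -k]               # column of the k-th largest value in each row
    rows = np.arange(arr.shape[0])    # row numbers to pair with those columns
    return arr[rows, cols], cols

values, cols = kth_largest_per_row(P, 2)
print(values)  # second-largest value of each row
print(cols)    # column where that value sits in the original array
```

When a row contains equal values, which of the tied columns is reported depends on how the sort breaks ties; passing `kind="stable"` to `np.argsort` makes that deterministic.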

jpp
  • Thank you! What do you suggest using instead of lists of lists? Tuples? – btloseltwin Apr 15 '18 at 13:47
  • A `numpy` array. I showed in the first code snippet how you can convert a list of lists into a `numpy` array. In this format, you can perform array-based calculations conveniently in a vectorised way. – jpp Apr 15 '18 at 13:48
  • [link](https://i.imgur.com/fAkrbgJ.jpg) Can you see why, for example, in the first row it gives me the 3rd largest instead of the second? – btloseltwin Apr 15 '18 at 14:55

@jpp's numpy solution is probably the way to go, for the reasons they gave, but if you want to do it in pure Python, you could do the following:

# Get the maximum value for each list

[[max(i)] for i in my_list]

# [[0.29786723455046826], [0.2598119564634982], [0.250361664212574], 
# [0.2933749527592357], [0.2871675245316243], [0.25282876786143976]]

# Get the maximum 2 values for each list:

[sorted(i)[-2:] for i in my_list]

# Get the maximum 3 values for each list:

[sorted(i)[-3:] for i in my_list]

And so on. Note that this does not reorder the original lists: sorted returns a new list, so the sorting happens on the copies created inside the list comprehension. Each slice comes back in ascending order, so the largest value is last.
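To keep track of where each of the top values sits in its row, one option is the standard library's `heapq.nlargest` applied to `enumerate(row)`. This is only a sketch: `rows` is small stand-in data and `top_k_with_positions` is a hypothetical helper name, neither of which appears in the answer above.

```python
import heapq

# Small stand-in data (illustrative values, not the question's data).
rows = [[0.0, 0.3, 0.1, 0.6],
        [0.5, 0.0, 0.2, 0.3]]

def top_k_with_positions(row, k):
    """Return the k largest entries of row as (index, value) pairs, largest first."""
    # enumerate pairs each value with its original position; the key compares values only.
    return heapq.nlargest(k, enumerate(row), key=lambda pair: pair[1])

top2 = [top_k_with_positions(row, 2) for row in rows]
print(top2)
```

Because the rows are only read, never sorted in place, the original positions stay as they were.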

sacuL

You can first sort each row in descending order, then select the first or second largest element, depending on your need.

a = [
    [
     0.0,
     0.24320757858085434,
     0.14893361727523413,
     0.29786723455046826,
     0.18838778030301612,
     0.12160378929042717
    ],
    [
     0.23717478210768014,
     0.0,
     0.16770789675478251,
     0.20539938644228997,
     0.25981195646349819,
     0.1299059782317491
    ],
    [
     0.21681956134183847,
     0.250361664212574,
     0.0,
     0.23178986094050727,
     0.16390018248131957,
     0.13712873102376066
    ]
]

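# Sort a copy of each row in descending order so the original rows keep their order.
sorted_rows = [sorted(row, reverse=True) for row in a]

print("1st Largests:")
for row in sorted_rows:
    print("\t" + str(row[0]))

print("2nd Largests:")
for row in sorted_rows:
    print("\t" + str(row[1]))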

PS: If you are worried about efficiency, look into partitioning (selection) rather than a full sort. The Lomuto and Hoare partition schemes are two well-known ones.
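As a rough illustration of the partitioning idea, here is a quickselect sketch built on the Lomuto scheme. The function names `lomuto_partition` and `kth_largest` are hypothetical, and the code works on a copy so the original row is left alone. It runs in linear time on average, but can degrade to quadratic time on unlucky inputs.

```python
def lomuto_partition(items, lo, hi):
    """Partition items[lo..hi] around items[hi]; return the pivot's final index."""
    pivot = items[hi]
    store = lo
    for i in range(lo, hi):
        if items[i] > pivot:  # move larger values to the front (descending order)
            items[i], items[store] = items[store], items[i]
            store += 1
    items[store], items[hi] = items[hi], items[store]
    return store

def kth_largest(row, k):
    """Return the k-th largest value of row (k=1 is the maximum) without modifying row."""
    if not 1 <= k <= len(row):
        raise ValueError("k must be between 1 and len(row)")
    items = list(row)  # work on a copy so the caller's row keeps its order
    lo, hi, target = 0, len(items) - 1, k - 1
    while True:
        p = lomuto_partition(items, lo, hi)
        if p == target:
            return items[p]
        if p < target:
            lo = p + 1  # the wanted element lies right of the pivot
        else:
            hi = p - 1  # the wanted element lies left of the pivot

row = [0.0, 0.3, 0.1, 0.6]
print(kth_largest(row, 2))
```

With NumPy available, `numpy.partition` offers a comparable selection step on whole arrays.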

eneski