I'm in the process of implementing a K-nearest neighbour algorithm in Python (for those of you who don't know about machine learning, it's an algorithm used to classify objects based on data that is already classified, using Euclidean distance).

I've got my distances computed, and I can take the k nearest distances, and find the classes of those objects. My problem is, if K is greater than 1, say 3 or 5, I'm not sure how I can get the most frequent element in the list.

For example, my output is:

[10, 9, 7, 10]

10 occurs the most, so I'd like to return this number. In case of a tie (2 or more elements occurring with the same frequency), it should return an error (I can deal with this myself). I'd just like some opinions on how to return the most frequent element of the above list. (I'm using Python 2.6.6, so I can't use the collections imports.)

Second question:

I'm attempting to convert a numpy array to a normal array. My code looks like this:

def getClassesOfIndexes(l):
    tmp1 = []
    for i in l:
        tmp1.append(classes[i])
    return tmp1

print(getClassesOfIndexes([1024, 9128, 394, 39]))

This prints something like: [array([10], dtype=uint8), array([7], dtype=uint8), array([10], dtype=uint8), array([9], dtype=uint8)]

What could I do for it to simply return [10, 7, 10, 9]?

Thanks for any help.

Oliver W.

2 Answers

Question 2 is the easier one (though in the future, please post unrelated questions as two separate questions on SO). The tolist method converts a numpy array to a regular Python list: http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.tolist.html
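As a rough illustration of the tolist approach: the sketch below assumes `classes` is an (N, 1) uint8 array, which would explain why each `classes[i]` prints as a one-element array. The sample values and the snake_case function name are made up for the example. Calling tolist on each one-element array would give nested lists like [[10], [7], ...], so the sketch flattens with ravel first.

```python
import numpy as np

# Assumption: `classes` is an (N, 1) uint8 array, so classes[i] comes back
# as a one-element array such as array([10], dtype=uint8).
# These values are hypothetical sample data.
classes = np.array([[7], [10], [9]], dtype=np.uint8)

def get_classes_of_indexes(indexes):
    # Index all requested rows at once, flatten to 1-D, then convert
    # to a plain Python list of ints.
    return classes[indexes].ravel().tolist()

print(get_classes_of_indexes([1, 0, 1, 2]))  # prints [10, 7, 10, 9]
```

Indexing with the whole list also replaces the explicit loop and append.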

Question 1 is also pretty straightforward. You say you want "the most frequent element in the list". There's a complete discussion in the existing thread "Python most common element in a list". One solution is to build a dictionary that maps each element to its frequency, and then grab the key corresponding to the largest value in that dictionary. This might look like:

...
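# Map each distinct element to the number of times it occurs.
freq_map = {val: my_list.count(val) for val in set(my_list)}
# Return the element (key) with the largest count (value).
return max(freq_map, key=freq_map.get)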
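For the behaviour the question originally asked for (return the most frequent class, and signal an error on a tie), a minimal self-contained sketch might look like the following. The function name `most_frequent` and the choice of ValueError are assumptions made for illustration. It avoids dict comprehensions and the collections module, so it should also be usable on older Python 2 versions.

```python
def most_frequent(items):
    """Return the single most frequent element; raise ValueError on a tie."""
    counts = {}
    for item in items:
        # dict.get supplies 0 the first time an element is seen.
        counts[item] = counts.get(item, 0) + 1
    if not counts:
        raise ValueError("cannot pick the most frequent element of an empty list")
    best = max(counts.values())
    # Collect every element that reaches the highest count.
    winners = [item for item, n in counts.items() if n == best]
    if len(winners) > 1:
        raise ValueError("tie between: %r" % sorted(winners))
    return winners[0]

print(most_frequent([10, 9, 7, 10]))  # prints 10
```

Counting in a single pass also avoids calling `my_list.count` once per distinct element.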
en_Knight
  • Hi. Sorry about that, it said to wait 90 minutes to post again. I managed to get question 2 a few mins after posting. Question 1 I'm still unsure, I'd like to return the highest number in an array, and return multiple numbers in an array if there are numbers with the same frequency. –  Mar 20 '15 at 00:54
  • @Tazman You're saying two different things. The "frequency" of a number is how many times it occurs in the array. The highest number is the number with the biggest value. The code I posted and discussion I linked deal with frequency; highest number is just the 'max' function. For example, in the list [1,1,2], the number with the highest frequency is 1, the highest number is 2 – en_Knight Mar 20 '15 at 00:56
  • Sorry, it's late and I've been coding all day. I want the function to return from a list of arrays, the number which appears the most in an array. In case of a tie, return both (or more) numbers in an array. E.g. [10,9,10,3,2] = [10], and [10,9,9,10,2,1] = [10, 9] –  Mar 20 '15 at 01:07
  • The code I've provided, and the link I've provided, do exactly that. – en_Knight Mar 20 '15 at 01:38
Following up on the comments on en_Knight's good answer (upvoted for the reference to the existing threads, but beware: the dict comprehension does not work under Python 2.6.6!): if you want a list of the most common elements (that is, all those tied for the highest frequency), you could do the following:

>>> arr = [10,9,9,7,10]
>>> counter = {}
>>> for elm in arr:
...     try:
...         counter[elm] += 1
...     except KeyError:
...         counter[elm] = 1
... 
>>> counter
{9: 2, 10: 2, 7: 1}
>>> srt = sorted(counter.items(), key=lambda x: x[1], reverse=True)
>>> maxitem, maxcount = srt[0]
>>> most_frequents = [maxitem]
>>> for rec in srt[1:]:
...     if rec[1] == maxcount:
...         most_frequents.append(rec[0])
...     else:
...         break
... 
>>> most_frequents
[9, 10]

Tested under Python 2.6.6.
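The order of the tied elements in the session above follows the dictionary's iteration order, which differs between Python versions, so the result may come out as [10, 9] instead. If a stable result is wanted, the same idea could be wrapped in a function that sorts the winners. This is only a sketch; the name `most_frequent_all` is made up for the example.

```python
def most_frequent_all(items):
    """Return a sorted list of all elements tied for the highest frequency."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    if not counts:
        return []
    best = max(counts.values())
    # Sorting makes the result independent of dictionary ordering.
    return sorted(item for item, n in counts.items() if n == best)

print(most_frequent_all([10, 9, 10, 3, 2]))     # prints [10]
print(most_frequent_all([10, 9, 9, 10, 2, 1]))  # prints [9, 10]
```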

Oliver W.
  • @Tazman, glad to have helped. Don't forget to take a rest if you've been coding all day. ;-) – Oliver W. Mar 20 '15 at 01:20
  • Yep, I just get too carried away and can't do anything else until I've solved a problem. I'm that kind of guy, probably not healthy. Thanks again for the help. –  Mar 20 '15 at 02:12