I have a list of floats, and I want to know how many duplicates are in it.

I have tried with this:

p = t_gw.p(sma, m1, m2)       #p is a 1d numpy array
p_list = list(p)
dup = set([x for x in p_list if p_list.count(x) > 1])
print(dup)

I have also tried to use collections.Counter, but I always get the same error:

TypeError: unhashable type: 'numpy.ndarray'

I've looked at similar questions, but I can't understand what hashable means, why a list (or numpy array) is not hashable, and what type I should use instead.

Argentina
  • possible duplicate of http://stackoverflow.com/questions/9835762/find-and-list-duplicates-in-python-list – Ammar Nov 15 '14 at 10:42
  • You wrap one or more numpy arrays in a list, then filter them with a list comprehension, and then wrap the remaining numpy arrays in a set. A set can only hold hashable items, and numpy arrays aren't hashable. – NoDataDumpNoContribution Nov 15 '14 at 10:43
  • @unixer Not a duplicate, this one's numpy-specific. – simonzack Nov 15 '14 at 10:54

3 Answers

Your numpy array is two-dimensional, so list(p) does not do what you expect: it gives a list of small arrays, one per row, rather than a list of floats. Use list(p.flat) instead.
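A minimal sketch of what goes wrong, assuming p has shape (N, 1) as mentioned in the comments. The sample values are made up for illustration and stand in for the result of t_gw.p(...):

```python
import numpy as np

# Hypothetical stand-in for the asker's data: a column vector of shape (5, 1).
p = np.array([[1.5], [2.0], [1.5], [3.0], [2.0]])

rows = list(p)            # five arrays of shape (1,), not five floats
try:
    set(rows)             # a set needs hashable items; ndarrays are not hashable
    error = None
except TypeError as exc:
    error = str(exc)

values = list(p.flat)     # five NumPy float scalars, which are hashable
dup = set(x for x in values if values.count(x) > 1)

print(error)
print(sorted(dup))
```

An object is hashable when it has a hash value that never changes during its lifetime, which is what sets and dict keys rely on. Lists and numpy arrays are mutable, so they do not provide one; plain floats and NumPy float scalars do.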

Or (mis)use numpy's histogram function:

import numpy
cnt, bins = numpy.histogram(p, bins=sorted(set(p.flat)) + [float('inf')])
# bins has one more edge than cnt, so drop the trailing inf before masking
dup = bins[:-1][cnt > 1]
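As an alternative to the histogram trick (not part of the original answer), numpy.unique with return_counts=True gives the same information more directly. This is a sketch on made-up sample data; note that it compares floats for exact equality, so values that differ only by rounding error are treated as distinct:

```python
import numpy as np

# Hypothetical sample data with the same (N, 1) layout as in the question.
p = np.array([[1.5], [2.0], [1.5], [3.0], [2.0]])

# np.unique flattens its input, sorts the distinct values and counts each one.
values, counts = np.unique(p, return_counts=True)

dup = values[counts > 1]            # values that occur more than once
n_dup_values = dup.size             # how many distinct values are repeated
n_extra = int((counts - 1).sum())   # how many entries repeat an earlier one

print(dup, n_dup_values, n_extra)
```

Which of the two numbers counts as "how many duplicates" depends on the definition you need.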
Daniel
  • That was useful! I have only one thing to ask: if `p.shape = (1012, 1)` isn't p a 1d array? – Argentina Nov 15 '14 at 11:07
  • `shape` has two elements, therefore it is two-dimensional. `p.ravel().shape = (1012,)` is one-dimensional. – Daniel Nov 15 '14 at 11:21
  • Sorry, but I can't figure out how they differ: I can only think of both of them as arrays with one column (!) – Argentina Nov 15 '14 at 11:43
  • In the first case, to get an element you have to write `p[i, 0]`; with only one dimension, `p[i]` is enough. – Daniel Nov 15 '14 at 12:08
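The difference discussed in these comments can be checked on a small array. This is only an illustrative sketch; col and flat are hypothetical names, not variables from the question:

```python
import numpy as np

col = np.arange(4.0).reshape(4, 1)   # shape (4, 1): two dimensions, one column
flat = col.ravel()                   # same data, shape (4,): one dimension

print(col.ndim, col.shape)
print(flat.ndim, flat.shape)

# A single index on the 2-D array returns a whole row (a 1-element array),
# while a single index on the 1-D array returns a scalar.
print(type(col[2]), col[2].shape)
print(col[2, 0], flat[2])
```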

It depends on what you mean by the number of duplicates.

An easy way to do this is to count occurrences in a hash table (a dict):

h = {}                        # maps each value to its number of occurrences
arr = [6, 3, 1, 1, 6, 2, 1]
for i in arr:
    if i in h:
        h[i] += 1
    else:
        h[i] = 1

print(h)

Now, if by duplicates you mean the values that occur more than once in the list, you can count them with:

num = 0
for i in h:
    if h[i] > 1:
        num += 1

print(num)

I think it is fairly easy to adapt this to numpy.
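One possible adaptation to a numpy array, sketched with collections.Counter in place of the hand-written dict. The array p below is hypothetical sample data holding the same values as arr, in the (N, 1) layout from the question:

```python
from collections import Counter

import numpy as np

# Hypothetical data: the values of arr above, stored as a column vector.
p = np.array([[6.0], [3.0], [1.0], [1.0], [6.0], [2.0], [1.0]])

# ravel() flattens to 1-D; tolist() converts to plain Python floats,
# which are hashable and can therefore be used as Counter keys.
h = Counter(p.ravel().tolist())

# Number of distinct values that occur more than once.
num = sum(1 for count in h.values() if count > 1)

print(h)
print(num)
```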

Salvador Dali

You want to count something in a list? Why not use the count method of the list object?

number = my_list.count(my_float)
Ludovic Viaud