0

I'm searching for a way to count unique digits efficiently with a one liner.

For example: given the integer 623562, return value would be 4.

My current way of doing it is, given integer i, im using len(set(str(i))). Creating a set is time consuming. I'm going over alot of numbers so I need an efficient way.

Also, if someone can find a way to go over all the numbers with x digits without using range() (and in a one liner..), I'll be glad. Memory limits me when using range because a list is created (I assume).

user1413824
  • 659
  • 1
  • 8
  • 15

3 Answers3

11

sets are optimized for this creation. Unless you want to roll out your own decimal-to-string conversion (and that would take more than one line), it's the way to go.

range only allocates memory in Python 2.x. For small numbers like 623562, the memory shouldn't be a problem. For larger numbers, use xrange in Python 2.x or simply switch to Python 3.x, where range generates the numbers just in time.

phihag
  • 278,196
  • 72
  • 453
  • 469
  • Thanks, im using python 2.7 so xrange does help, since range can be numbers with even 9 digits. however, in high ranges, python says that the int is too large, im starting to wonder if there is a way going over the number treating them as strings somehow. about set, dont forget my goal, im searching for any way to count the unique digits in a given number, if you have any way which does not include decimal-to-string convertion, it would still suffice. – user1413824 May 25 '12 at 10:22
  • You can use a drop-in to get [xrange for large numbers](http://stackoverflow.com/questions/1482480/xrange2100-overflowerror-long-int-too-large-to-convert-to-int). No offense, but your requirements (single line, Python, but speed at all costs) seem somewhat) confused. If performance is really an issue, you should think about writing an optimized C program that does exactly what you want. It will be more than a single line, but why is that critical? – phihag May 25 '12 at 10:55
  • It's a part of a project, i need it that way. I still have another list size problem: I need to sum for all the numbers in the range, the count of unique digits for each one, so im creating a list using list comprehensions and then sum it. Is there a way of summing these counts on the fly, without creating a list? What im currently doing given x (number of digits) is: sum([len(set(str(i))) for i in xrange(10**(x-1),10**x)]) – user1413824 May 25 '12 at 11:15
  • 1
    You can simply lose the brackets. That will make your [list comprehension](http://docs.python.org/tutorial/datastructures.html#list-comprehensions) a [generator expression](http://docs.python.org/reference/expressions.html#generator-expressions), which does not require memory upfront. Your project requirements are really strange. Why does your client insist on a one-liner? If it's homework or codegolf, you should [tag](http://stackoverflow.com/questions/tagged/homework) it as such. – phihag May 25 '12 at 12:30
2

I'm having a hard time believing that len(set(str(num))) isn't fast enough for you. Here is a test that does len(set(str())) of a random, very large number 100,000 times:

% python -m timeit -s 'import random' 'for i in range(100000): \
  len(set(str(random.randint(199123212312399956789, 1000000099999999123091230000000))))'
10 loops, best of 3: 456 msec per loop

And a decent chunk of that time is just generating the random numbers! If you really need to go faster than that, I'd think you should consider an alternative language.

Nolen Royalty
  • 18,415
  • 4
  • 40
  • 50
1

Here's a way that avoids creating a set each time. All but the last line are initialisation code so only happen once:

>>> from operator import or_
>>> from collections import Counter
>>> from functools import reduce
>>> bits = {str(i):2**i for i in range(10)}
>>> counts = [Counter(format(i,'b'))['1'] for i in range(2**10)]

>>> counts[reduce(or_, (bits[c] for c in str(623562)))]
4

However, it is about 3 times slower than the simple, clear, obvious len(set(str(i))). As usual in Python making things more complicated or trying to be excessively clever will come back and bite you on performance.

Duncan
  • 92,073
  • 11
  • 122
  • 156