Short (and useful) python snippets

Question

In the spirit of the existing "what's your most useful C/C++ snippet" thread:

Do you have short, single-purpose Python snippets that you use often and would like to share with the Stack Overflow community? Please keep the entries small (under 25 lines, say) and give only one example per post.

I'll start off with a short snippet I use from time to time to count SLOC (source lines of code) in Python projects:

# prints recursive count of lines of python source code from current directory
# includes an ignore_list. also prints total sloc

import os
cur_path = os.getcwd()
ignore_set = set(["__init__.py", "count_sourcelines.py"])

loclist = []

for pydir, _, pyfiles in os.walk(cur_path):
    for pyfile in pyfiles:
        if pyfile.endswith(".py") and pyfile not in ignore_set:
            totalpath = os.path.join(pydir, pyfile)
            loclist.append( ( len(open(totalpath, "r").read().splitlines()),
                               totalpath.split(cur_path)[1]) )

for linenumbercount, filename in loclist: 
    print "%05d lines in %s" % (linenumbercount, filename)

print "\nTotal: %s lines (%s)" %(sum([x[0] for x in loclist]), cur_path)

The Python Cookbook (http://code.activestate.com/recipes/langs/python/) is a much better resource for this. Examples, commentary, comments, and available online and in book form. Also, your example is a maintenance horror and "%05d" % ln is better than "%s" % (str(len).zfill(5)). — Andrew Dalke, Mar 28 '09 at 02:00
Examples of "horror":1) m.split(curpath)[1] fails if cur_path is "/home/dalke" and m is "/home/dalke/subdir/home/dalke/whatever". 2) the list() isn't needed. 3) 'for b,zn in [(r,f) for ...]' can be reduced to 'for b,ignore,zn in os.walk(cur_path). Oh, and 4) newlines and indentation help readability — Andrew Dalke, Mar 28 '09 at 02:08
also, suggest using a set for the ignore list. this isn't a performance sensitive app, but no reason not to take advantage of hashes for lookups. — daniel, Mar 28 '09 at 02:53
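
The points raised in these comments (relative paths without split, a set for the ignore list) could be sketched in Python 3 roughly as follows. This is only a sketch, not the original script: the names count_sloc and report and their parameters are assumptions made for illustration.

```python
import os

def count_sloc(root, ignore=frozenset(["__init__.py"])):
    """Return a list of (line_count, relative_path) for .py files under root."""
    results = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py") and name not in ignore:
                full = os.path.join(dirpath, name)
                # errors="replace" keeps the count going on odd encodings
                with open(full, "r", errors="replace") as handle:
                    count = sum(1 for _ in handle)
                # relpath avoids the split(cur_path) pitfall noted above
                results.append((count, os.path.relpath(full, root)))
    return results

def report(root):
    """Print one line per file, then the total."""
    rows = count_sloc(root)
    for count, path in rows:
        print("%05d lines in %s" % (count, path))
    print("\nTotal: %s lines (%s)" % (sum(c for c, _ in rows), root))
```

Calling report(os.getcwd()) would print the same kind of listing as the snippet in the question.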

score 37 · Answer 1 · answered Mar 29 '09 at 05:57

I like using any and a generator:

if any(pred(x.item) for x in sequence):
    ...

instead of code written like this:

found = False
for x in sequence:
    if pred(x.item):
        found = True
if found:
    ...

I first learned of this technique from a Peter Norvig article.

Jacob Gabrielson

+1 For the reference to Norvig's sudoku article. It's very nice. – Stephan202 Jul 12 '09 at 13:39
There's also all() to check that all items are True. – FogleBird Jul 12 '09 at 13:51
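
A small Python 3 sketch of any() and all() side by side may help. The predicate pred and the sample data are made up for the demonstration.

```python
def pred(n):
    """Hypothetical predicate: true for even numbers."""
    return n % 2 == 0

numbers = [1, 3, 4, 7]

# any() stops at the first element for which pred is true
has_even = any(pred(n) for n in numbers)

# all() stops at the first element for which pred is false
all_even = all(pred(n) for n in numbers)

print(has_even, all_even)  # True False
```

Because a generator expression is used, no intermediate list is built and evaluation stops as soon as the answer is known.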

score 23 · Accepted Answer · answered Mar 28 '09 at 08:35

Initializing a 2D list

While this can be done safely to initialize a list:

lst = [0] * 3

The same trick won’t work for a 2D list (list of lists):

>>> lst_2d = [[0] * 3] * 3
>>> lst_2d
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]
>>> lst_2d[0][0] = 5
>>> lst_2d
[[5, 0, 0], [5, 0, 0], [5, 0, 0]]

The * operator repeats references to its operand instead of copying it, so all three rows above are the same list object. The correct way to do this is:

>>> lst_2d = [[0] * 3 for i in xrange(3)]
>>> lst_2d
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]
>>> lst_2d[0][0] = 5
>>> lst_2d
[[5, 0, 0], [0, 0, 0], [0, 0, 0]]

Faster way: http://stackoverflow.com/questions/2332919 — John La Rooy, Feb 25 '10 at 09:56
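
The aliasing can be checked directly with the is operator. A minimal Python 3 sketch (range replaces xrange there); the variable names are made up for the illustration:

```python
shared = [[0] * 3] * 3                     # three references to one row
independent = [[0] * 3 for _ in range(3)]  # three distinct rows

print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False

independent[0][0] = 5
print(independent)  # [[5, 0, 0], [0, 0, 0], [0, 0, 0]]
```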

score 22 · Answer 3 · answered Mar 28 '09 at 02:16

The only 'trick' I know that really wowed me when I learned it is enumerate. It allows you to have access to the indexes of the elements within a for loop.

>>> l = ['a','b','c','d','e','f']
>>> for (index,value) in enumerate(l):
...     print index, value
... 
0 a
1 b
2 c
3 d
4 e
5 f

theycallmemorty

No need to put index, value in parentheses. Also, the above comment is naive/ignorant. – FogleBird Jul 12 '09 at 13:50
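
A short Python 3 sketch of the same loop without the parentheses, plus the optional start argument of enumerate. The list contents are made up.

```python
letters = ['a', 'b', 'c']

# No parentheses are needed around the loop targets.
for index, value in enumerate(letters):
    print(index, value)

# The optional start argument changes the first index.
numbered = list(enumerate(letters, start=1))
print(numbered)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```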

score 18 · Answer 4 · answered Mar 28 '09 at 18:11

zip(*iterable) transposes an iterable.

>>> a=[[1,2,3],[4,5,6]]
>>> zip(*a)
[(1, 4), (2, 5), (3, 6)]

It's also useful with dicts.

>>> d={"a":1,"b":2,"c":3}
>>> zip(*d.iteritems())
[('a', 'c', 'b'), (1, 3, 2)]

AKX

I loved this when I first found it, I think of it as "unzipping", lol. But I didn't know about dictionaries though. Thanks. – jacktrader Apr 25 '19 at 16:05
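
For readers on Python 3, a hedged sketch of the same idea: zip returns an iterator there, and dict.items() takes the place of iteritems(). The variable names are made up.

```python
rows = [[1, 2, 3], [4, 5, 6]]

# In Python 3, zip returns an iterator, so wrap it in list() to see the result.
columns = list(zip(*rows))
print(columns)  # [(1, 4), (2, 5), (3, 6)]

# Applying the same call again restores the original grouping (as tuples).
restored = list(zip(*columns))
print(restored)  # [(1, 2, 3), (4, 5, 6)]

# With a dict, items() replaces Python 2's iteritems().
# Dicts keep insertion order in Python 3.7+, so the key order is predictable.
d = {"a": 1, "b": 2, "c": 3}
keys, values = zip(*d.items())
print(keys, values)  # ('a', 'b', 'c') (1, 2, 3)
```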

score 16 · Answer 5 · answered Sep 16 '11 at 15:40

Fire up a simple web server for files in the current directory:

python -m SimpleHTTPServer

Useful for sharing files.

Adam Lehenbauer

`python -m SimpleHTTPServer 8008` to serve on port 8008 – Alois Mahdal Dec 02 '16 at 03:31

Theodor · Answer 6 · 2011-06-28T12:45:59.037

14

A "progress bar" that looks like:

|#############################---------------------|
59 percent done

Code:

class ProgressBar():
    def __init__(self, width=50):
        self.pointer = 0
        self.width = width

    def __call__(self, x):
        # x in percent
        self.pointer = int(self.width * (x / 100.0))
        return "|" + "#" * self.pointer + "-" * (self.width - self.pointer) + \
               "|\n %d percent done" % int(x)

Test function (on Windows, change "clear" to "cls"):

if __name__ == '__main__':
    import time, os
    pb = ProgressBar()
    for i in range(101):
        os.system('clear')
        print pb(i)
        time.sleep(0.1)

edited Jun 28 '11 at 12:45

answered Dec 13 '10 at 16:27

Theodor

how do i use this code? – newGIS Jun 09 '16 at 12:58
now we have `tqdm`, `from tqdm import tqdm` & `for i in tqdm([1,2,3]): print i` – kamalbanga Aug 23 '18 at 12:59
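
On the question of how to use it: one possible Python 3 sketch is below. It is not the answer's class but a variation under stated assumptions: the names render_bar and demo are hypothetical, the bar is returned on a single line, and it is redrawn in place with a carriage return instead of clearing the screen.

```python
import sys
import time

def render_bar(percent, width=50):
    """Return a one-line text bar for a value between 0 and 100."""
    filled = int(width * (percent / 100.0))
    return "|" + "#" * filled + "-" * (width - filled) + "| %d percent done" % percent

def demo(steps=20, delay=0.0):
    """Redraw the bar in place; delay is the pause between updates in seconds."""
    for i in range(steps + 1):
        # "\r" moves the cursor back to the start of the line
        sys.stdout.write("\r" + render_bar(100.0 * i / steps))
        sys.stdout.flush()
        time.sleep(delay)
    sys.stdout.write("\n")
```

Inside a real loop you would call render_bar with the fraction of work finished so far; demo(delay=0.1) just animates it.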

score 11 · Answer 7 · answered Mar 29 '09 at 01:47

To flatten a list of lists, such as

[['a', 'b'], ['c'], ['d', 'e', 'f']]

into

['a', 'b', 'c', 'd', 'e', 'f']

use

[inner
    for outer in the_list
        for inner in outer]

George V. Reilly

Or `sum(the_list, [])`. Although I suspect this is going to go very wrong somewhere (aside from generators, of course). – HoverHell Mar 10 '12 at 12:48
@HoverHell, some people may argue with it being perhaps "proper" but I've used this method for a little while and love it. Best! – jacktrader Apr 25 '19 at 16:07
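
To compare the options mentioned here, a short Python 3 sketch that also shows itertools.chain.from_iterable from the standard library. The variable names are made up.

```python
from itertools import chain

the_list = [['a', 'b'], ['c'], ['d', 'e', 'f']]

# Nested comprehension, as in the answer.
flat_comprehension = [inner for outer in the_list for inner in outer]

# chain.from_iterable yields the same items lazily and accepts any iterables.
flat_chain = list(chain.from_iterable(the_list))

# sum(the_list, []) also works but builds a new list at every step,
# so its cost grows quadratically with the number of sublists.
flat_sum = sum(the_list, [])

print(flat_chain)  # ['a', 'b', 'c', 'd', 'e', 'f']
```

All three only flatten one level of nesting.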

score 10 · Answer 8 · answered Mar 28 '09 at 21:36

A large speedup over copy.deepcopy for nested lists and dictionaries:

import cPickle

deepcopy = lambda x: cPickle.loads(cPickle.dumps(x))

vartec

I've always been leery of this technique, although it seems like it should work about as fast as anything else I could think of. do pythonistas consider this a good way to get deep copies? (fwiw, i use this technique anyway) – SingleNegationElimination Jul 12 '09 at 15:37
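
For what it's worth, here is how the trick might look in Python 3, where cPickle has been merged into pickle. This is a sketch; the function name pickle_deepcopy is made up, and whether it beats copy.deepcopy depends on the data, so it is worth timing on your own objects.

```python
import copy
import pickle

def pickle_deepcopy(obj):
    """Deep-copy obj by serializing and deserializing it."""
    return pickle.loads(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))

original = {"rows": [[1, 2], [3, 4]], "name": "grid"}
clone = pickle_deepcopy(original)

clone["rows"][0][0] = 99
print(original["rows"][0][0])  # 1, the nested list was copied

# Objects that cannot be pickled (a lambda, for example) still need copy.deepcopy.
fallback = copy.deepcopy({"f": lambda x: x})
```

The main limitation is that everything inside the object must be picklable, which rules out things like open files, sockets and lambdas.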

score 8 · Answer 9 · answered Mar 28 '09 at 08:36

Suppose you have a list of items, and you want a dictionary with these items as the keys. Use fromkeys:

>>> items = ['a', 'b', 'c', 'd']
>>> idict = dict().fromkeys(items, 0)
>>> idict
{'a': 0, 'c': 0, 'b': 0, 'd': 0}
>>>

The second argument of fromkeys is the value assigned to every newly created key.

Eli Bendersky

fromkeys is a static method. You should do "dict.fromkeys(items, 0)". Your code creates and throws away an empty dictionary. – Andrew Dalke Mar 28 '09 at 19:39
@Andrew Dalke , I believe `dict.fromkeys` is a class-method. reason: `dict.fromkeys` returns a dictionary back, hence it _should_ get `class` as its first argument. Think about when you've subclassed `dict` -- `MyDict.fromkeys` should give an instance of `MyDict` – Jeffrey Jose Mar 14 '10 at 18:41
@jeffjose: You are correct. I did the test you suggested and looked at the code since I was curious how that was done. – Andrew Dalke Mar 17 '10 at 14:53
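
The class-method behaviour discussed in these comments can be checked with a few lines of Python 3. MyDict is a hypothetical subclass used only for the demonstration; the last part shows a caveat that is easy to miss with mutable values.

```python
class MyDict(dict):
    """Hypothetical dict subclass used only for this demonstration."""

items = ['a', 'b', 'c', 'd']

# Called on the class, no throwaway instance is created.
counts = dict.fromkeys(items, 0)

# Because fromkeys is a classmethod, a subclass gets an instance of itself.
sub = MyDict.fromkeys(items, 0)
print(type(sub).__name__)  # MyDict

# Caveat: the value object is shared by every key, which matters for mutables.
shared = dict.fromkeys(items, [])
shared['a'].append(1)
print(shared['b'])  # [1]
```

If each key needs its own list, a dict comprehension such as {k: [] for k in items} avoids the sharing.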

score 7 · Answer 10 · answered Mar 28 '09 at 08:35

To find out whether a line is empty (i.e. it has length 0 or contains only whitespace), use the string method strip in a condition, as follows:

if not line.strip():    # if line is empty
    continue            # skip it

Eli Bendersky

score 5 · Answer 11 · answered Mar 28 '09 at 01:13

I like this one to zip everything up in a directory. Hotkey it for instabackups!

import os
import zipfile

z = zipfile.ZipFile('my-archive.zip', 'w', zipfile.ZIP_DEFLATED)
startdir = "/home/johnf"
# walk the tree and add every file to the archive
for dirpath, dirnames, filenames in os.walk(startdir):
  for filename in filenames:
    z.write(os.path.join(dirpath, filename))
z.close()

John Feminella

What's wrong with zip -r my-archive.zip directory/ ? – rmmh Mar 28 '09 at 01:14
That's not a Python snippet. :) (Also, you can include special logic in the snippet that might be complicated to do with shell commands.) – John Feminella Mar 28 '09 at 01:18
+1 (although I prefer `tarfile`) – David X Aug 04 '10 at 18:17
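
As an example of the "special logic" a Python version allows, here is a Python 3 sketch that stores paths relative to the start directory and closes the archive with a context manager. The function name zip_directory and its parameters are assumptions, not part of the answer.

```python
import os
import zipfile

def zip_directory(startdir, archive_path):
    """Write every file under startdir into a new zip archive."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for dirpath, _, filenames in os.walk(startdir):
            for filename in filenames:
                full = os.path.join(dirpath, filename)
                # arcname stores paths relative to startdir, not absolute ones
                archive.write(full, arcname=os.path.relpath(full, startdir))
    return archive_path
```

Write the archive somewhere outside startdir, otherwise the walk may pick up the half-written archive itself.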

score 5 · Answer 12 · answered Mar 28 '09 at 21:32

For list comprehensions that need current, next:

[fun(curr, nxt)
 for curr, nxt
 in zip(lst, lst[1:] + [None])
 if condition(curr, nxt)]

For a circular list: zip(lst, lst[1:] + lst[:1]).

For previous, current: zip([None] + lst[:-1], lst); circular: zip(lst[-1:] + lst[:-1], lst)

vartec

A (slight adjustment of) the `pairwise` recipe does the same, and works for all iterables: http://docs.python.org/3.0/library/itertools.html#recipes – Stephan202 Jul 12 '09 at 13:51
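
For reference, the pairwise recipe from the itertools documentation looks roughly like this in Python 3. The second function, pairwise_padded, is a hypothetical adjustment (not from the docs) that also pairs the last item with a fill value, similar to the None padding in the answer.

```python
from itertools import chain, tee

def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)
    next(b, None)  # advance the second copy by one item
    return zip(a, b)

def pairwise_padded(iterable, fill=None):
    """Like pairwise, but also pairs the last item with fill."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, chain(b, [fill]))

print(list(pairwise([1, 2, 3])))         # [(1, 2), (2, 3)]
print(list(pairwise_padded([1, 2, 3])))  # [(1, 2), (2, 3), (3, None)]
```

Since both are built on tee, they work on generators and files as well as on lists.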

score 4 · Answer 13 · answered Mar 28 '09 at 01:10

Hardlink identical files in the current directory (on Unix, this means they share physical storage, which uses much less space):

import os
import hashlib

dupes = {}

for path, dirs, files in os.walk(os.getcwd()):
    for file in files:
        filename = os.path.join(path, file)
        # read in binary mode so the hash covers the exact bytes on disk
        hash = hashlib.sha1(open(filename, 'rb').read()).hexdigest()
        if hash in dupes:
            print 'linking "%s" -> "%s"' % (dupes[hash], filename)
            # keep a backup until the hard link has been created
            os.rename(filename, filename + '.bak')
            try:
                os.link(dupes[hash], filename)
                os.unlink(filename + '.bak')
            except OSError:
                # linking failed, so restore the original file
                os.rename(filename + '.bak', filename)
        else:
            dupes[hash] = filename

+1, though this code can be improved upon, by using a unique temporary filename, instead of blindly assuming `filename.bak` doesn't exists. — Stephan202, Jul 12 '09 at 13:47

score 3 · Answer 14 · answered Apr 18 '11 at 19:15

Here are a few which I think are worth knowing but might not be useful on an everyday basis. Most of them are one-liners.

Removing Duplicates from a List

L = list(set(L))

Getting Integers from a string (space separated)

ints = [int(x) for x in S.split()]

Finding Factorial

import operator
fac = lambda n: reduce(operator.mul, range(1, n + 1), 1)

Finding greatest common divisor

>>> def gcd(a,b):
...     while(b):a,b=b,a%b
...     return a

jack_carver

How can you be sure that set(L) doesn't mess with the order of the original list? sets are orderless – GabiMe May 03 '11 at 12:54
Yes, it probably does, but how WOULD you remove duplicates from a list without messing with the order? That's not a very well-defined question. – weronika Sep 07 '11 at 06:18
To maintain sequence, you might need to do it in a program (sorry about the formatting, but this is a comment): new_list=[]; for x in old_list: if x not in new_list: new_list.append(x) – RufusVS Sep 02 '18 at 03:02
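
The loop in the last comment, laid out properly and with a set for the membership test, might look like this in Python 3. The function name unique_in_order is made up, and the sketch assumes the items are hashable.

```python
def unique_in_order(items):
    """Return items without duplicates, keeping the first occurrence of each."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # set lookup; requires hashable items
            seen.add(item)
            result.append(item)
    return result

print(unique_in_order([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

This answers the ordering question by keeping each value at the position where it first appears.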

GabiMe · Answer 15 · 2011-04-27T14:18:01.873

2

Emulating a switch statement. For example switch(x) {..}:

def a():
  print "a"

def b():
  print "b"

def default():
  print "default"

{1: a, 2: b}.get(x, default)()
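
In Python 3 the apply builtin no longer exists, so the looked-up function is called directly. A minimal sketch, with the handlers returning strings instead of printing so the result is easy to check; the wrapper name switch is made up.

```python
def a():
    return "a"

def b():
    return "b"

def default():
    return "default"

def switch(x):
    """Look up the handler for x and call it; fall back to default."""
    handlers = {1: a, 2: b}
    return handlers.get(x, default)()

print(switch(1), switch(2), switch(42))  # a b default
```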

edited Apr 27 '11 at 14:18

answered Apr 17 '11 at 14:38

GabiMe

score 2 · Answer 16 · answered Apr 17 '11 at 16:41

Like another person above, I said 'Wooww !!' when I discovered enumerate().
I sang the praises of Python when I discovered repr(), which let me see the precise content of strings that I wanted to analyse with a regex.
I was very satisfied to discover that print '\n'.join(list_of_strings) displays much more rapidly than for ch in list_of_strings: print ch.
splitlines(1) with an argument keeps the newlines.

These four "tricks" combine into one snippet that is very useful for rapidly displaying the source code of a web page, line after line, with each line numbered, with special characters such as '\t' and newlines shown uninterpreted, and with the newlines present:

import urllib
from time import clock,sleep

sock = urllib.urlopen('http://docs.python.org/')
ch = sock.read()
sock.close()


te = clock()
for i,line in enumerate(ch.splitlines(1)):
    print str(i) + ' ' + repr(line)
t1 = clock() - te


print "\n\nIn 3 seconds, I will print the same content, using '\\n'.join(....)\n" 

sleep(3)

te = clock()
# here's the point of interest:
print '\n'.join(str(i) + ' ' + repr(line)
                for i,line in enumerate(ch.splitlines(1)) )
t2 = clock() - te

print '\n'
print 'first  display took',t1,'seconds'
print 'second display took',t2,'seconds'

With my not very fast computer, I got:

first  display took 4.94626048841 seconds
second display took 0.109297410704 seconds

score 2 · Answer 17 · answered Aug 18 '11 at 00:28

import tempfile
import cPickle

class DiskFifo:
    """A disk based FIFO which can be iterated, appended and extended in an interleaved way"""
    def __init__(self):
        self.fd = tempfile.TemporaryFile()
        self.wpos = 0
        self.rpos = 0
        self.pickler = cPickle.Pickler(self.fd)
        self.unpickler = cPickle.Unpickler(self.fd)
        self.size = 0

    def __len__(self):
        return self.size

    def extend(self, sequence):
        map(self.append, sequence)

    def append(self, x):
        self.fd.seek(self.wpos)
        self.pickler.clear_memo()
        self.pickler.dump(x)
        self.wpos = self.fd.tell()
        self.size = self.size + 1

    def next(self):
        try:
            self.fd.seek(self.rpos)
            x = self.unpickler.load()
            self.rpos = self.fd.tell()
            return x

        except EOFError:
            raise StopIteration

    def __iter__(self):
        self.rpos = 0
        return self

score 1 · Answer 18 · answered Jul 12 '09 at 13:23

A custom list that, when multiplied by another list, returns a cartesian product. The good thing is that the cartesian product is indexable, unlike that of itertools.product (but the multiplicands must be sequences, not iterators).

import operator

class mylist(list):
    def __getitem__(self, args):
        if type(args) is tuple:
            return [list.__getitem__(self, i) for i in args]
        else:
            return list.__getitem__(self, args)
    def __mul__(self, args):
        seqattrs = ("__getitem__", "__iter__", "__len__")
        if all(hasattr(args, i) for i in seqattrs):
            return cartesian_product(self, args)
        else:
            return list.__mul__(self, args)
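    def __imul__(self, args):
        return self.__mul__(args)
    def __rmul__(self, args):
        # mirror __mul__ with the operands swapped
        seqattrs = ("__getitem__", "__iter__", "__len__")
        if all(hasattr(args, i) for i in seqattrs):
            return cartesian_product(args, self)
        else:
            return list.__rmul__(self, args)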
    def __pow__(self, n):
        return cartesian_product(*((self,)*n))
    def __rpow__(self, n):
        return cartesian_product(*((self,)*n))

class cartesian_product:
    def __init__(self, *args):
        self.elements = args
    def __len__(self):
        return reduce(operator.mul, map(len, self.elements))
    def __getitem__(self, n):
        return [e[i] for e, i  in zip(self.elements,self.get_indices(n))]
    def get_indices(self, n):
        sizes = map(len, self.elements)
        tmp = [0]*len(sizes)
        i = -1
        for w in reversed(sizes):
            tmp[i] = n % w
            n //= w
            i -= 1
        return tmp
    def __add__(self, arg):
        return mylist(map(None, self)+mylist(map(None, arg)))
    def __imul__(self, args):
        return mylist(self)*mylist(args)
    def __rmul__(self, args):
        return mylist(args)*mylist(self)
    def __mul__(self, args):
        if isinstance(args, cartesian_product):
            return cartesian_product(*(self.elements+args.elements))
        else:
            return cartesian_product(*(self.elements+(args,)))
    def __iter__(self):
        for i in xrange(len(self)):
            yield self[i]
    def __str__(self):
        return "[" + ",".join(str(i) for i in self) +"]"
    def __repr__(self):
        return "*".join(map(repr, self.elements))

I don't understand a line of it, can you comment on how it works? — bodacydo, Mar 23 '10 at 14:45

score 1 · Answer 19 · answered Aug 07 '12 at 08:30

Iterate over any iterable (list, set, file, stream, strings, whatever), of ANY size (including unknown size), by chunks of x elements:

from itertools import chain, islice

def chunks(iterable, size, format=iter):
    it = iter(iterable)
    while True:
        yield format(chain((it.next(),), islice(it, size - 1)))

>>> l = ["a", "b", "c", "d", "e", "f", "g"]
>>> for chunk in chunks(l, 3, tuple):
...     print chunk
...
('a', 'b', 'c')
('d', 'e', 'f')
('g',)
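
A note for Python 3 users: there it.next() becomes next(it), and since Python 3.7 (PEP 479) a StopIteration escaping inside a generator is turned into a RuntimeError, so the loop above needs an explicit exit. One possible sketch follows; it assumes format returns a container that is empty when nothing is left (tuple or list), so unlike the original it does not support format=iter.

```python
from itertools import islice

def chunks(iterable, size, format=tuple):
    """Yield successive groups of up to size items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = format(islice(it, size))
        if not chunk:  # an empty chunk means the iterator is exhausted
            return
        yield chunk

for chunk in chunks("abcdefg", 3):
    print(chunk)
```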

score 1 · Answer 20 · answered Mar 29 '09 at 10:46

For Python 2.4 or earlier:

for x,y in someIterator:
  listDict.setdefault(x,[]).append(y)

In Python 2.5+ there is an alternative using defaultdict.
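
A sketch of both variants side by side, written for Python 3. The sample pairs and variable names are made up for the illustration.

```python
from collections import defaultdict

pairs = [("fruit", "apple"), ("veg", "leek"), ("fruit", "pear")]

# setdefault version, as in the answer
grouped = {}
for key, value in pairs:
    grouped.setdefault(key, []).append(value)

# defaultdict version: missing keys start as an empty list
grouped_dd = defaultdict(list)
for key, value in pairs:
    grouped_dd[key].append(value)

print(dict(grouped_dd))  # {'fruit': ['apple', 'pear'], 'veg': ['leek']}
```

One difference to keep in mind: merely reading a missing key from a defaultdict creates it, while a plain dict raises KeyError.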

vartec

Josh Russo · Answer 21 · 2012-02-25T21:20:15.190

I actually just created this, but I think it's going to be a very useful debugging tool.

def dirValues(instance, all=False):
    retVal = {}
    for prop in dir(instance):
        if not all and prop.startswith("_"):
            continue
        retVal[prop] = getattr(instance, prop)
    return retVal

I usually use dir() in a pdb context, but I think this will be much more useful:

(pdb) from pprint import pprint as pp
(pdb) from myUtils import dirValues
(pdb) pp(dirValues(someInstance))

score 0 · Answer 22 · answered Oct 07 '12 at 15:13

When debugging, you sometimes want to see a string with a basic editor. For showing a string with notepad:

import os, tempfile, subprocess

def get_rand_filename(dir_=None):
    "Function returns the name of a new, empty temporary file."
    fd, path = tempfile.mkstemp('.tmp', '', dir_ or os.getcwd())
    os.close(fd)  # close the descriptor that mkstemp opened
    return path

def open_with_notepad(s):
    "Function gets a string and shows it in notepad"
    with open(get_rand_filename(), 'w') as f:
        f.write(s)
    # launch notepad only after the file has been written and closed
    subprocess.Popen(['notepad', f.name])

http://www.teachyourselfpython.com has a very large and growing library of code snippets — Compoot, Jul 22 '17 at 12:32
