1

I'm trying to build a histogram in Python. I am starting with the following snippet:

def histogram(L):
    d = {}
    for x in L:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d

I understand it uses a dictionary to solve the problem, but I'm confused by the 4th line: if x in d:

d is still being constructed and there's nothing in it yet, so how can x be in d?

wap26
LookIntoEast
  • If you look for a histogram then use the histogram functions from numpy/scipy or matplotlib. Libraries are great! – joaquin Aug 17 '11 at 19:38

7 Answers

5

Keep in mind that the if is inside a for loop.

So, when you're looking at the very first item in L there is nothing in d, but when you get to the next item in L, there is something in d, so you need to check whether to make a new bin on the histogram (d[x] = 1), or add the item to an existing bin (d[x] += 1).

In Python, we actually have some shortcuts for this:

from collections import defaultdict

def histogram(L):
    d = defaultdict(int)
    for x in L:
        d[x] += 1
    return d

This automatically starts each bin in d at zero (what int() returns) so you don't have to check if the bin exists. On Python 2.7 or higher:

from collections import Counter

d = Counter(L)

This automatically builds a mapping from each item in L to its frequency. No other code is required.
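As a quick check, here is a minimal sketch comparing the explicit if/else version, the defaultdict version, and Counter on a small list. The list `sample` and the names `histogram_plain` and `histogram_defaultdict` are made up for illustration; all three should produce the same counts.

```python
from collections import Counter, defaultdict

def histogram_plain(L):
    # Explicit check: create the bin on first sight, increment afterwards.
    d = {}
    for x in L:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d

def histogram_defaultdict(L):
    # Missing keys start at int() == 0, so no membership check is needed.
    d = defaultdict(int)
    for x in L:
        d[x] += 1
    return d

sample = ["a", "b", "a", "c", "a"]  # hypothetical input
plain = histogram_plain(sample)
with_default = histogram_defaultdict(sample)
with_counter = Counter(sample)

# defaultdict and Counter are dict subclasses, so they compare equal to a plain dict.
print(plain == with_default == with_counter)
```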

agf
2

You can create a histogram with a dict comprehension:

histogram = {key: L.count(key) for key in set(L)}
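A small usage sketch of this comprehension, assuming a made-up list `L`. Note that `L.count(key)` scans the whole list once for every distinct key, so this is concise but does more work than a single pass when the list is long and has many distinct values.

```python
L = [1, 2, 2, 3, 3, 3]  # hypothetical input

# set(L) yields each distinct value once; L.count(key) counts its occurrences.
histogram = {key: L.count(key) for key in set(L)}
print(histogram)
```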
dansalmo
2

The code inside of the for loop will be executed once for each element in L, with x being the value of the current element.

Let's look at the simple case where L is the list [3, 3]. The first time through the loop, d will be empty, x will be 3, and 3 in d will be false, so d[3] will be set to 1. The next time through the loop, x will be 3 again, and 3 in d will be true, so d[3] will be incremented by 1.
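The walkthrough above can be reproduced with a traced copy of the function. This is only a sketch: `histogram_traced` is a hypothetical name, and the print call is added purely to show the state of d on each pass.

```python
def histogram_traced(L):
    d = {}
    for x in L:
        found = x in d  # False the first time a value is seen
        print("x =", x, "| x in d:", found, "| d before:", d)
        if found:
            d[x] += 1
        else:
            d[x] = 1
    return d

result = histogram_traced([3, 3])
# x = 3 | x in d: False | d before: {}
# x = 3 | x in d: True | d before: {3: 1}
print(result)  # {3: 2}
```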

Andrew Clark
1

You can use a Counter, available from Python 2.7 and Python 3.1+.

>>> # init empty counter
>>> from collections import Counter
>>> c = Counter()

>>> # add a single sample to the histogram
>>> c.update([4])
>>> # add several samples at once
>>> c.update([4, 2, 2, 5])

>>> # print content (on Python 3.7+, equal counts are shown in insertion order)
>>> print(c)
Counter({4: 2, 2: 2, 5: 1})

The Counter class brings several nice features, like addition, subtraction, intersection and union on counters. A Counter can count anything that can be used as a dictionary key, that is, any hashable object.
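A short sketch of those four operations, using two made-up counters `a` and `b`:

```python
from collections import Counter

a = Counter([4, 4, 2, 5])  # 4 appears twice, 2 and 5 once each
b = Counter([4, 2, 2])     # 4 appears once, 2 twice

total = a + b    # add counts key by key
diff = a - b     # subtract counts, dropping results that are zero or negative
common = a & b   # intersection: minimum count for each key
either = a | b   # union: maximum count for each key

print(dict(total))   # {4: 3, 2: 3, 5: 1}
print(dict(diff))    # {4: 1, 5: 1}
print(dict(common))  # {4: 1, 2: 1}
print(dict(either))  # {4: 2, 2: 2, 5: 1}
```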

wap26
1

I think the other answers have explained why if x in d is there. But here is a hint on how this code could be written following "don't ask permission, ask forgiveness":

    ...
    try:
        d[x] += 1
    except KeyError:
        d[x] = 1

The reason for this is that you expect the KeyError to be raised only once for each distinct value of x, the first time it is seen. Thus, there is no need to check if x in d on every iteration.
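Put together as a complete function, the idea might look like the sketch below. The name `histogram_eafp` and the sample input are assumptions for illustration; the elided part of the snippet above is taken to be the same loop as in the question.

```python
def histogram_eafp(L):
    d = {}
    for x in L:
        try:
            d[x] += 1     # works whenever x already has a bin
        except KeyError:
            d[x] = 1      # first time x is seen: create its bin
    return d

counts = histogram_eafp(["a", "b", "a"])
print(counts)  # {'a': 2, 'b': 1}
```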

Zaur Nasibov
  • `except` is much slower than `if`, but `try` is faster -- so if `x not in d` is going to happen fairly frequently, it's better to ask permission. When the `KeyError` is truly the exceptional case, then you should use `try` / `except` and just ask forgiveness. – agf Aug 17 '11 at 19:26
0

You can create your own histogram in Python using, for example, matplotlib. For an example of how this could be implemented, you can refer to this answer.

In this specific case, you can do:

temperature = [4,   3,   1,   4,   6,   7,   8,   3,   1]
radius      = [0,   2,   3,   4,   0,   1,   2,  10,   7]
density     = [1,  10,   2,  24,   7,  10,  21, 102, 203]

points, sub = hist3d_bubble(temperature, density, radius, bins=4)
sub.axes.set_xlabel('temperature')
sub.axes.set_ylabel('density')
sub.axes.set_zlabel('radius')
Community
Saullo G. P. Castro
0

If x isn't in d yet, it gets put into d with d[x] = 1. Basically, each time x shows up in L again after that, the count stored under x is increased by one.

Try using this to step through the code: http://people.csail.mit.edu/pgbovine/python/

John