
I have a list looking like this:

[2, 1, 3, 1, 2, 3, 1, 2, 2, 2]

What I want is a transition matrix which shows me the sequence like:

  • How often is a 1 followed by a 1
  • How often is a 1 followed by a 2
  • How often is a 1 followed by a 3

  • How often is a 2 followed by a 1
  • How often is a 2 followed by a 2
  • How often is a 2 followed by a 3

and so on...

((0,2,1), (1,2,1), (2,0,0))

Is there a premade module to get this?

Flo
  • What is the second tuple? – user2963623 Aug 12 '14 at 16:31
  • How often is a 2 followed by a 1, how often is a 2 followed by a 2, how often is a 2 followed by a 3 – Flo Aug 12 '14 at 16:32
  • But a 2 is followed by a 1 one time in your list. Why is your tuple then (0, 2, 1)? Shouldn't it be (1,2,1)? (assuming the tuple is structured like `(number of occurrences, first number, number following the first number)`) – Kevin Aug 12 '14 at 16:33
  • @Kevin I think ahrf meant 2 followed by 1 from left to right. the sequence "1,2" repeats twice – Ashoka Lella Aug 12 '14 at 16:35
  • Is this a transition matrix in the same sense as a markov chain? http://en.wikipedia.org/wiki/Stochastic_matrix If so, perhaps you should make that clear. – Andrew Jaffe Aug 12 '14 at 16:39
  • @AndrewJaffe I don't think it is. A stochastic matrix has probabilities, and rows/columns/both sums to one – Korem Aug 12 '14 at 16:54

1 Answer


I don't know if there's a module, but I'd go with this code, which is easily generalizable:

import numpy as np
from collections import Counter
a = [2, 1, 3, 1, 2, 3, 1, 2, 2, 2]
b = np.zeros((3,3))
for (x,y), c in Counter(zip(a, a[1:])).iteritems():
    b[x-1,y-1] = c
print b
array([[ 0.,  2.,  1.],
       [ 1.,  2.,  1.],
       [ 2.,  0.,  0.]])

With no numpy installed:

b = [[0 for _ in xrange(3)] for _ in xrange(3)]
for (x,y), c in Counter(zip(a, a[1:])).iteritems():
    b[x-1][y-1] = c

print b
[[0, 2, 1], [1, 2, 1], [2, 0, 0]]
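
The snippets above are written for Python 2. On Python 3 they need three small changes: `.items()` instead of `.iteritems()`, `range` instead of `xrange`, and `print(...)` called as a function. A minimal sketch of the no-numpy version with those changes, assuming the values are the integers 1 to n (the name `n` is introduced here for illustration and is not in the original code):

```python
from collections import Counter

a = [2, 1, 3, 1, 2, 3, 1, 2, 2, 2]
n = 3  # assumed number of distinct states, labelled 1..n

# n x n grid of zeros; b[i][j] counts how often i+1 is followed by j+1
b = [[0 for _ in range(n)] for _ in range(n)]
for (x, y), c in Counter(zip(a, a[1:])).items():
    b[x - 1][y - 1] = c

print(b)  # [[0, 2, 1], [1, 2, 1], [2, 0, 0]]
```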

A few details of what's going on, if needed:

  1. zip(a, a[1:]) gets all the pairs of consecutive numbers.
  2. Counter counts how many times each pair appears.
  3. The for loop simply converts the dictionary Counter produces into the matrix / list of lists you requested.
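
To make the three steps concrete, here is a sketch in Python 3 that prints the intermediate values and wraps the idea in a function. The name `transition_counts` and the choice to order rows and columns by the sorted distinct values are assumptions made for this example, not part of the code above; they let the same approach work when the values are not exactly 1 to n.

```python
from collections import Counter

def transition_counts(seq):
    """Return (states, matrix); matrix[i][j] counts states[i] followed by states[j]."""
    states = sorted(set(seq))                      # row/column order (assumed: sorted)
    index = {s: i for i, s in enumerate(states)}   # value -> row/column position
    pairs = Counter(zip(seq, seq[1:]))             # steps 1 and 2: pair up, then count
    matrix = [[0] * len(states) for _ in states]
    for (x, y), c in pairs.items():                # step 3: fill the matrix
        matrix[index[x]][index[y]] = c
    return states, matrix

a = [2, 1, 3, 1, 2, 3, 1, 2, 2, 2]
print(list(zip(a, a[1:]))[:3])          # [(2, 1), (1, 3), (3, 1)]
print(Counter(zip(a, a[1:]))[(1, 2)])   # 2
states, matrix = transition_counts(a)
print(states)   # [1, 2, 3]
print(matrix)   # [[0, 2, 1], [1, 2, 1], [2, 0, 0]]
```

Pairs that never occur simply stay at zero, because the matrix is pre-filled before the loop runs.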
Korem
  • What if we don't have `numpy` installed? What alternative is there in out-of-the-box Python? – Kevin Aug 12 '14 at 16:43
  • @Kevin: `[[0 for _ in range(n)] for _ in range(n)]` – georg Aug 12 '14 at 16:44
  • @Kevin updated with a no numpy version. But I really do recommend numpy. If there's a module-solution like you originally requested - it probably involves it. – Korem Aug 12 '14 at 16:46
  • I'm using Python 3.x. Might that be the reason I get the error `'Counter' object has no attribute 'iteritems'`? – Flo Aug 12 '14 at 16:53
  • @ahrf: yep. You can use `.items()` instead. (And `range` instead of `xrange`, and fix the `print`s, of course.) – DSM Aug 12 '14 at 16:54
  • Thanks a lot to all of you, works pretty well! Especially for the detailed explanation of the code!! – Flo Aug 12 '14 at 16:58