31

I am using the LSTM tutorial for Theano (http://deeplearning.net/tutorial/lstm.html). In the lstm.py (http://deeplearning.net/tutorial/code/lstm.py) file, I don't understand the following line:

c = m_[:, None] * c + (1. - m_)[:, None] * c_

What does m_[:, None] mean? In this case, m_ is a Theano vector and c is a matrix.

ForceBru
nisace
  • I haven't worked with Theano, but it seems it has tight integration with NumPy, which introduces the syntax you are dealing with: http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html – Ondrej Slinták Jul 18 '15 at 15:43
  • Ondrej Slinták is correct. Theano tensor variables follow NumPy's indexing syntax (see https://github.com/Theano/Theano). In NumPy indexing, None is the same as the newaxis object, which adds an axis (dimension) of length 1 to an array, so m_[:, None] wraps each element of m_ in its own array, e.g. given import numpy as np; a = np.array([[1,2],[3,4]]), a[:,None] is np.array([[[1, 2]], [[3, 4]]]) –  Jul 18 '15 at 17:25
  • Make sure to do a basic numpy tutorial followed by the Theano tutorial. That will answer a lot of questions. – eickenberg Jul 19 '15 at 07:21

2 Answers

24

This question has been asked and answered on the Theano mailing list, but is actually about the basics of numpy indexing.

Here are the question and answer: https://groups.google.com/forum/#!topic/theano-users/jq92vNtkYUI

For completeness, here is another explanation: indexing with None adds an axis of length 1 to your array. This behaves the same in both numpy and Theano, so see the relevant numpy documentation:

http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#numpy.newaxis

Note that np.newaxis is None:

import numpy as np
a = np.arange(30).reshape(5, 6)

print(a.shape)  # yields (5, 6)
print(a[np.newaxis, :, :].shape)  # yields (1, 5, 6)
print(a[:, np.newaxis, :].shape)  # yields (5, 1, 6)
print(a[:, :, np.newaxis].shape)  # yields (5, 6, 1)

Typically this is used to adjust shapes so that arrays can be broadcast to higher dimensions. For example, tiling a 7 times along a new middle axis can be achieved as:

b = a[:, np.newaxis] * np.ones((1, 7, 1))

print(b.shape)  # yields (5, 7, 6), 7 copies of a along the second axis
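Applied to the line from the question, the same mechanism can be sketched in plain numpy. This is only an illustration: the array values below are made up, and it assumes m_ is a 0/1 mask with one entry per row of c and c_, which is how the expression reads but is not stated in the question.

```python
import numpy as np

# Hypothetical stand-ins for the variables in the question (values are made up).
m_ = np.array([1.0, 0.0, 1.0])   # assumed 0/1 mask, shape (3,)
c = np.full((3, 2), 10.0)        # candidate new values, shape (3, 2)
c_ = np.zeros((3, 2))            # values to fall back on, shape (3, 2)

# m_[:, None] has shape (3, 1), so it broadcasts across the columns of c.
# Without the extra axis, shapes (3,) and (3, 2) could not be broadcast together.
c_new = m_[:, None] * c + (1.0 - m_)[:, None] * c_

print(m_[:, None].shape)
print(c_new)
```

Rows where the mask is 1 take their values from c, and rows where it is 0 keep the values from c_.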
eickenberg
4

I think the Theano vector's __getitem__ method accepts a tuple as its argument, like this:

class Vect(object):
    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, key):
        # key is the tuple (start, stop); stop is treated as inclusive
        return self.data[key[0]:key[1] + 1]

a = Vect('hello')
print(a[0, 2])  # key is the tuple (0, 2)

When a is an ordinary list, a[0,2] raises an exception:

>>> a=list('hello')
>>> a[0,2]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: list indices must be integers, not tuple

Vect, however, defines its own __getitem__ method, which accepts a tuple as its argument.

You can also pass : to __getitem__, because : denotes a slice and arrives as a slice object:

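class Vect(object):
    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, key):
        # key[0] is slice(None, None, None); key[1] is the integer 2
        return self.data[0:key[1] + 1] + list(key[0].indices(key[1]))

a = Vect('hello')
print(a[:, 2])  # key is the tuple (slice(None, None, None), 2)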

Speaking about None, it can be used when indexing in plain Python as well:

>>> 'hello'[None:None]
'hello'
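To see exactly which key an expression like m_[:, None] hands to __getitem__, a small probe class can simply return the key it receives. KeyProbe is a hypothetical helper written for this illustration, not part of Theano or numpy.

```python
class KeyProbe(object):
    """Hypothetical helper: returns whatever key Python passes to __getitem__."""

    def __getitem__(self, key):
        return key


probe = KeyProbe()
key = probe[:, None]

# The comma makes the key a tuple; ':' becomes a slice object and None stays None.
print(key)  # (slice(None, None, None), None)
```

What to do with a None inside that tuple is up to the receiving class; numpy and Theano interpret it as a request for a new axis.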
ForceBru
  • Indeed, calling `a[:, 3]` on an ordinary list gives `TypeError: list indices must be integers, not tuple`. However, I don't really see what the tuple here is. Could you elaborate a bit? – xrisk Jul 18 '15 at 15:42
  • the tuple is the comma :P Yeah, the actual thing in python that makes a tuple a tuple is a comma, not the parens. You could write it as a[(slice(None), 3)] if you wanted to be more explicit. – NightShadeQueen Jul 18 '15 at 15:46
  • @RishavKundu that's the reason why `("boo!")` is not a tuple, and `"merp",` is. – NightShadeQueen Jul 18 '15 at 16:20