
I found this question on going from a list to tuples using an iterator, but I'm dealing with a large data set.

I also found this question on going from tuples to a list. In fact, there are a bunch of questions about going from a list of tuples to an array, but I haven't found how to go in the other direction.

I have:

np.array([[   0    1]
           [   1    1]
           [   2    1]
           ..., 
           [1004    3]
           [1005    1]
           [1006    1]])

I want:

np.array([(   0    1)
           (   1    1)
           (   2    1)
           ..., 
           (1004    3)
           (1005    1)
           (1006    1)])

I don't necessarily need a numpy array of tuples, but I want to do it efficiently.

mattyd2
  • *I don't necessarily need a numpy array of tuples, but I want to do it efficiently* - have you tried a regular list comprehension? – RomanPerekhrest Dec 02 '16 at 15:56
  • @RomanPerekhrest thanks for the quick reply. Isn't list comprehension the same as the answer in [this question](http://stackoverflow.com/a/23286299/2876684)? – mattyd2 Dec 02 '16 at 16:01
  • What can you do with a 1-d numpy array (or other sequence) of tuples that you can't already do directly with the 2-d numpy array? – jez Dec 02 '16 at 16:09
  • @jez I need to feed it back into a Natural Language Processing library called [Gensim](https://radimrehurek.com/gensim/)... – mattyd2 Dec 02 '16 at 16:13

2 Answers


To vectorize this transformation efficiently, take the transpose:

a = np.array([[   0 ,   1],
              [   1 ,   1],
              [   2 ,   1],
              [   0 ,   1],
              [   1 ,   1],
              [   2 ,   1] ])

at = a.T

# at is now:
# array([[0, 1, 2, 0, 1, 2],
#        [1, 1, 1, 1, 1, 1]])

Then zip the two 1D rows together to pair them up as tuples:

zip(at[0],at[1])

This should be significantly faster than a list comprehension over the rows.

In Python 3 (3.4 here), zip returns a lazy iterator, so wrap it in list() to get the actual list:

list(zip(at[0], at[1]))

Out []:
[(0, 1), (1, 1), (2, 1), (0, 1), (1, 1), (2, 1)]

EDIT: thanks to @Divakar

Slicing the columns directly may be slightly faster: zip(a[:, 0], a[:, 1])
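
Putting the pieces above together, here is a minimal runnable sketch of both the transpose and the direct-slicing variants. The variable names (pairs_from_transpose, pairs_from_columns, pairs_python_ints) are only illustrative, and the last variant, which calls .tolist() on each column first, is an addition of mine rather than part of the answer: zipping NumPy arrays yields NumPy integer scalars, so converting first may be preferable if the consumer expects plain Python ints.

```python
import numpy as np

a = np.array([[0, 1],
              [1, 1],
              [2, 1],
              [0, 1],
              [1, 1],
              [2, 1]])

# Variant 1: transpose, then zip the two rows of the transposed array.
at = a.T
pairs_from_transpose = list(zip(at[0], at[1]))

# Variant 2: zip the two columns directly, without the explicit transpose.
pairs_from_columns = list(zip(a[:, 0], a[:, 1]))

# Assumed extra step (not in the answer): convert each column to a list of
# plain Python ints before zipping, so the tuples hold int, not NumPy scalars.
pairs_python_ints = list(zip(a[:, 0].tolist(), a[:, 1].tolist()))

print(pairs_python_ints)
```

Which variant is fastest depends on the array size and NumPy version, so it is worth timing them on the real data set rather than relying on this sketch.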

SerialDev

You can use plain Python to get a list (or any other container) of tuples:

list(map(tuple, arr))

where map() simply applies tuple() to each row of the array in turn.
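
As a small self-contained sketch of this approach (the sample array and the names row_tuples and plain_tuples are just for illustration): iterating over a 2-D array yields its rows, so tuple() is called once per row. The second form, which goes through arr.tolist() first, is my own variation and not part of the answer; it produces tuples of plain Python ints instead of NumPy scalars.

```python
import numpy as np

arr = np.array([[0, 1],
                [1, 1],
                [2, 1]])

# map() iterates over the rows of arr and applies tuple() to each one.
# The elements inside each tuple are NumPy integer scalars.
row_tuples = list(map(tuple, arr))

# Assumed variation: convert to nested Python lists first, so that the
# resulting tuples contain ordinary Python ints.
plain_tuples = list(map(tuple, arr.tolist()))

print(plain_tuples)  # [(0, 1), (1, 1), (2, 1)]
```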

Eric O. Lebigot