I have a list of lists (each sublist of the same length) of tuples (each tuple of the same length, 2). Each sublist represents a sentence, and the tuples are bigrams of that sentence.
When using np.asarray to turn this into an array, NumPy seems to interpret the tuples as a request for a third dimension.
Full working code here:
import numpy as np
from nltk import bigrams
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
bi_grams = []
for sent in arr:
    bi_grams.append(list(bigrams(sent)))
bi_grams = np.asarray(bi_grams)
print(bi_grams)
So before turning bi_grams into an array, it looks like this: [[(1, 2), (2, 3)], [(4, 5), (5, 6)], [(7, 8), (8, 9)]]
Output of above code:
array([[[1, 2],
        [2, 3]],

       [[4, 5],
        [5, 6]],

       [[7, 8],
        [8, 9]]])
Converting a list of lists to an array in this way is normally fine and creates a 2D array, but NumPy seems to interpret the tuples as an added dimension, so the output has shape (3, 2, 2), when I want, and was expecting, a shape of (3, 2).
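To make the shape issue easy to reproduce without nltk, here is a minimal sketch that writes the nested list out by hand. The names `nested` and `as_array` are made up for this illustration and stand in for bi_grams before and after conversion.

```python
import numpy as np

# Same nested structure as bi_grams before conversion, typed out by hand
# (hypothetical name; avoids the nltk dependency).
nested = [[(1, 2), (2, 3)], [(4, 5), (5, 6)], [(7, 8), (8, 9)]]

# np.asarray treats tuples like any other sequence, so each tuple
# becomes one more axis of a numeric array.
as_array = np.asarray(nested)

print(as_array.shape)  # (3, 2, 2)
print(as_array.ndim)   # 3
```

So the tuples are not kept as elements; their contents are unpacked into a third axis of integers.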
The output I want is:
array([[(1, 2), (2, 3)],
       [(4, 5), (5, 6)],
       [(7, 8), (8, 9)]])
which is of shape (3, 2).
Why does this happen? How can I achieve the array in the form/shape that I want?
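One workaround that seems to give the (3, 2) shape, assuming an object-dtype array is acceptable, is to pre-allocate the array and fill it element by element. This is only a sketch: `nested` and `out` are names made up for illustration, and it may not be the idiomatic approach.

```python
import numpy as np

# Hand-written stand-in for bi_grams before conversion (hypothetical name).
nested = [[(1, 2), (2, 3)], [(4, 5), (5, 6)], [(7, 8), (8, 9)]]

# Pre-allocate with dtype=object so each cell can hold one Python object.
out = np.empty((len(nested), len(nested[0])), dtype=object)

# Assign one cell at a time so each tuple is stored whole
# instead of being unpacked into an extra axis.
for i, row in enumerate(nested):
    for j, pair in enumerate(row):
        out[i, j] = pair

print(out.shape)  # (3, 2)
print(out[0, 1])  # (2, 3)
```

The trade-off is that an object array stores references to Python tuples, so vectorised numeric operations on the contents are no longer available.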