data = [['297348640', 'Y', '12', 'Y'],
       ['300737722','Y', '1', 'Y'],
       ['300074407', 'Y',  '1', 'N']]

I want to convert this into a NumPy array so I did:

data = np.array(data)

The above is working fine.

Now I have two lists, say

a = [0,2,6]
b = [21,21,9]

I want to append these as new columns at the end of my previous array, so that it looks like this:

data = [['297348640', 'Y', '12', 'Y', 0, 21],
       ['300737722','Y', '1', 'Y', 2, 21],
       ['300074407', 'Y',  '1', 'N', 6, 9]]

I tried this, but it gives me a dimension error:

a = np.array([a])
b = np.array([b])

data = np.hstack(data,(a))
data = np.hstack(data,(b))

ValueError: all the input arrays must have same number of dimensions 
cs95
user3560077
  • `hstack(data, (a))` did not give you that error message. It complains about the number of arguments, not the dimensions. – hpaulj Jan 14 '18 at 20:52
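The distinction raised in this comment can be checked with a small sketch. It assumes a is a plain 1-D array (the question wraps it in an extra list, which gives a different shape), and the variable names first_error and second_error are only illustrative.

```python
import numpy as np

data = np.array([['297348640', 'Y', '12', 'Y'],
                 ['300737722', 'Y', '1', 'Y'],
                 ['300074407', 'Y', '1', 'N']])
a = np.array([0, 2, 6])  # assumption: 1-D, shape (3,)

first_error = None
second_error = None

# Two positional arguments: np.hstack expects one sequence of arrays,
# so this fails on the argument count before any shapes are compared.
try:
    np.hstack(data, a)
except TypeError as exc:
    first_error = type(exc).__name__

# One tuple argument, but a 2-D array next to a 1-D array:
# this is the case that fails on the number of dimensions.
try:
    np.hstack((data, a))
except ValueError as exc:
    second_error = type(exc).__name__

print(first_error, second_error)
```

Under these assumptions the first call should fail with a TypeError and the second with a ValueError, which suggests the traceback in the question came from a call that already passed a tuple.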

2 Answers


Similar to @cᴏʟᴅsᴘᴇᴇᴅ's solution, but instead of passing dtype=object, you can be more explicit by passing data's dtype:

import numpy as np

data = np.array([['297348640', 'Y', '12', 'Y'],
                 ['300737722', 'Y', '1', 'Y'],
                 ['300074407', 'Y', '1', 'N']])

a = [0,2,6]
b = [21,21,9]

a = np.array(a, dtype=data.dtype)
b = np.array(b, dtype=data.dtype)

data = np.hstack((data, a[:, None], b[:, None]))
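One consequence of reusing data's dtype is worth seeing directly: the appended values are stored as strings, not integers. The sketch below repeats the snippet above with a separate result variable (an illustrative name) and inspects what comes out.

```python
import numpy as np

data = np.array([['297348640', 'Y', '12', 'Y'],
                 ['300737722', 'Y', '1', 'Y'],
                 ['300074407', 'Y', '1', 'N']])

# Reusing data's (string) dtype converts the integers to strings.
a = np.array([0, 2, 6], dtype=data.dtype)
b = np.array([21, 21, 9], dtype=data.dtype)

result = np.hstack((data, a[:, None], b[:, None]))

print(result.shape)       # (3, 6)
print(result.dtype.kind)  # 'U', i.e. a Unicode string array
print(repr(result[0, 4]))
```

If the new columns need to stay numeric, an object array (as in the other answer) is probably the better fit.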

The first argument to np.hstack is a sequence of arrays. In your code, np.hstack(data,(a)) is parsed as two separate arguments, because (a) is just a, not a tuple. Adding an extra set of parentheses brings data and a (and b) into one sequence (a tuple).

As for the indexing, see "In numpy, what does selection by [:,None] do?". In short, a[:, None] adds a new axis, turning the 1-D array into a column; this mimics np.reshape().
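A minimal sketch of that equivalence, using illustrative variable names (col_index, col_reshape) that are not part of the answer:

```python
import numpy as np

a = np.array([0, 2, 6])

col_index = a[:, None]          # None is an alias for np.newaxis
col_reshape = a.reshape(-1, 1)  # -1 lets NumPy infer the row count

print(a.shape)          # (3,)
print(col_index.shape)  # (3, 1)
print(np.array_equal(col_index, col_reshape))  # True
```

Both forms give a (3, 1) column that lines up with the three rows of data.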

Brad Solomon

Your current data array is an array of strings, which means that column stacking integer columns will result in those being coerced to strings.

So, to prevent that, convert data to object dtype with astype(object) during the column stacking.

Furthermore, the dimensions of a and b must match those of data. For column stacking, they need to be column vectors of the same height as data. For that, you can use np.reshape.

Finally, for the stacking, np.hstack/np.column_stack/np.concatenate all work.

np.concatenate(
   (data.astype(object), np.reshape(a, (-1, 1)), np.reshape(b, (-1, 1))), 
   axis=1
)

Or,

np.column_stack(
   (data.astype(object), np.reshape(a, (-1, 1)), np.reshape(b, (-1, 1)))
)

Or,

np.hstack(
   (data.astype(object), np.reshape(a, (-1, 1)), np.reshape(b, (-1, 1)))
)

array([['297348640', 'Y', '12', 'Y', 0, 21],
       ['300737722', 'Y', '1', 'Y', 2, 21],
       ['300074407', 'Y', '1', 'N', 6, 9]], dtype=object)
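As a possible shortcut, np.column_stack turns 1-D inputs into columns on its own, so the explicit reshape can be dropped in that variant. The sketch below is one way to check this, and also that the object array keeps the original Python types; result is an illustrative name.

```python
import numpy as np

data = np.array([['297348640', 'Y', '12', 'Y'],
                 ['300737722', 'Y', '1', 'Y'],
                 ['300074407', 'Y', '1', 'N']])
a = [0, 2, 6]
b = [21, 21, 9]

# column_stack converts each 1-D input into a column before stacking.
result = np.column_stack((data.astype(object), a, b))

print(result.dtype)        # object
print(result.tolist()[0])  # ['297348640', 'Y', '12', 'Y', 0, 21]
print(type(result[0, 0]).__name__, type(result[0, 4]).__name__)
```

The trade-off is that an object array holds arbitrary Python objects, so vectorised numeric operations on the new columns would first need a conversion such as result[:, 4].astype(int).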
cs95