
I have some lists such as

list1 = ['hi',2,3,4]
list2 = ['hello', 7,1,8]
list3 = ['morning',7,2,1]

Where 'hi', 'hello' and 'morning' are strings, while the rest are numbers.

However, when I try to stack them up as:

matrix = np.vstack((list1,list2,list3))

the numbers are converted to strings. In particular, they become numpy.str_.

How do I solve this? I tried replacing the items and I tried changing their type, but nothing works.

edit

I made a mistake above! In my original problem, the first list is actually a list of headings, so for example

list1 = ['hi', 'number of hours', 'number of days', 'ideas']

So the first column (in the vertically stacked array) is a column of strings. The other columns have a string as their first element and then numbers.

Euler_Salter
  • To mix strings and integers in an array, use a structured array or object array as demonstrated in the recent https://stackoverflow.com/q/44831502 – hpaulj Jun 30 '17 at 13:35
  • `np.vstack` passes each input list through `np.atleast_2d` which in turn uses `np.array`. Look at `np.array(list1)`. – hpaulj Jun 30 '17 at 16:31

2 Answers


You could use a pandas DataFrame, which allows heterogeneous data:

>>> import pandas
>>> pandas.DataFrame([list1, list2, list3])

         0  1  2  3
0       hi  2  3  4
1    hello  7  1  8
2  morning  7  2  1

If you want to name the columns, you can do that too by passing the list of headings (called list0 here) as columns:

pandas.DataFrame([list1, list2, list3], columns=list0)

        hi  nb_hours  nb_days  ideas
0       hi         2        3      4
1    hello         7        1      8
2  morning         7        2      1
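
To check that the numbers really stay numeric, here is a minimal self-contained sketch. It assumes the heading list is called list0 with the shortened names used above, and the variable names df and numbers are only illustrative.

```python
import pandas as pd

list0 = ['hi', 'nb_hours', 'nb_days', 'ideas']  # heading row (assumed names)
list1 = ['hi', 2, 3, 4]
list2 = ['hello', 7, 1, 8]
list3 = ['morning', 7, 2, 1]

df = pd.DataFrame([list1, list2, list3], columns=list0)

# Each column has its own dtype: the first holds strings,
# the other three are inferred as integers.
print(df.dtypes)

# The numeric columns can be pulled out as a plain integer NumPy array.
numbers = df[list0[1:]].to_numpy()
print(numbers)
```

The headings and the first column stay available as labels and strings, while the numeric block can still be handed to NumPy when needed.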
Johannes

A NumPy array has a single type for all its elements. Since numbers can be written as strings, but strings cannot in general be written as numbers, your matrix will have all its elements of type string.
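
A quick way to see this conversion, as a small sketch using one of the lists from the question (the name row is only illustrative):

```python
import numpy as np

list2 = ['hello', 7, 1, 8]

# np.array picks one common dtype for the whole list; with a string
# present, that common dtype is a Unicode string type.
row = np.array(list2)

print(row.dtype.kind)  # 'U' means Unicode string
print(type(row[1]))    # the 7 is now a NumPy string, not an integer
```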

If you want to have a matrix of integers, you can:

1- Extract the submatrix corresponding to your numbers and then convert it to integers

2- Or directly extract only the numbers from your lists and stack them.

import numpy as np
list1 = ['hi',2,3,4]
list2 = ['hello', 7,1,8]
list3 = ['morning',7,2,1]

matrix = np.vstack((list1,list2,list3))

# First: drop the string column, then convert the rest to integers
m = matrix[:, 1:].astype(np.int32)
# [[2 3 4] [7 1 8] [7 2 1]]

# Second
m = np.vstack((list1[1:],list2[1:],list3[1:]))
# [[2 3 4] [7 1 8] [7 2 1]]

edit (Answer to comment)

I'll call the title list list0:

list0 = ['hi', 'nb_hours', 'nb_days', 'ideas']

It's basically the same idea:

1- Stack everything, then extract the submatrix (here we take neither the first row nor the first column: [1:,1:])

matrix = np.vstack((list0,list1,list2,list3))
matrix_nb = matrix[1:,1:].astype(np.int32)

2- Don't stack list0 at all, and stack the other lists without their first element ([1:]):

m = np.vstack((list1[1:],list2[1:],list3[1:]))
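
If the headings have to live in the same array as the numbers, one option raised in the comments on the question is an object array. The following is only a sketch of that idea, not a recommendation. The names table and numbers are illustrative.

```python
import numpy as np

list0 = ['hi', 'nb_hours', 'nb_days', 'ideas']  # heading row
list1 = ['hi', 2, 3, 4]
list2 = ['hello', 7, 1, 8]
list3 = ['morning', 7, 2, 1]

# dtype=object stores references to the original Python objects,
# so nothing is converted to a string.
table = np.array([list0, list1, list2, list3], dtype=object)

print(type(table[0, 1]))  # heading is still a Python str
print(type(table[1, 1]))  # number is still a Python int

# The numeric block can be converted to a real integer array when needed.
numbers = table[1:, 1:].astype(np.int32)
print(numbers)
```

Object arrays give up most of NumPy's speed, so this mainly makes sense when the table is small and is used for lookup rather than computation.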
Nuageux
  • @Nuageoux what can I do if the first list, `list1`, is a list of "headings"? For example `list1 = ['hi', 'number of hours', 'number of days', 'ideas']`? – Euler_Salter Jun 30 '17 at 12:59
  • @Euler_Salter I edited my answer with the *headings* as `list0`. Outputs are the same – Nuageux Jun 30 '17 at 13:04
  • Thank you! However, I think this goes around the problem without solving it. I mean, I extract what I need and that is fine. However, the headings of the columns and of the rows were important. I wanted this array to be something like a table. Would you say there is another way to create the table given this? – Euler_Salter Jun 30 '17 at 13:06
  • Also, this doesn't solve the problem, since they are still strings, if you go and check! – Euler_Salter Jun 30 '17 at 13:07
  • If it's still important, you can keep the headers and the first elements of your lists as lists of strings and match them by index to the integer numpy.array. Or you can keep your whole table as strings. It is not possible to have a mix of them (with numpy.array). Otherwise, you can look at Pandas. I don't see why you say it is still strings. It isn't, I checked my examples. – Nuageux Jun 30 '17 at 14:09