
I have a python list:

[[([1, 20230112060000], [10000, 20230112060000]),
  ([1, 20230108060000], [7000, 20230109060000]),
  ([3, 20221229060000], [6929, 20221229060000]),
  ([1, 20221227060000], [3900, 20221227060000]),
  ([1, 20221226060000], [6500, 20221226060000]),
  ([1, 20221221060000], [4400, 20221222060000]),
  ([1, 20221216060000], [3888, 20221216060000]),
  ([1, 20221205060000], [5998, 20221205060000]),
  ([1, 20221128060000], [5000, 20221128060000]),
  ([1, 20221127060000], [5000, 20221127060000]),
  ([1, 20221123060000], [5666, 20221123060000]),
  ([1, 20221122060000], [6000, 20221122060000]),
  ([1, 20221120060000], [4300, 20221120060000]),
  ([1, 20221118060000], [4998, 20221118060000]),
  ([1, 20221028050000], [2700, 20221028050000]),
  ([1, 20221027050000], [5000, 20221027050000]),
  ([1, 20221022050000], [4300, 20221022050000]),
  ([1, 20221019050000], [4498, 20221019050000]),
  ([1, 20221018050000], [3500, 20221018050000]),
  ([2, 20221015050000], [3899, 20221015050000]),
  ([1, 20221011050000], [4500, 20221011050000]),
  ([2, 20221008050000], [4850, 20221008050000]),
  ([2, 20221007050000], [5898, 20221007050000]),
  ([1, 20221004050000], [7499, 20221004050000]),
  ([1, 20221001050000], [3400, 20221001050000]),
...
 [],
 [([2, 20230206060000], [357500, 20230206060000])],
 [([2, 20230206060000], [357500, 20230206060000]),
  ([6, 20230205060000], [353833, 20230205060000])],
 ...]

But when I try to convert it to a NumPy array something weird happens:

import numpy as np
a = [...] # the above list
b = np.array(a)

b:

array([list([([1, 20230112060000], [10000, 20230112060000]), ([1, 20230108060000], [7000, 20230109060000]), ([3, 20221229060000], [6929, 20221229060000]), ([1, 20221227060000], [3900, 20221227060000]), ([1, 20221226060000], [6500, 20221226060000]), ([1, 20221221060000], [4400, 20221222060000]), ([1, 20221216060000], [3888, 20221216060000]), ([1, 20221205060000], [5998, 20221205060000]), ([1, 20221128060000], [5000, 20221128060000]), ([1, 20221127060000], [5000, 20221127060000]), ([1, 20221123060000], [5666, 20221123060000]), ([1, 20221122060000], [6000, 20221122060000]), ([1, 20221120060000], [4300, 20221120060000]), ([1, 20221118060000], [4998, 20221118060000]), ([1, 20221028050000], [2700, 20221028050000]), ([1, 20221027050000], [5000, 20221027050000]), ([1, 20221022050000], [4300, 20221022050000]), ([1, 20221019050000], [4498, 20221019050000]), ([1, 20221018050000], [3500, 20221018050000]), ([2, 20221015050000], [3899, 20221015050000]), ([1, 20221011050000], [4500, 20221011050000]), ([2, 20221008050000], [4850, 20221008050000]), ([2, 20221007050000], [5898, 20221007050000]), ([1, 20221004050000], [7499, 20221004050000]), ([1, 20221001050000], [3400, 20221001050000]), ([1, 20220928050000], [5000, 20220929050000]), ([1, 20220926050000], [3000, 20220926050000]), ([1, 20220925050000], [4500, 20220925050000]), ([1, 20220922050000], [4000, 20220922050000]), ([1, 20220920050000], [5000, 20220920050000]), ([1, 20220916050000], [8000, 20220916050000]), ([2, 20220915050000], [6625, 20220915050000]), ([2, 20220914050000], [4500, 20220914050000]), ([1, 20220903050000], [10000, 20220903050000]), ([1, 20220821050000], [8600, 20220821050000]), ([2, 20220820050000], [37500, 20220820050000]), ([1, 20220819050000], [30000, 20220819050000]), ([2, 20220818050000], [13999, 20220818050000]), ([1, 20220816050000], [4000, 20220817050000]), ([1, 20220815050000], [4000, 20220815050000])]),
       list([]), list([([1, 20230112060000], [10000, 20230112060000])]),
       ...,
       list([([1, 20230123060000], [5745, 20230123060000]), ([1, 20230105060000], [13000, 20230105060000]), ([1, 20221228060000], [6000, 20221228060000]), ([2, 20221227060000], [6000, 20221227060000]), ([1, 20221222060000], [8571, 20221222060000]), ([1, 20221218060000], [8250, 20221218060000]), ([1, 20221216060000], [8000, 20221216060000]), ([1, 20221213060000], [7500, 20221213060000]), ([1, 20221210060000], [3500, 20221210060000]), ([1, 20221109060000], [6500, 20221109060000])]),
       list([([1, 20230123060000], [5745, 20230123060000]), ([1, 20230105060000], [13000, 20230105060000]), ([1, 20221228060000], [6000, 20221228060000]), ([2, 20221227060000], [6000, 20221227060000]), ([1, 20221222060000], [8571, 20221222060000]), ([1, 20221218060000], [8250, 20221218060000]), ([1, 20221216060000], [8000, 20221216060000]), ([1, 20221213060000], [7500, 20221213060000]), ([1, 20221210060000], [3500, 20221210060000]), ([1, 20221109060000], [6500, 20221109060000]), ([1, 20220909050000], [9999, 20220909050000])]),
       list([([1, 20230123060000], [5745, 20230123060000]), ([1, 20230105060000], [13000, 20230105060000]), ([1, 20221228060000], [6000, 20221228060000]), ([2, 20221227060000], [6000, 20221227060000]), ([1, 20221222060000], [8571, 20221222060000]), ([1, 20221218060000], [8250, 20221218060000]), ([1, 20221216060000], [8000, 20221216060000]), ([1, 20221213060000], [7500, 20221213060000]), ([1, 20221210060000], [3500, 20221210060000]), ([1, 20221109060000], [6500, 20221109060000]), ([1, 20220909050000], [9999, 20220909050000]), ([1, 20220901050000], [8444, 20220901050000])])],
      dtype=object)

For some reason, the tuples and the lists are not converted properly. Because of this, b does not behave like a normal NumPy array: every item is a Python object. I know I could go through and convert all of the tuples to lists, but is there a way to force NumPy to convert everything properly?

By the way, by converted properly I mean instead of:

array([list([()])])

It should be converted like:

array([[[]]])
catasaurus
  • Does this answer your question? [List of lists into numpy array](https://stackoverflow.com/questions/10346336/list-of-lists-into-numpy-array) – Amin S Feb 13 '23 at 03:21
  • "For some reason, the tuples and the lists are not converted properly." What do you think should be the result instead? **Why**? Can you demonstrate the problem with a smaller input? Please read [ask] and [mre]. – Karl Knechtel Feb 13 '23 at 03:23
  • I think the problem might be because not all of your list are the same size e.g. `np.array([[1, 2], [3, 4]])` is fine, but `np.array([[1, 2], [3, 4, 5]])` caused `Creating an ndarray from ragged nested sequences` warning. Either filter or add default value to the list such that every element is the same size first (The problem is `list([])` part - probably) – Wakeme UpNow Feb 13 '23 at 03:28
  • Your input appears to be jagged. NumPy doesn't do jagged arrays - strict shape requirements are crucial to NumPy's efficiency. – user2357112 Feb 13 '23 at 03:36
  • Oh, so I have to pad the individual arrays so they are all the same length? – catasaurus Feb 13 '23 at 03:37
  • That would be one way to handle things. Whether it's appropriate for your use case, we can't tell. – user2357112 Feb 13 '23 at 03:38
  • @catasaurus Yes, you have to. Or to remove them from the list. Or to shorten all the others... Numpy array are hyper-rectangular (meaning that they have a regular size in all dimensions). – chrslg Feb 13 '23 at 03:38
  • That `list([])` stands out like a sore thumb! It isn't the mix of tuples versus lists, it's their size that varies. – hpaulj Feb 13 '23 at 03:57

1 Answer

As Wakeme UpNow correctly pointed out, your issue is that your lists are not all the same size. The key point to understand when working with NumPy is that it gets its performance from making some assumptions about your data, i.e.:

    1. It is numeric
    2. It is all of the same type
    3. All the sub-arrays are of the same length

If you break one of these assumptions, you lose the gains you would otherwise get from NumPy, as it falls back to purely Pythonic behavior (i.e., dtype=object).
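To make the fallback concrete, here is a minimal sketch using made-up values in the same shape as your data. The names `regular` and `ragged` are just illustrative. Note that `dtype=object` is passed explicitly for the ragged case, because newer NumPy releases raise an error for ragged input instead of silently building an object array.

```python
import numpy as np

# Every group has the same length, so NumPy builds a real 4-D integer array.
# Tuples and lists are both treated as plain sequences here.
regular = [[([1, 10], [2, 20])], [([3, 30], [4, 40])]]
arr = np.array(regular)
print(arr.shape)  # (2, 1, 2, 2)

# One group is empty, so the input is ragged. Asking for dtype=object
# yields a 1-D array whose items are ordinary Python lists.
ragged = [[([1, 10], [2, 20])], []]
obj = np.array(ragged, dtype=object)
print(obj.shape, obj.dtype)  # (2,) object
```

The mix of tuples and lists makes no difference in the first case; only the differing lengths in the second case prevent a regular array.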

In-depth NumPy discussion may be found here.

So one way to fix your issue is to use numeric values of the same data type, in arrays of the same length.
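As a rough sketch of the padding route, assuming every entry is a pair of two 2-item integer lists as in your sample: `pad_to_rectangular` and the fill value `-1` are hypothetical choices of mine, not anything NumPy provides, and whether padding is appropriate depends on what you do with the data afterwards.

```python
import numpy as np

def pad_to_rectangular(groups, fill=-1):
    """Hypothetical helper: pad every group to the longest group's length.

    Assumes each entry looks like ([a, b], [c, d]) with integer values.
    """
    max_len = max((len(g) for g in groups), default=0)
    # Pre-fill the whole array with the sentinel, then copy real data in.
    out = np.full((len(groups), max_len, 2, 2), fill, dtype=np.int64)
    for i, group in enumerate(groups):
        if group:  # empty groups stay entirely filled with the sentinel
            out[i, :len(group)] = np.array(group, dtype=np.int64)
    return out

a = [
    [([1, 20230112060000], [10000, 20230112060000]),
     ([1, 20230108060000], [7000, 20230109060000])],
    [],
    [([2, 20230206060000], [357500, 20230206060000])],
]
b = pad_to_rectangular(a)
print(b.shape)  # (3, 2, 2, 2)
print(b.dtype)  # int64
```

Rows holding the fill value are padding, so any later computation would have to mask or skip them.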

Cheers

Michael