
If `obj` is a list of "ragged nested sequences", like `obj = [1, 2, [3, 4]]`, then `np.array(obj)` produces a warning: "Creating an ndarray from ragged nested sequences ... is deprecated..."

How can I check in advance (without triggering the warning) whether a given value counts as "ragged nested sequences"?

royk
  • `np.array(obj, dtype=object)` will silence the warning. You can then check the `shape`. `np.array(obj, dtype=int)` will raise an error if it is ragged. Basically it's "ragged" if `np.array` can't make a 'clean' numeric dtype array. – hpaulj Jun 29 '22 at 00:01
  • thanks. though at the end what i am looking for is a function that returns True/False indicating if obj is ragged or not, without issuing a warning. – royk Jun 29 '22 at 00:25
  • There are cases where it will raise an error regardless, such as when the first dimension(s) match but trailing ones differ. The key thing is that `np.array` tries first to make a multidimensional numeric array. The ragged array (and warning) is a fallback option. Raggedness is something you should seek to avoid when constructing the nested list. – hpaulj Jun 29 '22 at 00:39
  • Specific [warnings](https://stackoverflow.com/questions/15933741/how-do-i-catch-a-numpy-warning-like-its-an-exception-not-just-for-testing) can be promoted to errors and handled with `try`, `except` blocks. This would use `np.array` inside `try` to raise the warning, but will stop working after the deprecation is removed. – Michael Szczesny Jun 29 '22 at 01:30
  • yes that will work - thanks! – royk Jun 29 '22 at 07:32
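The try/except idea from the comments can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function name `is_ragged_by_trial` is hypothetical, and it assumes that promoting every warning to an error inside the `with` block is acceptable. It also catches `ValueError`, since newer NumPy releases (1.24 and later, as far as I know) raise that instead of warning once the deprecation expired.

```python
import warnings

import numpy as np


def is_ragged_by_trial(obj):
    """Hypothetical helper: try to build the array and see if NumPy objects."""
    with warnings.catch_warnings():
        # Promote all warnings to exceptions, but only inside this block.
        warnings.simplefilter("error")
        try:
            np.array(obj)
        except (Warning, ValueError):
            # Older NumPy: the ragged-sequence deprecation warning.
            # Newer NumPy: a ValueError about an inhomogeneous shape.
            return True
    return False


print(is_ragged_by_trial([1, 2, [3, 4]]))
print(is_ragged_by_trial([[1, 2], [3, 4]]))
```

Note that this still constructs the array, so it is slower than a pure shape check on large inputs, and any unrelated warning or `ValueError` raised by `np.array` would also be reported as ragged.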

1 Answer


Here's one way you could do it. The function `get_shape(a)` is a recursive function that returns either the shape of `a` as a tuple, or `None` if `a` does not have a regular shape (i.e. `a` is ragged). The tricky part is actually the function `is_scalar(a)`: we want string and bytes instances, and arbitrary noniterable objects (such as `None`, or `Foo()` where `Foo` is `class Foo: pass`), to be considered scalars. `np.isscalar()` does some of the work; an attempt to evaluate `len(a)` does the rest. The NumPy docs suggest the simple expression `np.ndim(a) == 0`, but that will invoke the array creation for `a`, which will trigger the warning if `a` is ragged. (The `is_scalar` function might miss some cases, so test it carefully with typical data that you use.)

```python
import numpy as np


def is_scalar(a):
    # Strings, bytes, and numeric scalars are caught by np.isscalar().
    if np.isscalar(a):
        return True
    # Anything else without a length (None, plain objects, ...) is
    # also treated as a scalar.
    try:
        len(a)
    except TypeError:
        return True
    return False


def get_shape(a):
    """
    Returns the shape of `a`, if `a` has a regular array-like shape.

    Otherwise returns None.
    """
    if is_scalar(a):
        return ()
    shapes = [get_shape(item) for item in a]
    if len(shapes) == 0:
        return (0,)
    # A ragged item makes the whole sequence ragged.
    if any(shape is None for shape in shapes):
        return None
    # All items must share the same shape.
    if not all(shapes[0] == shape for shape in shapes[1:]):
        return None
    return (len(shapes),) + shapes[0]


def is_ragged(a):
    return get_shape(a) is None
```

For example,

```python
In [114]: is_ragged(123)
Out[114]: False

In [115]: is_ragged([1, 2, 3])
Out[115]: False

In [116]: is_ragged([1, 2, [3, 4]])
Out[116]: True

In [117]: is_ragged([[[1]], [[2]], [[3]]])
Out[117]: False

In [118]: is_ragged([[[1]], [[2]], [[3, 99]]])
Out[118]: True
```
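Following the advice to test `is_scalar` on typical data, a few quick checks along these lines may help. This is only a sketch: the two functions are repeated in condensed form so the block runs on its own, and the inputs are illustrative choices of mine, not cases from the answer.

```python
import numpy as np


def is_scalar(a):
    # Same logic as above, condensed.
    if np.isscalar(a):
        return True
    try:
        len(a)
    except TypeError:
        return True
    return False


def get_shape(a):
    if is_scalar(a):
        return ()
    shapes = [get_shape(item) for item in a]
    if not shapes:
        return (0,)
    # Ragged if any item is ragged or differs in shape from the first.
    if any(s is None or s != shapes[0] for s in shapes):
        return None
    return (len(shapes),) + shapes[0]


# Illustrative inputs (assumptions, chosen to probe edge cases).
print(get_shape("abc"))             # ()      strings count as scalars
print(get_shape(["ab", "cde"]))     # (2,)    string lengths are ignored
print(get_shape([None, 1.5]))       # (2,)    None has no len(), so scalar
print(get_shape([[1, 2], (3, 4)]))  # (2, 2)  lists and tuples mix freely
print(get_shape([[], []]))          # (2, 0)
print(get_shape([[], [1]]))         # None    ragged
print(get_shape({"a": 1}))          # (1,)    a dict is iterated by key
```

The last case is one where the result may not match what you expect from NumPy, which treats a dict as a single object; if your data can contain dicts or other sized containers that should count as scalars, `is_scalar` would need an extra check.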
Warren Weckesser