Indexing nested lists in Python

Question

The core problem: I have paths saved as a triple-nested list and I fail to index them properly.

Here is a minimal example

l1 = []
l11 = [[1], [1, 2], [1, 2, 3], [1, 2, 3, 4]]
l12 = [[5], [5, 6], [5, 6, 7], [5, 6, 7, 8]]
l1.append(l11)
l1.append(l12)

# Goal:
l_target = [[[1], [1, 2], [1, 2, 3]], [[5], [5, 6], [5, 6, 7]]]

print(l1[:][:4] == l_target)

Explaining the list in question: The first dimension is the number of paths. Each path is a list that represents the values at the different timesteps. Each timestep is itself a list, and every timestep has a different length, which is the reason I have a triple-nested list and not a matrix.

Since this is very confusing here is an example:

[image: example]

For this example my val paths are saved in h1 (usually several hundred; I reduced it to 2 for this example). I have expanded the first path, and you can see that at time k it consists of all values up to time k.

The problem is that I want a single line of code that removes all timesteps after k for all paths.

To me it seemed very simple, as "[:][:k]" should do it; however, it doesn't change the list at all (see h2 == h1). As something like [:, :k] doesn't work either, I am quite confused.

Start by taking a step back: `l1[:]` is just the same as `l1`. So `l1[:][:k]` is the same as `l1[:k]`, which just means take the first k lists in the first dimension. Note for example that `l1[:][:1]` is `[[[1], [1, 2], [1, 2, 3], [1, 2, 3, 4]]]`. The array slice only considers the first dimension of arrays. You might find [this question](https://stackoverflow.com/questions/10625096/extracting-first-n-columns-of-a-numpy-matrix) useful instead. — Gamma032, Dec 04 '21 at 12:04
Why are `[1, 2, 3, 4]` and `[5, 6, 7, 8]` missing in l_target? They're the fourth elements in `l11` and `l12`. `l11[:4]` and `l12[:4]` both contain the entire lists. — ilikeMUDs, Dec 04 '21 at 12:04
If I understand correctly, you would like to take all possible elementary lists inside l1 and remove all those of size larger than k? — Patrik Staron, Dec 04 '21 at 12:12

Reti43 · Answer 1 · 2021-12-04T12:32:50.463

l1[:,:k] is syntax that works for numpy, not standard lists.

l1[:] just returns the whole list as is (actually a shallow copy of it). So l1[:][:4] is like doing l1[:4], which takes the first four paths. With only two paths in the example, that returns everything, which is why the list looks unchanged. This does not slice the timeseries of each path, for which you need a list comprehension.

>>> [path[:3] for path in l1]
[[[1], [1, 2], [1, 2, 3]], [[5], [5, 6], [5, 6, 7]]]
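To make the difference concrete, here is a small sketch built on the question's data. The helper names `truncate_paths` and `truncate_paths_in_place` are made up for illustration and are not part of any library. The in-place variant is an assumption about what "remove all timesteps after k" might mean if the original list should be modified rather than copied.

```python
l1 = [
    [[1], [1, 2], [1, 2, 3], [1, 2, 3, 4]],
    [[5], [5, 6], [5, 6, 7], [5, 6, 7, 8]],
]
k = 3

# l1[:] is a shallow copy of the outer list, so slicing it again
# still only acts on the outer (path) dimension.
# There are only 2 paths, so taking the first 3 keeps both of them.
assert l1[:][:k] == l1[:k] == l1


def truncate_paths(paths, k):
    """Return a new list with every path cut to its first k timesteps."""
    return [path[:k] for path in paths]


def truncate_paths_in_place(paths, k):
    """Drop every timestep after the first k from each path, in place."""
    for path in paths:
        del path[k:]


truncated = truncate_paths(l1, k)
print(truncated)
```

`truncate_paths` leaves `l1` untouched, while `truncate_paths_in_place` shortens the inner lists of the object that is passed in, which may matter if memory is the main concern.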

I have to question your structure though. If a path, for example a -> b -> c -> d -> e, is represented at each timestep by where you've been, instead of having the list of lists [[a], [a, b], [a, b, c], ...], just have [a, b, c, d, e] and by slicing that you can get your subpath at any timestep. Then, if each path has the same maximum length, you can put them in a numpy array and l1[:,:k] will work.
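A minimal sketch of that flatter layout, assuming each path really is just the running prefix of one sequence of values, as in the question's example. The names `flat_paths` and `prefixes` are hypothetical. The sketch uses plain lists so that it runs without numpy.

```python
# Hypothetical flat representation: one value per timestep for each path.
flat_paths = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
k = 3

# Cutting every path after k timesteps is one slice per path.
truncated = [path[:k] for path in flat_paths]
print(truncated)  # [[1, 2, 3], [5, 6, 7]]


def prefixes(path, k):
    """Rebuild the nested 'everything up to timestep i' view on demand."""
    return [path[:i] for i in range(1, k + 1)]


nested = [prefixes(path, k) for path in flat_paths]
print(nested)  # [[[1], [1, 2], [1, 2, 3]], [[5], [5, 6], [5, 6, 7]]]
```

If all paths have the same length, `np.array(flat_paths)[:, :k]` gives the same truncation in numpy. The flat layout also stores each value once instead of once per later timestep.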

Thanks for your input, I will try it out. I have a variant where at each timestep all except the last entry are sorted, so slicing doesn't work as well. I probably should have chosen a different save method, as working storage is more important than CPU time. — Oliver, Dec 04 '21 at 18:47
