I have a DataFrame that looks like this (the real one has roughly 10,000 rows):

   A      B
0  0   10.0
1  1   15.0
2  2   14.0
3  0   10.0
4  1  100.0
5  2    1.4
6  3   20.0
7  0   12.0
8  1    4.0

I want to split the data wherever column 'A' resets to 0, so that it looks like this:

   A   B
0  0  10
1  1  15
2  2  14

   A      B
0  0   10.0
1  1  100.0
2  2    1.4
3  3   20.0

   A   B
0  0  12
1  1   4

It doesn't necessarily have to be three separate DataFrames, but when I access column 'B' I want the values from each block kept separate rather than all grouped together.

  • Does a dictionary of DataFrames work for your purposes? `d = dict(tuple(df.groupby(df['A'].eq(0).cumsum())))` – Henry Ecker Nov 09 '21 at 00:20
  • You can insert a row with empty values before each row with `A = 0` – Barmar Nov 09 '21 at 00:20
  • @HenryEcker if I were to use a dictionary as you said, how would I call each set of values? – Jack Cahill Nov 09 '21 at 00:33
  • `d[1]` would be the first DataFrame. `d[2]` is the second dataframe etc. You could run the code in my comment on the sample DataFrame shown and `print(d)` to see what you end up with. – Henry Ecker Nov 09 '21 at 00:35
  • @HenryEcker That's exactly what I was looking for. Thank you! – Jack Cahill Nov 09 '21 at 00:46
  • You may also want `dfs = [x for _, x in df.groupby(df['A'].eq(0).cumsum())]` [like this answer by Anton vBR](https://stackoverflow.com/a/50866760/15497888) if you wanted a list instead of a dictionary. Which means that you'll have `dfs[0]` etc always 0 indexed. Depends on how you want to use your DataFrames. – Henry Ecker Nov 09 '21 at 00:48
  • @Henry I suggest the very same in my answer (which I somehow answered _after_ the question was closed ?!?!) –  Nov 09 '21 at 00:50
  • There's a delay in timing. It's no big deal. – Henry Ecker Nov 09 '21 at 00:51
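
For readers following the comments, here is a minimal runnable sketch of the dictionary approach they describe. The sample data is copied from the question; the names `block_id` and `d` are only illustrative, and the sketch assumes every block starts with a 0 in column 'A'.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [0, 1, 2, 0, 1, 2, 3, 0, 1],
    "B": [10.0, 15.0, 14.0, 10.0, 100.0, 1.4, 20.0, 12.0, 4.0],
})

# Each 0 in column A starts a new block, so the running count of zeros
# labels the rows of successive blocks 1, 2, 3, ...
block_id = df["A"].eq(0).cumsum()

# Dictionary keyed by block number; each value is that block's rows.
d = dict(tuple(df.groupby(block_id)))

print(d[2]["B"].tolist())  # → [10.0, 100.0, 1.4, 20.0]
```

Note that each DataFrame in `d` keeps its original row labels (3 to 6 for `d[2]` here), and the keys start at 1 because the first row already counts as a zero.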

1 Answer

Try this (partially inspired by Henry Ecker's comment):

dfs = [g.reset_index(drop=True) for _, g in df.groupby(df['A'].eq(0).cumsum())]

Test:

>>> dfs[0]
   A     B
0  0  10.0
1  1  15.0
2  2  14.0

>>> dfs[1]
   A      B
0  0   10.0
1  1  100.0
2  2    1.4
3  3   20.0

>>> dfs[2]
   A     B
0  0  12.0
1  1   4.0
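
To try this end to end, the following self-contained sketch rebuilds the sample DataFrame from the question and applies the same split. The names `block_id`, `part`, and `b_per_block` are illustrative, and the last step is an optional alternative (not part of the answer above) for cases where only the 'B' values per block are needed.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [0, 1, 2, 0, 1, 2, 3, 0, 1],
    "B": [10.0, 15.0, 14.0, 10.0, 100.0, 1.4, 20.0, 12.0, 4.0],
})

# A new block begins at every row where A equals 0.
block_id = df["A"].eq(0).cumsum()

# One DataFrame per block, each renumbered from 0.
dfs = [g.reset_index(drop=True) for _, g in df.groupby(block_id)]

for i, part in enumerate(dfs):
    print(i, part["B"].tolist())
# → 0 [10.0, 15.0, 14.0]
# → 1 [10.0, 100.0, 1.4, 20.0]
# → 2 [12.0, 4.0]

# Alternative without splitting the frame: one list of B values per block,
# indexed by block number (1, 2, 3).
b_per_block = df.groupby(block_id)["B"].apply(list)
```

The list form is 0-indexed (`dfs[0]` is the first block), while `b_per_block` is indexed by the cumulative count and so starts at 1.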