0

I want to split my dataframe into smaller dataframes based on the 'z' value. In this case that means 2 dataframes, since I only want the rows between the zeros in the 'z' column, i.e. Dataframe1: 01/10/2018 0:30 - 1/10/2018 1:20 and Dataframe2: 01/10/2018 2:00 - 1/10/2018 2:40.

How can this be done in a loop for bigger datasets, discarding the zeros and keeping only the rows in between?

Danish Zahid Malik
  • Welcome to StackOverflow. Please take the time to read this post on [how to provide a great pandas example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve), and revise your question accordingly. The tips on how to ask a good question may also be useful. – yatu Jun 12 '19 at 08:06

2 Answers

2

You can use groupby for that. Grouping on 'z' gives one DataFrame per distinct value in that column.

grouped = df.groupby('z')
dataframes = [grouped.get_group(x) for x in grouped.groups]  # list of DataFrames
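
If the goal is the runs of rows between zeros rather than one frame per distinct 'z' value, a common approach is to build a run id from a cumulative sum over the zero rows and group on that instead. The following is only a sketch under assumptions: the question's data is not reproduced here, so the `time` column and the sample values are hypothetical, and `segments` is just an illustrative name.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's dataframe.
df = pd.DataFrame({
    "time": pd.date_range("2018-10-01 00:00", periods=10, freq="10min"),
    "z": [0, 0, 3, 5, 2, 0, 0, 4, 1, 0],
})

is_zero = df["z"].eq(0)

# Every zero row bumps the counter, so all rows of one non-zero run
# share the same id (the count of zeros seen before the run).
block_id = is_zero.cumsum()

# Drop the zero rows, then collect one DataFrame per run.
segments = [seg for _, seg in df[~is_zero].groupby(block_id[~is_zero])]

for seg in segments:
    print(seg, end="\n\n")
```

With this sample, `segments` holds two frames: rows 2 to 4 and rows 7 to 8. No explicit loop over rows is needed, so the same lines apply unchanged to larger datasets.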
alec_djinn
0

Here is a sample dataset with two columns and a few rows. I split this dataframe into three new dataframes based on a condition: the remainder of Col2 divided by 3, giving one dataframe per remainder value.

from datetime import datetime, timedelta
import numpy as np
import pandas as pd

data = pd.DataFrame({'Col1':np.arange(datetime(2018,1,1),datetime(2018,1,12),timedelta(days=1)).astype(datetime),'Col2':np.arange(1,12,1)})
print('Data:')
print(data)

# Split the dataframe into three dataframes by the remainder of Col2 divided by 3:
# Col2 % 3 == 0 -> data_0
# Col2 % 3 == 1 -> data_1
# Col2 % 3 == 2 -> data_2
remainder = data['Col2'] % 3
data_0, data_1, data_2 = data[remainder == 0], data[remainder == 1], data[remainder == 2]
print('Data_0:')
print(data_0)
print('Data_1:')
print(data_1)
print('Data_2:')
print(data_2)

The generated output is:

Data:
         Col1  Col2
0  2018-01-01     1
1  2018-01-02     2
2  2018-01-03     3
3  2018-01-04     4
4  2018-01-05     5
5  2018-01-06     6
6  2018-01-07     7
7  2018-01-08     8
8  2018-01-09     9
9  2018-01-10    10
10 2018-01-11    11
Data_0:
        Col1  Col2
2 2018-01-03     3
5 2018-01-06     6
8 2018-01-09     9
Data_1:
        Col1  Col2
0 2018-01-01     1
3 2018-01-04     4
6 2018-01-07     7
9 2018-01-10    10
Data_2:
         Col1  Col2
1  2018-01-02     2
4  2018-01-05     5
7  2018-01-08     8
10 2018-01-11    11
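
When the number of conditions grows, writing one boolean mask per dataframe gets repetitive. As a rough sketch of an alternative (not part of the code above; `parts` is a hypothetical name), the same split can be expressed with a single groupby on the remainder, which returns the pieces keyed by remainder value:

```python
import pandas as pd

# Same shape as the sample data above: Col2 runs from 1 to 11.
data = pd.DataFrame({
    "Col1": pd.date_range("2018-01-01", periods=11, freq="D"),
    "Col2": range(1, 12),
})

# Group on the remainder; the dict keys are the remainders 0, 1 and 2.
parts = {rem: part for rem, part in data.groupby(data["Col2"] % 3)}

print(parts[0])  # rows where Col2 is divisible by 3
```

Here `parts[0]`, `parts[1]` and `parts[2]` correspond to `data_0`, `data_1` and `data_2`.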

Hope this helps.

Himmat