Completely new to coding and pandas.

df

   Date        Particulars  Inwards  Code
1  2017-04-01  EFG          12800    01
2  2017-07-22  ABC          100      01
3  2017-09-05  BCD          10000    01
4  2018-03-13  ABC          2000     01

I want to split this df into 3 dataframes, one per unique value in the df['Particulars'] column, i.e.

Output: df1

   Date        Particulars  Inwards  Code
2  2017-07-22  ABC          100      01
4  2018-03-13  ABC          2000     01

df2

   Date        Particulars  Inwards  Code
1  2017-04-01  EFG          12800    01

df3

   Date        Particulars  Inwards  Code
3  2017-09-05  BCD          10000    01

I have a way of doing it through:

 df1 = df1.append(df.loc[df['Particulars'] == 'ABC'], ignore_index=False)

For this I initialise a list of the Particulars values, create an empty dataframe for each one, and then run the above command in a for loop. But I am wondering whether sort or groupby would be better options, and how exactly to apply them. I tried groupby and sort but couldn't get the dataframes out.

Sid
  • In this case you can just do: `df1 = df[df['Particulars'] == 'ABC']` and so on. – pault Apr 06 '18 at 16:20
  • @pault I am trying to avoid making a list of the unique items in 'Particulars' (it's a 1000-row df), setting up an empty dataframe for each one (in a dictionary), and then looping through. :( I was hoping there was a way to split the dataframe based on the 'Particulars' column – Sid Apr 06 '18 at 16:22
  • `[y for x, y in df.groupby('Particulars')]` – BENY Apr 06 '18 at 16:24

2 Answers


You can create a dictionary of data frames by grouping your df on Particulars.

d = {key: group for key, group in df.groupby('Particulars')}

Now you can access each df using

d['ABC']

    Date        Particulars Inwards Code
2   2017-07-22  ABC         100     1
4   2018-03-13  ABC         2000    1
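
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the question's frame by hand and then applies the same grouping. The values are taken from the question; storing `Code` as the string `'01'` and parsing `Date` with `pd.to_datetime` are assumptions on my part.

```python
import pandas as pd

# Rebuild the example frame from the question (index 1..4 as shown there).
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2017-04-01", "2017-07-22", "2017-09-05", "2018-03-13"]
        ),
        "Particulars": ["EFG", "ABC", "BCD", "ABC"],
        "Inwards": [12800, 100, 10000, 2000],
        "Code": ["01", "01", "01", "01"],  # assumption: kept as strings
    },
    index=[1, 2, 3, 4],
)

# Iterating over a GroupBy object yields (key, sub-frame) pairs,
# so a dict comprehension maps each Particulars value to its rows.
d = {key: group for key, group in df.groupby("Particulars")}

print(list(d))   # the group keys
print(d["ABC"])  # only the rows where Particulars == 'ABC'
```

Each value in `d` keeps the original row labels, so `d['ABC']` still has index 2 and 4.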
Vaishali
  • Nice, I forgot about this one. One question, though. Does calling `groupby` automatically mean it's O(n log n) as in the background it creates groups by sorting? I seem to remember this was the case. – jpp Apr 06 '18 at 17:04
  • @jpp, the complexity of a loop, including a list comprehension, is O(n); I'm not sure whether the grouping adds to it. Time to dig further :) – Vaishali Apr 06 '18 at 17:11
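
On the sorting question raised in these comments: `groupby` takes a `sort` parameter (default `True`) that controls whether the group keys come back sorted. The sketch below uses a cut-down version of the question's data and only demonstrates the ordering behaviour; it makes no claim about the asymptotic cost either way.

```python
import pandas as pd

# Cut-down version of the question's frame (two columns only).
df = pd.DataFrame(
    {
        "Particulars": ["EFG", "ABC", "BCD", "ABC"],
        "Inwards": [12800, 100, 10000, 2000],
    }
)

# Default: group keys are returned in sorted order.
sorted_keys = [key for key, _ in df.groupby("Particulars")]

# sort=False: group keys are returned in order of first appearance.
unsorted_keys = [key for key, _ in df.groupby("Particulars", sort=False)]

print(sorted_keys)
print(unsorted_keys)
```

If the order of the resulting dictionary does not matter, passing `sort=False` skips the key sort.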

A dictionary comprehension is the cleanest way to structure your data:

d = {k: df[df['Particulars'] == k] for k in df['Particulars'].unique()}

Related: How do I create a variable number of variables?
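
A minimal runnable sketch of the comprehension above, using a hand-built copy of the question's data (the `Date` and `Code` columns are left out here for brevity, which is my simplification):

```python
import pandas as pd

# Hand-built copy of the relevant columns from the question.
df = pd.DataFrame(
    {
        "Particulars": ["EFG", "ABC", "BCD", "ABC"],
        "Inwards": [12800, 100, 10000, 2000],
    },
    index=[1, 2, 3, 4],
)

# One boolean mask per unique value; each mask selects that value's rows.
d = {k: df[df["Particulars"] == k] for k in df["Particulars"].unique()}

print(list(d))   # keys follow the order of first appearance
print(d["ABC"])  # rows 2 and 4
```

Note that each key triggers its own pass over the column, so with many distinct values a single `groupby` may be the better fit.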

jpp