Splitting datasets of various size into separate dataframes

Question

I am new to Python and pandas. I have a .csv file exported from some measurement equipment that contains all the measurements taken over the course of a day. I have already managed to produce a fairly tidy dataframe, but I cannot work out how to separate the measurements.

The dataframe is structured as follows:

+-------------------------+-------------------+--------------------+-----------+------+-------------+------+--------------+
| SetupTitle              | measurement_type  | nan                | nan       | nan  | nan         | nan  | nan          |
| MetaData                | TestRecord        | measurement number | nan       | nan  | nan         | nan  | nan          |
| DataName                | voltage 1         | voltage 2          | current 1 | ...  |             |      | data name 8  |
| DataValues              | data 1            | ...                |           |      |             |      | data 8       |
| ...                     |                   |                    |           |      |             |      |              |
| hundreds of data points |                   |                    |           |      |             |      |              |
| ...                     |                   |                    |           |      |             |      |              |
| SetupTitle              | measurement type  | nan                | nan       | nan  | nan         | nan  | nan          |
| etc...                  |                   |                    |           |      |             |      |              |
+-------------------------+-------------------+--------------------+-----------+------+-------------+------+--------------+

I would like to split each measurement into individual dataframes by using the "SetupTitle" value as a start point, but I'm not sure how to iterate through the column or how to extract the rest of the columns from each.

I think once they are split up I will be able to remove the setup row and metadata row and use them to name the dataframe which will give me a nice dataset.

Welcome to StackOverflow! You will find help here, provided you respect this site's rules. As a new user, you should read [ask], and when it comes to pandas, [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). Here I would have liked either a sample from the original csv file or a copy/paste-able sample from the dataframe, or even both, to be able to reproduce a minimal example. Without that I just feel unable to answer. — Serge Ballesta, Jun 14 '19 at 12:38

score 0 · Accepted Answer · answered Jun 14 '19 at 12:26

You can use cumsum to count the occurrences of a specific value and groupby to separate them:

s = df[name_of_column].eq('SetupTitle').cumsum()

The value of s then increases by one at every SetupTitle row, so all rows belonging to one measurement share the same number. You can access the blocks with:

# say we want to store them in a dict:
blocks = {}

for num_block, block in df.groupby(s):
    # do whatever you want with the group
    blocks[num_block] = block
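
To make this concrete, here is a minimal sketch on a made-up miniature of the file. The sample values, the titles `sweep_a` and `sweep_b`, and the assumption that the row type sits in the first column (label `0`) are all hypothetical, since the question does not include real data. It also shows one possible way to drop the setup and metadata rows and use the DataName row as the header, as the question describes.

```python
import pandas as pd

# Hypothetical miniature of the exported file: column 0 holds the row type.
raw = pd.DataFrame(
    [
        ["SetupTitle", "sweep_a", None],
        ["MetaData", "TestRecord", "1"],
        ["DataName", "voltage", "current"],
        ["DataValues", "0.1", "0.01"],
        ["DataValues", "0.2", "0.02"],
        ["SetupTitle", "sweep_b", None],
        ["MetaData", "TestRecord", "2"],
        ["DataName", "voltage", "current"],
        ["DataValues", "1.0", "0.5"],
    ]
)

# The counter increases by one at every SetupTitle row,
# so each measurement gets its own block number.
block_id = raw[0].eq("SetupTitle").cumsum()

measurements = {}
for _, block in raw.groupby(block_id):
    kinds = block[0]
    # Use the setup title as the dictionary key (assumes titles are unique).
    title = block.loc[kinds.eq("SetupTitle"), 1].iloc[0]
    # The DataName row supplies the column names.
    header = block.loc[kinds.eq("DataName")].iloc[0, 1:].tolist()
    # Keep only the data rows, without the row-type column.
    values = block.loc[kinds.eq("DataValues")].iloc[:, 1:]
    values.columns = header
    measurements[title] = values.astype(float).reset_index(drop=True)
```

After the loop, `measurements["sweep_a"]` holds only the numeric rows of the first measurement. If two measurements can share a title, keying the dict by the block number or by a (title, measurement number) pair would avoid overwriting entries.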
