My goal is to use the Smartsheet API to take data that is structured just like the data shown below and place it in a pandas.DataFrame object. The issue is that the data has several layers, creating an inconsistent hierarchy.
Task | Finish Date | Complete?
---------------------------------------------------------
- Big Task #1 | 09/05/19 | N
Subtask #1 | 09/04/19 | Y
- Subtask #2 | 09/04/19 | N
- Even Smaller Task #1 | 09/02/19 | N
An Even Smaller Task | 09/02/19 | Y
Even Smaller Task #2 | | Y
Subtask #3 | 09/01/19 | N
- Big Task #1 | 09/12/19 | Y
Subtask #1 | | Y
I'm aware of pandas' MultiIndex class, but the largest struggle is figuring out how to organize the data into an acceptable parameter input.
The Smartsheet API has a helpful "parentid" value in a cell object that tells me that "Subtask #2" is a parent of "Even Smaller Task #1". However, it only provides the immediate parent cell. I know how I could find the "root task" of a particular cell, but organizing the results into a tabular DataFrame is my largest challenge.
Nevertheless, it's a fun challenge! Has anyone experienced a similar problem or have any ideas?