2

My goal is to use the Smartsheet API to take data that is structured just like the data shown below and place it in a pandas.DataFrame object. The issue is that the data has several layers, creating an inconsistent hierarchy.

             Task              | Finish Date | Complete?
---------------------------------------------------------
- Big Task #1                  |  09/05/19   |  N
   Subtask #1                  |  09/04/19   |  Y
 - Subtask #2                  |  09/04/19   |  N
    - Even Smaller Task #1     |  09/02/19   |  N
        An Even Smaller Task   |  09/02/19   |  Y
      Even Smaller Task #2     |             |  Y
   Subtask #3                  |  09/01/19   |  N
- Big Task #1                  |  09/12/19   |  Y
   Subtask #1                  |             |  Y

I'm aware of pandas' MultiIndex class, but the largest struggle is figuring out how to organize the data into an acceptable parameter input.

The Smartsheet API has a helpful "parentid" value in a cell object that tells me that "Subtask #2" is a parent of "Even Smaller Task #1". However, it only provides the immediate parent cell. I know how I could find the "root task" of a particular cell, but organizing the results into a tabular DataFrame is my largest challenge.

Nevertheless, it's a fun challenge! Has anyone experienced a similar problem or have any ideas?

benfwalla
  • 23
  • 4

1 Answers1

1

You could consider working through the hierarchy of the sheet to get an understanding of how many levels there are and then make additional columns for those levels. Then fill in the data in the appropriate column for the task data. It could be something similar to this:

Parent | Child | Grand-Child |
------------------------------
Task   | Sub-task | Next sub-task |

Or you could have all of the Task values listed in one column and add another column to give row numbers in the data set of what they are related to if they have a parent.

daveskull81
  • 627
  • 4
  • 7