I have a pandas DataFrame like this:

    start    end   compDepth compReleaseDepth compMeanRate
0     0.0   0.62  58.0999985              1.5          110
1    0.66   1.34  57.1399994                3           94
2    1.42    2.1  57.1399994              2.5           89
3    2.21   2.87  58.5699997              2.5           79
4    2.97   3.65  55.2399979              3.5           77
5    3.78   4.45  53.8600006              1.5           76
6    4.49   5.17  62.2700005              0.5           81
7    5.97   6.65  56.1899986              2.5           85

I need to serialise the data to JSON. I used df.to_json(orient='records'), which works fine.

However, I would like to nest the last three columns under a new key called "annotations". Is there a simple way to do this? This is the output I want to achieve:

[{
        "start": "0.0",
        "end": "0.62",
        "annotations": {
            "compDepth": "58.0999985",
            "compReleaseDepth": "1.5",
            "compMeanRate": "110"
        }
    }, {
        "start": "0.66",
        "end": "1.34",
        "annotations": {
            "compDepth": "57.1399994",
            "compReleaseDepth": "3",
            "compMeanRate": "94"
        }
    }, {
        "start": "1.42",
        "end": "2.1",
        "annotations": {
            "compDepth": "57.1399994",
            "compReleaseDepth": "2.5",
            "compMeanRate": "89"
        }
    }, {
        "start": "2.21",
        "end": "2.87",
        "annotations": {
            "compDepth": "58.5699997",
            "compReleaseDepth": "2.5",
            "compMeanRate": "79"
        }
    }, 
dimstudio

1 Answer

One simple way is to nest the data yourself in a new column using to_dict:

df['annotations'] = df[['compDepth','compReleaseDepth','compMeanRate']].to_dict(orient='records')

Then call to_json(orient='records') on only the three columns you want in the final output:

df[['start','end','annotations']].to_json(orient='records')
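
For reference, the two steps above can be put together as a small end-to-end sketch. The two-row frame here is cut down from the question's data, and the names `nested_cols`, `result`, and `records` are only illustrative. Parsing the string back with `json.loads` is just a way to inspect the nesting.

```python
import json

import pandas as pd

# Two rows taken from the question's data, enough to show the shape.
df = pd.DataFrame({
    "start": [0.0, 0.66],
    "end": [0.62, 1.34],
    "compDepth": [58.0999985, 57.1399994],
    "compReleaseDepth": [1.5, 3.0],
    "compMeanRate": [110, 94],
})

# Columns to fold under the "annotations" key (illustrative name).
nested_cols = ["compDepth", "compReleaseDepth", "compMeanRate"]

# Each row of the sub-frame becomes one dict, stored in an object column.
df["annotations"] = df[nested_cols].to_dict(orient="records")

# Serialise only the top-level columns that should appear in the output.
result = df[["start", "end", "annotations"]].to_json(orient="records")

# Parse the JSON string back to check the nesting.
records = json.loads(result)
print(json.dumps(records, indent=4))
```

Note that the values come out as JSON numbers, not the quoted strings shown in the question. If strings are really required, casting the columns first (for example with `astype(str)`) may be one option, though the string form of each number would then follow pandas' formatting and not necessarily the exact text in the question.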
Ben.T
  • Indeed, that is a simple way! Thanks a lot. Just one question: I get this warning when executing your first line of code. Do you know why? `__main__:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy` – dimstudio Jul 16 '18 at 19:02
  • @dimstudio I don't get this warning, so I assume it's an issue with your `df`, which might be a slice of your original data or something similar. It's not linked to these two lines of code, so I'm not sure how to help, but the link provided in the warning is a good way to understand why :) – Ben.T Jul 16 '18 at 19:11
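
As a hedged illustration of the guess in the comments above: if `df` was produced by filtering or slicing a larger frame (an assumption, the thread does not confirm it), taking an explicit copy before adding the column is one common way to avoid `SettingWithCopyWarning`. The frame `raw` and the filter condition below are hypothetical.

```python
import pandas as pd

# Hypothetical original data; in the thread, the real source is not shown.
raw = pd.DataFrame({
    "start": [0.0, 0.66, 1.42],
    "end": [0.62, 1.34, 2.1],
    "compDepth": [58.0999985, 57.1399994, 57.1399994],
    "compReleaseDepth": [1.5, 3.0, 2.5],
    "compMeanRate": [110, 94, 89],
})

# Assumed scenario: df is a filtered subset of raw.
# .copy() makes df an independent frame, so assigning a new column to it
# is unambiguous and does not touch raw.
df = raw[raw["start"] < 1.0].copy()

df["annotations"] = df[
    ["compDepth", "compReleaseDepth", "compMeanRate"]
].to_dict(orient="records")

print(df[["start", "end", "annotations"]].to_json(orient="records"))
```

Whether this matches the actual cause depends on how `df` was built, so the documentation page linked in the warning remains the place to check.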