Pandas dataframe to JSON format manipulation

Question

I have a pandas DataFrame like this one:

User  Category    Rating
1     [1,2,3]     [5,1,3]
2     [3,2,1]     [3,1,1]
3     [1,3,1]     [2,1,4]

I want to write an endpoint that takes a user and returns a list of categories and ratings for that user.

www.endpoint.com/user/1

Should return

[{Category: 1, Rating: 5}, {Category: 2, Rating: 1}, {Category: 3, Rating: 3}]

Is there a simple way to do this in Pandas?

MaxU - stand with Ukraine · Accepted Answer · 2017-08-14T17:35:55.687

I would use the following generic function which explodes lists in columns into rows:

import numpy as np
import pandas as pd

def explode(df, lst_cols, fill_value=''):
    # make sure `lst_cols` is a list
    if lst_cols and not isinstance(lst_cols, list):
        lst_cols = [lst_cols]
    # all columns except `lst_cols`
    idx_cols = df.columns.difference(lst_cols)

    # calculate lengths of lists
    lens = df[lst_cols[0]].str.len()

    if (lens > 0).all():
        # ALL lists in cells aren't empty
        return pd.DataFrame({
            col:np.repeat(df[col].values, df[lst_cols[0]].str.len())
            for col in idx_cols
        }).assign(**{col:np.concatenate(df[col].values) for col in lst_cols}) \
          .loc[:, df.columns]
    else:
        # at least one list in cells is empty
        res = pd.DataFrame({
            col:np.repeat(df[col].values, df[lst_cols[0]].str.len())
            for col in idx_cols
        }).assign(**{col:np.concatenate(df[col].values) for col in lst_cols})
        # DataFrame.append was removed in pandas 2.0, so use pd.concat
        return pd.concat([res, df.loc[lens==0, idx_cols]]).fillna(fill_value) \
                 .loc[:, df.columns]

Demo:

In [88]: df
Out[88]:
   User   Category     Rating
0     1  [1, 2, 3]  [5, 1, 3]
1     2  [3, 2, 1]  [3, 1, 1]
2     3  [1, 3, 1]  [2, 1, 4]

In [89]: cols = ['Category','Rating']

In [90]: x = explode(df, cols)

In [91]: x
Out[91]:
   User Category Rating
0     1        1      5
1     1        2      1
2     1        3      3
3     2        3      3
4     2        2      1
5     2        1      1
6     3        1      2
7     3        3      1
8     3        1      4

Now we can easily do what you need:

In [92]: x.loc[x.User == 1, cols].to_dict('records')
Out[92]:
[{'Category': '1', 'Rating': '5'},
 {'Category': '2', 'Rating': '1'},
 {'Category': '3', 'Rating': '3'}]
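
On pandas 1.3 or later, the built-in `DataFrame.explode` accepts a list of columns, so the custom helper may not be needed. A minimal sketch, assuming the same data as in the demo and that both list columns have the same length in every row (the variable name `records` is only illustrative):

```python
import pandas as pd

# Rebuild the demo frame from the question.
df = pd.DataFrame({
    'User': [1, 2, 3],
    'Category': [[1, 2, 3], [3, 2, 1], [1, 3, 1]],
    'Rating': [[5, 1, 3], [3, 1, 1], [2, 1, 4]],
})

cols = ['Category', 'Rating']

# Select one user's row, then explode both list columns in lockstep
# (multi-column explode requires pandas >= 1.3).
records = df.loc[df.User == 1, cols].explode(cols).to_dict('records')
print(records)
```

Filtering before exploding keeps the work proportional to one user's lists rather than the whole frame, which may matter if this runs on every request.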

score 0 · Answer 2 · answered Aug 14 '17 at 16:44

Here's one way

In [599]: func = lambda x: [{'Category':v, 'Rating': x.Rating[i]} 
                            for i, v in enumerate(x.Category)]

In [600]: func(df.loc[0])
Out[600]:
[{'Category': 1, 'Rating': 5},
 {'Category': 2, 'Rating': 1},
 {'Category': 3, 'Rating': 3}]

Or, apply to all rows

In [598]: df.apply(func, 1).values
Out[598]:
array([[{'Category': 1, 'Rating': 5}, {'Category': 2, 'Rating': 1},
        {'Category': 3, 'Rating': 3}],
       [{'Category': 3, 'Rating': 3}, {'Category': 2, 'Rating': 1},
        {'Category': 1, 'Rating': 1}],
       [{'Category': 1, 'Rating': 2}, {'Category': 3, 'Rating': 1},
        {'Category': 1, 'Rating': 4}]], dtype=object)
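
For the endpoint itself, the same pairing can be done for a single user and serialized with the standard `json` module. This is only a sketch: the helper name `user_ratings_json` is hypothetical, and it assumes each user appears in at most one row and that the lists hold plain Python ints.

```python
import json
import pandas as pd

# Rebuild the demo frame from the question.
df = pd.DataFrame({
    'User': [1, 2, 3],
    'Category': [[1, 2, 3], [3, 2, 1], [1, 3, 1]],
    'Rating': [[5, 1, 3], [3, 1, 1], [2, 1, 4]],
})

def user_ratings_json(df, user):
    """Hypothetical helper: JSON list of Category/Rating pairs for one user."""
    rows = df.loc[df['User'] == user]
    if rows.empty:
        # Unknown user: return an empty JSON list.
        return json.dumps([])
    row = rows.iloc[0]
    # Pair each category with the rating at the same position.
    pairs = [{'Category': c, 'Rating': r}
             for c, r in zip(row['Category'], row['Rating'])]
    return json.dumps(pairs)

print(user_ratings_json(df, 1))
```

The returned string could then be used as the response body in whatever web framework serves the endpoint.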
