5

I have a DF that looks something like:

A    B    C    D    E
1    1    NaN  1    1
NaN  2    3    4    NaN

When I do:

df.to_json(orient='records')

I get something like:

[{"A":1,"B":1,"C":null,"D":1,"E":1},{"A":null,"B":2,"C":3,"D":4,"E":null}]

Is there any way to get it to skip the NaN entries and produce something like:

[{"A":1,"B":1,"D":1,"E":1},{"B":2,"C":3,"D":4}]

Can I do this using pandas?

piRSquared
JS noob
  • This is close enough: https://stackoverflow.com/questions/46166112/saving-a-pandas-dataframe-to-separate-jsons-without-nans -- I don't know whether it'd be considered a duplicate or not on StackOverflow (I'm relatively new). – Zain Patel Sep 25 '18 at 18:55
  • Let me offer the solution and close it `[dict(zip(x.index.get_level_values(1),x)) for _,x in df.replace('NAN',np.nan).stack().groupby(level=0)]` – BENY Sep 25 '18 at 18:57

2 Answers

8

Try this:

[{**x[i]} for i, x in df.stack().groupby(level=0)]

[{'A': 1.0, 'B': 1.0, 'D': 1.0, 'E': 1.0}, {'B': 2.0, 'C': 3.0, 'D': 4.0}]

If you want ints:

[{**x[i]} for i, x in df.stack().map(int).groupby(level=0)]

[{'A': 1, 'B': 1, 'D': 1, 'E': 1}, {'B': 2, 'C': 3, 'D': 4}]

A hacky way to keep values as int when they are whole numbers:

[{**x[i]} for i, x in df.stack().fillna(0, downcast='infer').groupby(level=0)]

[{'A': 1, 'B': 1, 'D': 1, 'E': 1}, {'B': 2, 'C': 3, 'D': 4}]

Explanation

#    Series with a
#       MultiIndex       Make a Series and drop nulls
#                ↓       ↓                     ↓ Essentially grouping by `index` of `df`
[{**x[i]} for i, x in df.stack().groupby(level=0)]
# ↑   ↑
# ↑   Slice the MultiIndex with name of the group
# Unpack in a dictionary context with double splat `{**mydict} == mydict`
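
For anyone who wants to run this end to end, here is a minimal sketch under a few assumptions: the sample frame is rebuilt from the question, an explicit `.dropna()` is added after `stack()` in case the installed pandas version keeps NaN when stacking, and every value is assumed to be a whole number so that casting to `int` is safe. The names `stacked` and `records` are only illustrative.

```python
import json

import numpy as np
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({
    "A": [1, np.nan],
    "B": [1, 2],
    "C": [np.nan, 3],
    "D": [1, 4],
    "E": [1, np.nan],
})

# Long Series indexed by (row label, column name); dropna() is explicit
# so the result does not depend on whether stack() drops NaN by itself.
stacked = df.stack().dropna()

# One dict per original row. droplevel(0) leaves only the column names
# as keys. int() assumes the data holds whole numbers only.
records = [
    {col: int(val) for col, val in group.droplevel(0).items()}
    for _, group in stacked.groupby(level=0)
]

print(json.dumps(records))
```

Note that a row whose values are all NaN disappears from the stacked Series, so it would produce no dict at all rather than an empty one.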
piRSquared
0

Here is a previous answer for removing dictionary keys when their values are null:

{k: v for k, v in metadata.items() if v is not None}

https://stackoverflow.com/a/12118700/8265971

For pandas, there is a pandas.DataFrame.dropna function. If those values will be assigned to a column, this would work nicely: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html
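
One caveat when applying the comprehension above to a DataFrame: NaN is a float, not `None`, so `v is not None` would keep it. A rough sketch of how the two ideas might be adapted, assuming the sample frame from the question, is below. It filters with `pd.notna` instead, and separately uses `Series.dropna` per row (not `DataFrame.dropna`, which removes whole rows or columns). The names `filtered` and `per_row` are hypothetical.

```python
import json

import numpy as np
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({
    "A": [1, np.nan],
    "B": [1, 2],
    "C": [np.nan, 3],
    "D": [1, 4],
    "E": [1, np.nan],
})

# Variant 1: plain dict comprehension, with pd.notna as the filter
# because NaN is not None.
filtered = [
    {k: v for k, v in row.items() if pd.notna(v)}
    for row in df.to_dict(orient="records")
]

# Variant 2: drop the missing entries of each row Series, then convert.
per_row = [row.dropna().to_dict() for _, row in df.iterrows()]

print(json.dumps(filtered))
print(json.dumps(per_row))
```

In both variants, values coming from columns that contained NaN stay floats, so cast them if integer output matters.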

gmuraleekrishna
rlowe
  • @gmuraleekrishna why would you delete and replace my answers with my original answers? You didn't actually edit anything. – rlowe Sep 27 '18 at 14:05