I am trying to build a model in django 1.9 that has a key, value pair (dictionary) field that also allows query set aggregation (min, mix, etc). I've tried to use the JSONField:
#models.py
from django.contrib.postgres import fields as pgfields
class Entry(models.Model):
pass
class Scorer(models.Model):
name = models.CharField(max_length=100)
class EntryScoreSet(models.Model):
scorer = models.ForeignKey(Scorer)
entry = models.ForeignKey(Entry, related_name="scorecard")
scores = pgfields.JSONField(default={})
....
# shell test
import random
entry = Entry()
scorer,_ = Scorer.objects.get_or_create(name="scorer1")
entry.save()
for i in range(0,10):
scores = dict(scoreA=random.random(),
scoreB=random.random(),
scoreC=random.random(),
)
entry_score_set=EntryScoreSet(scores=scores, entry=entry, scorer=scorer)
entry_score_set.save()
entry.scorecard.filter(scorer="scorer1").aggregate(Max("scores__scoreA"))
But I run into the error from this ticket (basically, aggregation is not supported).
A second option is to use a key, value pair model (similar to this answer):
class Score(models.Model):
entry_score_set = models.ForeignKey(EntryScoreSet, db_index=True,
related_name="scores")
key = models.CharField(max_length=64, db_index=True)
value = models.FloatField(db_index=True)
But I don't know how one would get an aggregation across a query set for a particular a key value.
How would I implement a key, value pair field in Django that allows aggregation on a query set for a particular key's value?
EDIT:
Here is a snippet that demonstrates what I want to do using pandas and the second option (key, pair model):
import django_pandas.io as djpdio
scds=Scorecard.objects.filter(
entry__in=Entry.objects.order_by('?')[:10],
scorer__name="scorer1")
scorecard_base=djpdio.read_frame(scds,fieldnames=["id","entry__id","scorer__name","scores__id"])
scores=djpdio.read_frame(Score.objects.filter(scorecard__in=scds),fieldnames=["id","key","value"])
scorecard_=(scorecard_base
.merge(scores,left_on="scores__id",right_on="id")
.pivot_table(index="entry__id",columns="key",values="value").reset_index())
scorecard=scorecard_base.merge(scorecard_,on="entry__id")
scorecard["scoreA"].max()
Is something like this possible using django's ORM? How would the efficiency compare to using pandas pivot function?