4

I have a domain model made of Questions, each Question being associated with a number of Comments and Affirmations.

I would like to make a Datalog query which extracts a bunch of content attributes for each Question, as well as the number of associated Comments and Affirmations, including when these relationships are empty (e.g some Question has no Comment or no Affirmation), in which case the returned count should be 0.

I have seen the following Gist, which shows how to use (sum ...) and (or-join) combined with a 'weight' variable to get a zero-count when the relationship is empty.

However, I do not see how to make this work when there are 2 relationships. I tried the following, but the returned counts are not correct:

(def query '[:find (sum ?comment-weight) (sum ?affirmation-weight) ?text ?time ?source-identifier ?q
             :with ?uniqueness
             :where
             [?q :question/text ?text]
             [?q :question/time ?time]
             [?q :question/source-identifier ?source-identifier]
             (or-join [?q ?uniqueness ?comment-weight ?affirmation-weight]
               (and [?comment :comment/question ?q]
                 [?affirmation :affirmation/question ?q]
                 ['((identity ?comment) (identity ?affirmation)) ?uniqueness]
                 [(ground 1) ?comment-weight]
                 [(ground 1) ?affirmation-weight])
               (and [?comment :comment/question ?q]
                 [(identity ?comment) ?uniqueness]
                 [(ground 1) ?comment-weight]
                 [(ground 0) ?affirmation-weight])
               (and [?affirmation :affirmation/question ?q]
                 [(identity ?affirmation) ?uniqueness]
                 [(ground 1) ?affirmation-weight]
                 [(ground 0) ?comment-weight])
               (and [(identity ?q) ?uniqueness]
                 [(ground 0) ?comment-weight]
                 [(ground 0) ?affirmation-weight]))])

Originally asked on the Clojurians Slack.

Valentin Waeselynck
  • 5,950
  • 26
  • 43

2 Answers2

4

So to summarize, the trick is to consider each Question, Comment and Affirmation as a 'data point', which has a 'weight' of 0 or 1 for each count, and is identified uniquely (so that Datalog counts correctly, see :with). In particular, each Question has a zero weight for all counts.

The posted code was almost correct, it only needs to remove the first clause of the (or-join ...), which results in creating 'artificial' data points (Comment + Affirmation tuples) that pollute the counts.

The following should work:

[:find (sum ?comment-weight) (sum ?affirmation-weight) ?text ?time ?source-identifier ?q
 :with ?uniqueness
 :where
 [?q :question/text ?text]
 [?q :question/time ?time]
 [?q :question/source-identifier ?source-identifier]
 (or-join [?q ?uniqueness ?comment-weight ?affirmation-weight]
   (and 
     [?comment :comment/question ?q]
     [(identity ?comment) ?uniqueness]
     [(ground 1) ?comment-weight]
     [(ground 0) ?affirmation-weight])
   (and 
     [?affirmation :affirmation/question ?q]
     [(identity ?affirmation) ?uniqueness]
     [(ground 1) ?affirmation-weight]
     [(ground 0) ?comment-weight])
   (and 
     [(identity ?q) ?uniqueness]
     [(ground 0) ?comment-weight]
     [(ground 0) ?affirmation-weight]))]
Valentin Waeselynck
  • 5,950
  • 26
  • 43
  • this works. thank you! I'm still not totally clear on why it works though. i was under the impression that or-join would return the first match for each ?q - wouldn't that mean that when there is a :comment/question and an :affirmation/question that only the comment would be counted? i think i'm still not grokking how :with works – DizzyDez Nov 22 '17 at 06:41
  • No, think of or-join as a set union - not as the boolean OR – Valentin Waeselynck Nov 22 '17 at 06:51
1

You can call arbitrary functions within a query, so consider using datomic.api/datoms directly (or a subquery, or an arbitrary function of your own) to count the comments and affirmations.

Using datomic.api/datoms:

'[:find ?comment-count ?affirmation-count ?text ?time ?source-identifier ?q
  :where
  [?q :question/text ?text]
  [?q :question/time ?time]
  [?q :question/source-identifier ?source-identifier]
  [(datomic.api/datoms $ :vaet ?q :comment/question) ?comments]
  [(count ?comments) ?comment-count]
  [(datomic.api/datoms $ :vaet ?q :affirmation/question) ?affirmations]
  [(count ?affirmations) ?affirmation-count]]

Or using a subquery:

'[:find ?comment-count ?affirmation-count ?text ?time ?source-identifier ?q
  :where
  [?q :question/text ?text]
  [?q :question/time ?time]
  [?q :question/source-identifier ?source-identifier]
  [(datomic.api/q [:find (count ?comment) .
                   :in $ ?q
                   :where [?comment :comment/question ?q]]
     $ ?q) ?comment-count-or-nil]
  [(clojure.core/or ?comment-count-or-nil 0) ?comment-count]
  [(datomic.api/q [:find (count ?affirmation) .
                   :in $ ?q
                   :where [?affirmation :affirmation/question ?q]]
     $ ?q) ?affirmation-count-or-nil]
  [(clojure.core/or ?affirmation-count-or-nil 0) ?affirmation-count]]

Or using a custom function:

(defn count-for-question [db question kind]
  (let [dseq (case kind
               :comments (d/datoms db :vaet question :comment/question)
               :affirmations (d/datoms db :vaet question :affirmation/question))]
    (reduce (fn [x _] (inc x)) 0 dseq)))

'[:find ?comment-count ?affirmation-count ?text ?time ?source-identifier ?q
  :where
  [?q :question/text ?text]
  [?q :question/time ?time]
  [?q :question/source-identifier ?source-identifier]
  [(user/count-for-question $ ?q :comments) ?comment-count]
  [(user/count-for-question $ ?q :affirmations) ?affirmation-count]]
Francis Avila
  • 31,233
  • 6
  • 58
  • 96
  • Given the current fixed overhead of `datomic.api/q`, I would be careful with the subquery approach. – Valentin Waeselynck Nov 21 '17 at 16:29
  • using datomic.api/datoms i'm seeing: UnsupportedOperationException count not supported on this type: db$datoms$reify__2144 clojure.lang.RT.countFrom (RT.java:646) – DizzyDez Nov 22 '17 at 06:46
  • This error is because `datoms` returns an Iterable, on which `count` is not supported. Maybe you can work around that by coercing them using `(seq)`, or use a custom function. – Valentin Waeselynck Nov 22 '17 at 08:14