
This is a newbie question. I have a function that parses a web page and returns a series of 5 elements. I then use the println function to check that it worked correctly.

...
(defn select-first-index-page-elements [source element n]
    ((get-parsing-logic source "parsing-logic-index-page" element "final-touch-fn")
        (nth 
            (html/select 
                (fetch-first-page source)
                (get-parsing-logic source "parsing-logic-index-page" element "first-touch"))
            n)))

(defn parsing-source [source]
  (loop [n 0]
    (when (< n (count-first-index-page-elements source "title"))
      (println ; the group of elements:
        (select-first-index-page-elements source "date" n)
        " - "
        (select-first-index-page-elements source "title" n)
        " - "
        (select-first-index-page-elements source "url" n)
        "\n")
      (recur (inc n)))))

(parsing-source "events-directory-website")

Now, instead of calling println, how could I store those elements in a DB? How can I avoid storing a given group of elements if it is already in the DB? And how can I then print only the new groups of elements that the parsing function found?

sinemetu1
leontalbot

1 Answer


You might want to check out SQL Korma.

Using SQL Korma:

how could I store those elements into a DB?

;; placeholder values; each key names a column of my-elements
(insert my-elements
  (values [{:elements ["a" "b" "c"]}]))
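
Korma's `values` takes a map, or a vector of maps, keyed by column name, so each parsed group first has to become a map. Here is a minimal sketch of that step, assuming columns named `:date`, `:title` and `:url`. The helpers `fake-select`, `group->row` and `all-rows` are hypothetical, and `fake-select` only stands in for `select-first-index-page-elements` so the example runs without fetching a page:

```clojure
;; Hypothetical stand-in for (select-first-index-page-elements source element n),
;; so the sketch runs without fetching a page.
(defn fake-select [element n]
  (str element "-" n))

(defn group->row
  "Builds one row map for the nth group of parsed elements."
  [select-fn n]
  {:date  (select-fn "date" n)
   :title (select-fn "title" n)
   :url   (select-fn "url" n)})

(defn all-rows
  "Returns a vector of row maps for groups 0..(dec n-groups)."
  [select-fn n-groups]
  (mapv #(group->row select-fn %) (range n-groups)))

(all-rows fake-select 2)
;; => [{:date "date-0", :title "title-0", :url "url-0"}
;;     {:date "date-1", :title "title-1", :url "url-1"}]
```

With the real parser, `select-fn` could be `(partial select-first-index-page-elements source)`, and the resulting vector could be passed to `(values ...)`.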

And how I can not store a given group of element if it is already in the db?

;; the-elements-youre-looking-for is the value you want to check for
(when (empty? (select my-elements
                      (where {:elements the-elements-youre-looking-for})))
  (insert my-elements
    (values [{:elements the-elements-youre-looking-for}])))

How can I print then only the new group of elements that the parsing function did find?

You could solve this using the (select ...) call in the above answer: print a group in the same branch that inserts it, that is, only when the select returns no rows.
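
One way to sketch this is to have the storing step return the rows it actually inserted, then print only those. In the example below the database is replaced by an in-memory set so the code runs on its own. The names `store-new!`, `fake-db`, `in-db?` and `save!` are hypothetical, and using `:url` as the uniqueness key is an assumption. With Korma, `in-db?` would presumably wrap `(seq (select ...))` and `save!` would wrap `(insert ...)`:

```clojure
(defn store-new!
  "Calls (insert! row) for each row where (exists? row) is falsey and
  returns the vector of rows that were inserted."
  [exists? insert! rows]
  (reduce (fn [added row]
            (if (exists? row)
              added
              (do (insert! row)
                  (conj added row))))
          []
          rows))

;; In-memory stand-in for the database: a set of already stored URLs.
(def fake-db (atom #{}))

(defn in-db? [row] (contains? @fake-db (:url row)))
(defn save!  [row] (swap! fake-db conj (:url row)))

(def rows [{:title "A" :url "u1"}
           {:title "B" :url "u2"}])

;; First run: nothing is stored yet, so both rows are inserted and printed.
(doseq [row (store-new! in-db? save! rows)]
  (println (:title row) "-" (:url row)))

;; Second run: both URLs are already stored, so nothing is inserted.
(store-new! in-db? save! rows)
;; => []
```

Because `reduce` walks the rows in order, a duplicate inside the same batch is also skipped: by the time it is checked, the earlier copy has already been saved.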

Hope that helps.

sinemetu1
  • I get `CannotAcquireResourceException A ResourcePool could not acquire a resource from its primary factory or source. com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable (BasicResourcePool.java:1319)` when instead of `(println..` I put: `(let [next-url (select-first-index-page-elements source "url" n)] (if-not [db (select events (where {:url next-url}))] (let [next-date (select-first-index-page-elements source "date" n) next-title (select-first-index-page-elements source "title" n)] (insert events (values [{:date next-date :title next-title :url next-url}])))))` – leontalbot Feb 07 '13 at 02:21
  • Maybe check out [this](http://stackoverflow.com/questions/3465872/com-mchange-v2-resourcepool-cannotacquireresourceexception-a-resourcepool-could). Make sure SQL is running, and make sure you've declared the db somewhere in the code like [here](http://sqlkorma.com/docs#db). – sinemetu1 Feb 07 '13 at 03:52
  • Also, the simplest example of db declaration is probably [here](https://github.com/korma/Korma) under "Examples of generated queries:" where `defdb` is used. – sinemetu1 Feb 07 '13 at 03:54