32

What is an idiomatic way to merge (or retrieve the union of) two lists (or sequences) in Clojure?

(merge l1 l2)

doesn't seem to be the solution:

a=> (merge '(1 2 3) '(2 3 4))
((2 3 4) 1 2 3)
user1639206
  • 321
  • 1
  • 3
  • 3
  • how do you define "merge"? e.g. do duplicates exist, and if so how are duplicates handled? also do you know if the lists are already sorted? – mikera Sep 30 '12 at 05:35
  • FYI. The function name, `merge`, has already been taken by `clojure.core`. To avoid confusion, you may choose another name for your `merge` function. See http://clojuredocs.org/clojure_core/clojure.core/merge – tnoda Sep 30 '12 at 07:24
  • The poster was in fact using clojure.core/merge, but not on hash-maps or otherwise associative data, and said function has undefined behavior in that context. – rplevy Sep 30 '12 at 22:04

5 Answers5

29

I think andih's solution works great. Here is an alternate way because hey why not. It uses concat and distinct:

user> (distinct (concat '(1 2 3) '(2 3 4)))
=> (1 2 3 4)
Omri Bernstein
  • 1,893
  • 1
  • 13
  • 17
  • 3
    I would just advise against concat for performance reasons, as it is surprisingly slow. See my answer below for further discussion. – rplevy Sep 30 '12 at 21:46
  • @rplevy thanks for the comment (and your answer below). I had not realized that concat has performance problem. – Omri Bernstein Sep 30 '12 at 22:23
17

If what you want is actually distinct unsorted data (sets), you should be using Clojure's set data structure instead of vectors or lists. And as andih suggested indirectly, there is a core library for set operations: http://clojure.github.com/clojure/clojure.set-api.html

(require '[clojure.set :refer [union]])

(union #{1 2 3} #{3 4 5})
=> #{1 2 3 4 5}

If sets are for whatever reason not what you want, then read on. Careful with concat when you have a significant amount of data in your sequences, and consider using into which is much better optimized as a vector merging algorithm. I don't know why concat is not implemented using into (or better yet-- why does concat even exist? BTW while into is significantly faster than concat, it is still way way slower than conj. Bagwell's RRB trees, compatible with both Clojure and Scala, will solve this problem, but are not yet implemented for Clojure).

To rephrase Omri's non-set solution in terms of 'into':

(distinct (into [1 2 3] [3 4 5]))
=> (1 2 3 4 5)
rplevy
  • 5,393
  • 3
  • 32
  • 31
  • what's so bad about concat? It's constant-time because it's lazy, not the linear-time implementation you get in strict languages. – Philip Potter Oct 01 '12 at 14:17
  • also, how can into be slower than conj? it's implemented using conj. `(into foo bar)` is the same as `(reduce conj foo bar)`, except that it will use transients if they are available. – Philip Potter Oct 01 '12 at 14:19
  • No, I said conj is faster! (but if you need to concatenate vectors and not merely conj one onto the other, these are different needs to have.) – rplevy Oct 01 '12 at 16:02
  • Philip: into uses transients, very similar to the transients for vector concatenation example in Joy of Clojure in fact, which is a good place to look for more discussion on this. – rplevy Oct 01 '12 at 16:04
  • @PhilipPotter surely `concat` cannot be constant time for all inputs? Has to be linear time as size of input lenghts grow? – Petrus Theron Mar 03 '14 at 10:10
  • 1
    @pete sorry for late reply. `concat` is lazy, so yes it can be constant time because it doesn't really do much work. it returns something which when consumed as a seq, first fetches items from the first arg, then from the second when the first is exhausted. – Philip Potter Oct 14 '14 at 21:55
15

One way to get the union of two lists is to use union

Clojure> (into #{} (clojure.set/union '(1,2,3) '(3,4,5)))
#{1 2 3 4 5}

or if you want to get a list

(into '() (into #{} (clojure.set/union '(1,2,3) '(3,4,5))))
(5 4 3 2 1)
andih
  • 5,570
  • 3
  • 26
  • 36
  • 4
    (-> #{} (into [1 2 3]) (into [3 4 5]) seq) – tnoda Sep 30 '12 at 07:14
  • You aren't really achieving anything with `union` operating on list arguments. All it does is `(reduce conj '(1,2,3) '(3,4,5))`. `clojure.set` functions are designed to work with set arguments. – Marko Topolnik Sep 30 '12 at 19:45
9

If you don't mind duplicates, you can try concat :

(concat '(1 2 3 ) '(4 5 6 1) '(2 3)) 
;;==> (1 2 3 4 5 6 1 2 3) 
Kevin Zhu
  • 2,746
  • 26
  • 23
2

One option is flatten:

(def colls '((1 2 3) (2 3 4)))
(flatten colls) ;; => (1 2 3 2 3 4)
(distinct (flatten colls)) ;; => (1 2 3 4)

One thing to be aware of is that it will flatten deeply nested collections:

(flatten [[1 2 [3 4 5]] [1 [2 [3 4]]]]) ;; => (1 2 3 4 5 1 2 3 4)

But works well for maps:

(flatten [[{} {} {}] [{} {} {}]]) ;; => ({} {} {} {} {} {})
Kris
  • 19,188
  • 9
  • 91
  • 111