Scala map2 over tuple with inner monoid (or: how to do this simple thing but better?)

Question

Situation:

A stream (RxScala) of events, which we are batching using tumblingBuffer() and then building the complete history of for debugging. Ultimately I want these in (Seq[T], Seq[T]) of all the values, so I created the following function as the accumulator to a foldLeft:

def tupleConcat[S](a: (Seq[S], Seq[S]), b: (Seq[S], Seq[S])) = (a._1 ++ b._1, a._2 ++ b._2)

Now, this to me set off a bunch of warning bells after having read Runar & Paul's Functional Programming in Scala, since this looks an awful lot like a map2 of two monoid instances, but I am still a bit stuck on how to generalize it properly. So far, I think it might look something like:

def tupleConcatSG[S](a: (S,S), b: (S,S))(implicit s: Semigroup[S]) = (a._1 |+| b._1, a._2 |+| b._2)

(but I'd have to promote my Seq to IndexedSeq from what I can gather).

To generalize further to any Applicative, I think I'd need an instance for tuples, which perhaps comes from Shapeless? Or am I missing something obvious?

EDIT: I should also add that I am trying to avoid zipping and unzipping, based on prejudicial performance concerns, but perhaps I should not worry about that... (each tumblingBuffer's worth of (Seq,Seq) will be ~15,000 long, and the final (Seq,Seq) ought to be in the millions).

score 6 · Accepted Answer · edited May 23 '17 at 12:08

The tuple part exists already; at worst you will need shapeless-scalaz. Your tupleConcatSG is fine (you could use the : Semigroup sugar rather than an explicit implicit), but if you want to be able to use |+| you'll need to make it a Semigroup instance, and available implicitly:

implicit def tupleConcatSg[S: Semigroup] = new Semigroup[(S, S)] {
  def append(f1: (S, S), f2: (S, S)) = ...
}

I suspect your real problem here is that scalaz doesn't come with any instances for Seq, only for IndexedSeq - see Why is List a Semigroup but Seq is not? . But you can write your own Monoid[Seq] easily enough if you're not worried about the performance implications of using Seq's ++ on Lists:

implicit object SeqMonoid extends Monoid[Seq]{...}

I'm not sure what you mean about generalizing further to any Applicative - we're already pretty general. If you're talking about getting Applicative instances for a composition of Applicatives like List[Writer[Vector[String], A]], that should happen automatically.

experquisite · Answer 2 · 2014-12-15T23:14:31.290

1

Following up to lmm's answer, is this really what I am trying to do? (the types check out, and I switched from Seq to IndexedSeq to make sure we can use ++ properly).

    def tupleConcatMonoid[M: Monoid] = new Monoid[(M, M)] {
      def zero: (M,M) = (Monoid[M].zero, Monoid[M].zero)
      def append(f1: (M, M), f2: (M, M)) = (f1._1 |+| f2._1, f2._1 |+| f2._2)
    }
    val isdMonoid= tupleConcatMonoid[IndexedSeq[Double]]
    val history = tumbledBuffers.foldLeft(isdMonoid.zero)(isdMonoid.append)

This has all the correct types, but I am not 100% confident on the level of 'magic'. I tried to do a

    tumbledBuffers.suml

But that kicked the vectors out of the Observable...

EDIT: What I guess is my real question is why this sort of monoid instance for tuples doesn't already exist, or if it does, what is the syntax for using it so that I can monoidally append tuples of vectors without doing the final step of monoidally appending the vectors themselves?

edited Dec 15 '14 at 23:14

answered Dec 15 '14 at 22:48

experquisite

879
5
14

I don't quite understand the question (it seems to be missing a word) - maybe you could give a concrete example of what you want to happen when the tuples are e.g. `(Vector(2, 3), Vector(4))` and `(Vector(), Vector(5))` or some such? You can't glom tuples together into wider tuples in a monoidal way (if nothing else, tuples only go up to size 22). If you want to "lazily" assemble your things without actually computing anything, that's a case for the Free Monoid, aka `List` or `Vector`. – lmm Dec 15 '14 at 23:08
If you want to use your `tupleConcatMonoid` with `suml` you should declare it `implicit` - but the same thing already exists in [TupleInstances](http://docs.typelevel.org/api/scalaz/nightly/#scalaz.std.TupleInstances), so with the correct imports (e.g. `scalaz._, Scalaz._`) it should just "be there" already. Is `tumbledBuffers` of a type for which a scalaz `Foldable` instance exists? (is this the same `Seq`/`IndexedSeq` problem again?) – lmm Dec 15 '14 at 23:09
I have definitively fixed the IndexedSeq problem, and my above tupleConcatMonoid seems to do the right thing - the only question now is, is it necessary to make that at all? Given your vectors, what I would like is: (Vector(2,3),Vector(4,5)) – experquisite Dec 15 '14 at 23:17
It should work without declaring one, and it does on scastie: http://scastie.org/7699 – lmm Dec 15 '14 at 23:30
Yeah okay so my remaining problem is strictly to do with RxScala. Thanks for your help! – experquisite Dec 15 '14 at 23:36

Scala map2 over tuple with inner monoid (or: how to do this simple thing but better?)

2 Answers2