
I have two sequences (of tuples) on which I need to do a join:

  • Seq 1: [(City1 * Pin1), (City2 * Pin2), (City1 * Pin3), (City1 * Pin4)]
  • Seq 2: [(Pin1 * ProductA), (Pin2 * ProductB), (Pin1 * ProductC), (Pin2 * ProductA)]

into the sequence (of tuples):

  • [(City1 * ProductA), (City2 * ProductB), (City1 * ProductC), (City2 * ProductA)...]

In C# I could do this using the Linq Join extension method like:

seq1.Join(seq2, t => t.Item2, t=> t.Item1,
    (t,u) => Tuple.Create(t.Item1, u.Item2))

How do I accomplish this in F#? I cannot find join on Seq there.

SharePoint Newbie

3 Answers


Edit: Actually, you can just use LINQ:

> open System.Linq;;
> let ans = seq1.Join(seq2, (fun t -> snd t), (fun t -> fst t), (fun t u -> (fst t, snd u)));;

Why not use F#'s native Seq functions? If you look at the docs and at this question you can simply use these instead of LINQ. Take the Seq.map2 function for example:

> Seq.map2 (fun a b -> (fst a, snd b)) seq1 seq2;;

val it : seq<string * string> =
  seq [("city1", "product1"); ("city2", "product2")]

This pairs elements by position, so it gives you what you want only when seq1 and seq2 (your first and second sequences) have matching pins at the same index.

Callum Rogers
  • Cities would be one to many with pin and pins would be many to many with products. Could you explain how it would work? – SharePoint Newbie Sep 23 '10 at 10:58
  • Do you mean that you could have `[(City1 * Pin1 * Pin2), (City2 * Pin2)]` and `[(Pin1 * ProductA), (Pin2 * ProductB * ProductC)]`, i.e. using tuples with more than 2 elements? – Callum Rogers Sep 23 '10 at 11:06
  • No, I mean I could have multiple items in the sequence with same city and different pin. Similarly I could have multiple items with same pin and different product or vice versa in seq 2. The tuple will always have 2 items though. – SharePoint Newbie Sep 23 '10 at 11:08
  • I haven't checked map2 before, thanks. Your code looks more readable. – Artem Koshelev Sep 23 '10 at 11:11
  • @Share: So more like `[(City1 * Pin1), (City1 * Pin2), (City2 * Pin2)]` and `[(Pin1 * Product1), (Pin2 * Product1), (Pin3 * Product2)]`? – Callum Rogers Sep 23 '10 at 11:13
  • A pin cannot be in more than one city, but yes somewhat like that. – SharePoint Newbie Sep 23 '10 at 11:15

F# Interactive session:

> let seq1 = seq [("city1", "pin1"); ("city2", "pin2")];;

val seq1 : seq<string * string> = [("city1", "pin1"); ("city2", "pin2")]

> let seq2 = seq [("pin1", "product1"); ("pin2", "product2")];;

val seq2 : seq<string * string> = [("pin1", "product1"); ("pin2", "product2")]

> Seq.zip seq1 seq2;;
val it : seq<(string * string) * (string * string)> =
  seq
    [(("city1", "pin1"), ("pin1", "product1"));
     (("city2", "pin2"), ("pin2", "product2"))]
> Seq.zip seq1 seq2 |> Seq.map (fun (x,y) -> (fst x, snd y));;
val it : seq<string * string> =
  seq [("city1", "product1"); ("city2", "product2")]

Also, you should be able to use the LINQ extension methods on sequences. Just make sure you reference the assembly that contains System.Linq (System.Core on the .NET Framework) and open the namespace with open System.Linq.

UPDATE: for a more complex scenario, where pins repeat or the sequences are not aligned, you can use sequence expressions as follows:

open System

let seq1 = seq [("city1", "pin1"); ("city2", "pin2"); ("city1", "pin3"); ("city1", "pin4")]
let seq2 = seq [("pin1", "product1"); ("pin2", "product2"); ("pin1", "product3"); ("pin2", "product1")]

let joinSeq = seq { for x in seq1 do
                        for y in seq2 do
                            let city, pin = x
                            let pin1, product = y
                            if pin = pin1 then
                                yield (city, product) }
for (x, y) in joinSeq do
    printfn "%s: %s" x y

Console.ReadKey() |> ignore
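
On newer versions of F# (3.0 and later, which postdate this thread), the same join can probably also be written as a query expression. The following is a minimal sketch under that assumption; the names cityPins, pinProducts and joined are made up for illustration and are not from the question:

```fsharp
// Sample data mirroring the question (names are hypothetical).
let cityPins = [ ("city1", "pin1"); ("city2", "pin2"); ("city1", "pin3"); ("city1", "pin4") ]
let pinProducts = [ ("pin1", "product1"); ("pin2", "product2"); ("pin1", "product3"); ("pin2", "product1") ]

// Inner join on the pin: the second item of the outer tuple must equal
// the first item of the inner tuple.
let joined =
    query {
        for c in cityPins do
        join p in pinProducts on (snd c = fst p)
        select (fst c, snd p)
    }
    |> List.ofSeq

// Print each (city, product) pair.
for (city, product) in joined do
    printfn "%s: %s" city product
```

Cities whose pin has no product (pin3 and pin4 here) simply drop out, as with any inner join.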
Artem Koshelev

I think it is not exactly clear what results you are expecting, so the answers are a bit confusing. Your example could be interpreted in two ways (either as zipping or as joining), and they are dramatically different.

  • Zipping: If you have two lists of the same length and you want to align corresponding items (e.g. 1st item from the first list with 1st item from the second list; 2nd item from the first list with 2nd item from the second list, etc.), then look at the answers that use either Seq.zip or Seq.map2.

    However, this only works if both lists are sorted by pin and the pins are unique. In that case you don't need Join at all, and even in C#/LINQ you could use the Zip extension method.

  • Joining: If the lists may have different lengths, or the pins may be unsorted or not unique, then you need to write a real join. A simplified version of the code by Artem K would look like this:

    seq { for city, pin1 in seq1 do 
            for pin2, product in seq2 do 
              if pin1 = pin2 then yield city, product }
    

    This may be less efficient than Join in LINQ, because it loops through all the items in seq2 for every item in seq1, so the complexity is O(seq1.Length * seq2.Length). Join in LINQ to Objects builds a hash-based lookup over the second sequence, so it avoids the repeated scan. Instead of using the Join method directly, I would probably define a little helper:

    open System.Linq
    module Seq = 
      let join (seq1:seq<_>) seq2 k1 k2 =
        seq1.Join(seq2, (fun t -> k1 t), (fun t -> k2 t), (fun t u -> t, u)) 
    

    Then you can write something like this:

    (seq1, seq2) 
       ||> Seq.join snd fst 
       |> Seq.map (fun (t, u) -> fst t, snd u)
    

Finally, if you know that there is exactly one unique city for every product (the sequences have the same length and pins are unique in both of them), then you could just sort both sequences by pins and then use zip - this may be more efficient than using join (especially if you could keep the sequence sorted from some earlier operations).
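
If you would rather avoid both LINQ and the quadratic nested loop, a hash-based join can be sketched with FSharp.Core functions only. This is an illustration, not a library function: hashJoin and its parameter names are hypothetical, and it assumes the keys are non-null and support equality.

```fsharp
// Hypothetical helper (not part of FSharp.Core): a hash-based inner join.
let hashJoin (outer: seq<'a>) (inner: seq<'b>) (outerKey: 'a -> 'k) (innerKey: 'b -> 'k) : seq<'a * 'b> =
    // Group the inner sequence by key once, so that each outer item needs
    // a dictionary lookup instead of a full scan of the inner sequence.
    let lookup = inner |> Seq.groupBy innerKey |> dict
    seq {
        for o in outer do
            match lookup.TryGetValue(outerKey o) with
            | true, matches ->
                for i in matches do
                    yield (o, i)
            | _ -> ()
    }

// Sample data mirroring the question.
let seq1 = [ ("city1", "pin1"); ("city2", "pin2"); ("city1", "pin3"); ("city1", "pin4") ]
let seq2 = [ ("pin1", "product1"); ("pin2", "product2"); ("pin1", "product3"); ("pin2", "product1") ]

// Join on the pin, then keep only the city and the product.
let joined =
    hashJoin seq1 seq2 snd fst
    |> Seq.map (fun ((city, _), (_, product)) -> city, product)
    |> List.ofSeq
```

The lookup is built once when hashJoin is called, so the cost is roughly proportional to the combined length of the two sequences plus the number of matching pairs.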

Tomas Petricek