Clean code for finding the intersection of tuples in a list

Question

I've written a algorithm to find parameters in a command line tool and are looking to clean up my code but is stuck.

The task

My program receives parameters as: flag output input ... | input flag output Examples are: -d one/path second/path/to/file.txt and second/path/to/file.txt --dir one/path etc. Each space is used as a delimiter to create an array of parameters. A parameter can be either a flag such as -d or a path.

I got each flag mapped out in two arrays, which I zip into a an array of tuples. I call them the search set.

In math notation

I'm new to both FP and math notations so please forgive my mistakes (I've learned from wikipedia and other sites).

S for search and P for parameters

S = { S₁, S₂, S₃ }
where Sn = { flagShort, flagLong }
where flagShort = '-d' | '-r' | '-o'
      flagLong = '--dir' | '--replace' | '--output'

P = { P₁, P₂, ... }
where Pn = flag | path
where path = dir | file

So I need to find the output by searching P for occurrences of Sn + the next parameter after the flag.

Ps = Sn ∩ P + Pₙ+₁

Input is just Ps ∉ P, so that is easy if I can get Ps.

Which leads me to the following transformation:

P -> Pn -> S -> Sn -> Sn = Pn -> Pn + Pₙ+₁

In javascript it can be written as:

const flagsShort = ["-r","-d","-o"]
const flagsLong = ["--replace","--dir","--output"]
const search =  _.zip(flagsShort, flagsLong)

let Ps = tuplesIntersectionParametersPlusNextP(
    ['one/path', '--dir', 'two/path', '-d', 'one/path', '-d'],
    search
)
// Ps -> [ '--dir', 'two/path', '-d', 'one/path' ]

function tuplesIntersectionParametersPlusNextP(P, S) {
    const Ps = [];
    P.forEach(  (Pn, n) => {
        S.forEach(Sn => {
            Sn.forEach(flag => {
                if(flag===Pn) Ps.push(Pn, P[n+1])
            })
        })
    })
    return Ps
}

While the code above works, it doesn't look clean. I've been looking around for different FP libraries such as underscore and different python articles but have yet to figure out how to use all these clever FP functions to clean up my code.

I will accept an answer in any language, Python, Haskell, Scala, etc but please don't use list comprehensions. While I'm very confident that I can port your code to js, I find list comprehensions a tad difficult to port. Better use map each reduce etc.

If you also can point me in the right direction with the set notations, I will be truly grateful!

can I ask you a question regarding this answer of yours please: http://stackoverflow.com/a/21067431/4828463 — Faraz, Jun 24 '16 at 12:04

score 1 · Accepted Answer · edited Jun 20 '20 at 09:12

Disclaimer

I tend to stay away from Javascript. There might be nicer ways to do certain things in Javascript, but I believe that the general principles still hold.

Your code is fine, albeit nested a little deep.

The trick in making your code cleaner is to extract abstract functionality. In the deepest nesting, what you're really asking is "does element Pn exist in the list of lists S?" This is something I can imagine myself asking more than once in an application, so it's perfect to turn into a function. You could even make this recursive for any level of nesting:

function ElementInNestedLists(e, l) {
    if (l instanceof Array) {
        return l.reduce(function(prev, cur, idx, arr) {
            if (prev || ElementInNestedLists(e, cur)) { 
                return true;
            }
            return false;
        }, false);
    } else {
        return l == e;
    }
}

You don't have to use reduce here. There's nothing forbidding you from doing an actual for-loop in functional programming, in languages that support it. This does a better job at preventing the function from continuing after your element has been found:

function ElementInNestedLists(e, l) {
    if (l instanceof Array) {
        for (elem of l) {
            if (ElementInNestedLists(e, elem)) {
                return true;
            }
        }
        return false;
    } else {
        return l == e;
    }
}

Using this new function, you can simplify tuplesIntersectionParametersPlusNextP:

function tuplesIntersectionParametersPlusNextP(P, S) {
    const Ps = [];
    P.forEach(  (Pn, n) => {
        if (ElementInNestedLists(Pn, S)) {
            Ps.push(Pn, P[n + 1]);
        }
    });
    return Ps;
}

But there's a bug. The output of this function, given your input, is [ '--dir', 'two/path', '-d', 'one/path', _undefined_ ], because the last flag has no parameter following it. We need to add a test to ensure that there are at least two elements remaining to be checked.

function tuplesIntersectionParametersPlusNextP(P, S) {
    const Ps = [];
    P.forEach(  (Pn, n) => {
        if (n + 1 < P.length && ElementInNestedLists(Pn, S)) {
            Ps.push(Pn, P[n + 1]);
        }
    });
    return Ps;
}

But there's another bug. Consider the input ['-d', '-d', 'foo']. The output would be ['-d', '-d', '-d', 'foo'], which is incorrect. The desired output is ['-d', '-d']. You could decide to add a variable to track whether or not you need to process the next field:

function tuplesIntersectionParametersPlusNextP(P, S) {
    const Ps = [];
    skip = false;
    P.forEach(  (Pn, n) => {
        if (skip) {
            skip = false;
            return;
        }
        if (n+1 < P.length && ElementInNestedLists(Pn, S)) {
            Ps.push(Pn, P[n + 1]);
            skip = true;
        }
    })
    return Ps
}

And while this does what you want, you've now lost cleanliness again. The solution (as is often the case in functional programming) is to solve this problem recursively.

function Parse(cmd, flags, acc = []) {
    if (cmd.length < 2) {
        return acc;
    }
    
    if (ElementInNestedLists(cmd[0], flags)) {
        acc.push(cmd[0], cmd[1]);
        return Parse(cmd.slice(2, cmd.length), flags, acc)
    }
    return Parse(cmd.slice(1, cmd.length), flags, acc)
}

And if you don't want to slice:

function Parse(cmd, flags, idx = 0, acc = []) {
    if (idx + 1 >= cmd.length) {
        return acc;
    }
    
    if (ElementInNestedLists(cmd[idx], flags)) {
        acc.push(cmd[idx], cmd[idx + 1]);
        return Parse(cmd, flags, idx + 2, acc);
    }
    return Parse(cmd, flags, idx + 1, acc);
}

Of course, you might want to check and discard — or otherwise handle — a flag following a flag. There might be other, more complex requirements that you either haven't mentioned or haven't thought of (yet), but those are outside the scope of this answer.

nice! I couldn't figure out how to do it recursively. Your steps was very educational. The reason I want to capture invalid flags, is that I want to throw descriptive errors and to that end, I need to know what the user did wrong. — dotnetCarpenter, Apr 24 '16 at 11:08

Clean code for finding the intersection of tuples in a list

The task

In math notation

1 Answers1

Disclaimer