7

I have a lambda function that I'd like to extend and make a little more robust. However, I also want to follow PEP 8: keep my lines under 79 characters, use correct indentation, etc. I'm having trouble working out how to split up the line.

The line is currently:

styled_df = df.style.apply(lambda x: ["background: rgba(255,0,0,.3)" if('BBC' in x['newsSource'] or 'Wall' in x['newsSource']) and idx==0 else "" for idx, v in enumerate(x)], axis = 1)

So far, I can get to this:

styled_df = df.style.apply(
    lambda x: ["background: rgba(255,0,0,.3)" 
        if('BBC' in x['newsSource'] or 'Wall' in x['newsSource']) and
        idx == 0 else ""
        for idx, v in enumerate(x)], axis=1)

but the third line (if('BBC' in ...) clashes with PEP 8 (E128). Also, admittedly, this isn't the clearest code the way I broke it up.

Also, I'm planning on adding more conditions to this, and am wondering what's the best way to do so. Just keep adding in, and breaking the lines up as best I can? Or is there a 'best practice' for such an issue?

Edit: As I thought, but forgot to mention, I likely could change this lambda in to a function, but am struggling with how. I'm new to lambdas and got this one from another problem altogether...So yes, agreed the Lambda is the root issue.

BruceWayne
  • 11
    You have a 150 character long lambda and are worried about pep8? Step 1 of making your code readable is transforming that lambda into a real function. – Aran-Fey Jul 25 '18 at 17:19
  • 3
    I would not be concern about PEP8 but more about that lambda. – scharette Jul 25 '18 at 17:20
  • Surely your editor supports one of these: https://stackoverflow.com/questions/1428872/pylint-pychecker-or-pyflakes – Andrej Kesely Jul 25 '18 at 17:26
  • @AndrejKesely - I'm using Anaconda with SublimeText3, which is pointing these out. – BruceWayne Jul 25 '18 at 17:27
  • @Aran-Fey - Haha, yeah...I know. As mentioned, I can probably cut that lambda down, but (as I edited in OP), I'm new to lambdas and am trying to figure how to turn that in to a function that Pandas' `.apply()` can read – BruceWayne Jul 25 '18 at 17:29
  • @BruceWayne Anaconda can be configured to automatically format PEP8 errors http://damnwidget.github.io/anaconda/IDE/#autoformat-pep8-errors – Andrej Kesely Jul 25 '18 at 17:30
  • "I likely could change this lambda in to a function, but am struggling with how" - now is a good time to learn! Step 1 is that a lambda `lambda x: some_expression` is equivalent to the function `f` defined by `def f(x): return some_expression`. Step 2 is taking advantage of the flexibility of a real function definition to extract subexpressions and separate things out into simpler individual statements. – user2357112 Jul 25 '18 at 17:33
  • @user2357112 - Thanks, I'm currently working through this. Also, it doesn't help that I'm also trying to learn how the `apply` works, and what exactly the `v`, `x` are in the formula. FYI This all came from [this answer, which I'm still absorbing :P](https://stackoverflow.com/a/51522997/4650297) – BruceWayne Jul 25 '18 at 17:35

2 Answers

9

As a general rule, I think it's better to rely on automated tools to reformat your code than to try to figure out how to apply all those rules like a computer yourself.

The three main choices that I know of are:

  • black: Reformats at the syntax level; no configuration at all except for max line length.
  • yapf: Reformats at the syntax level; highly configurable.
  • autopep8 (and predecessors like pep8ify): Reformats only at the surface level; doesn't change anything that's already PEP 8 compliant. Useful if you want as few changes as possible (e.g., because you're reformatting long-standing code and don't want a massive changelist in source control).

Here's what black does with your code:

styled_df = df.style.apply(
    lambda x: [
        "background: rgba(255,0,0,.3)"
        if ("BBC" in x["newsSource"] or "Wall" in x["newsSource"]) and idx == 0
        else ""
        for idx, v in enumerate(x)
    ],
    axis=1,
)

That takes up a whole lot of vertical whitespace. But you can't configure black to treat it any differently. So, what can you do?

Whenever black insists on reformatting my code into something that doesn't look good to me, that means I have to reorganize my code into something easier to format.

The obvious thing to do here is turn that giant lambda into a def, or maybe even two:

def highlight(x):
    return "BBC" in x["newsSource"] or "Wall" in x["newsSource"]

def style(x):
    return [
        "background: rgba(255,0,0,.3)" if highlight(x) and idx == 0 else ""
        for idx, v in enumerate(x)
    ]

styled_df = df.style.apply(style, axis=1)

Having done that… you aren't even using v; all you're doing is styling the first element (that's the idx == 0 check), and only if the news source includes BBC or Wall.

So, for BBC and Wall things, you're returning one background plus len(x)-1 empty strings; for other things, you're just returning len(x) empty strings.

Assuming that's the logic you wanted, let's be explicit about that:

def style(x):
    if "BBC" in x["newsSource"] or "Wall" in x["newsSource"]:
        first = "background: rgba(255,0,0,.3)"
        return [first] + [""]*(len(x)-1)
    return [""]*len(x)

styled_df = df.style.apply(style, axis=1)

You might prefer ["" for _ in range(len(x))] to [""]*len(x); I'm not really sure which is more readable here.
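A quick way to convince yourself the two spellings are interchangeable here. This is only a sketch: the dict `row` is a hypothetical stand-in for the pandas row, used so the snippet runs without pandas.

```python
# Hypothetical stand-in for one row; only its length matters here.
row = {"newsSource": "BBC News", "title": "Example", "date": "2018-07-25"}

by_multiplication = [""] * len(row)
by_comprehension = ["" for _ in range(len(row))]

# Both build a fresh list containing len(row) empty strings.
same = by_multiplication == by_comprehension
print(same, by_multiplication)
```

Repeating the list with `*` is safe in this case because strings are immutable; with mutable elements (say, nested lists) the comprehension would be the safer choice.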


I likely could change this lambda in to a function, but am struggling with how. I'm new to lambdas and got this one from another problem altogether...So yes, agreed the Lambda is the root issue.

A lambda is a function, just like a def is. The only differences are:

  • def is a statement, so you can't put it in the middle of an expression.
  • lambda is an expression, so you can't include any statements in it.
  • def gives a function a name.

Other than that, the functions they compile work exactly the same. For example:

func = lambda x: expr(x)

def func(x): return expr(x)

… defines two functions with identical bytecode, and almost everything else the same, except that func.__name__ is 'func' for the def but something like '<lambda>' for the lambda.
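You can check this yourself. The sketch below assumes CPython and uses two different names, `f` and `g` (both made up for illustration), so the two functions can be compared side by side:

```python
# The same body written once as a lambda and once as a def.
f = lambda x: x + x


def g(x):
    return x + x


# Compare the compiled bytecode and the recorded names.
same_bytecode = f.__code__.co_code == g.__code__.co_code
names = (f.__name__, g.__name__)
print(same_bytecode, names)
```

The only visible difference is the name: the lambda reports `<lambda>` while the def reports `g`.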

More importantly, if you want to throw a loop or a test into the function, with lambda you'll have to contort it into a comprehension or an if expression; with def, you can do that if it's appropriate, or use a compound statement if it isn't.

But, on the other hand, if there's no good name for the function, and it really isn't worth thinking about beyond its use as a callback function, lambda is better. For example, if you're defining a function just to return x - 3 that's only going to be used once, def would be silly.
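As a rough illustration of that trade-off (the list `offsets` and the name `minus_three` are made up for this sketch):

```python
offsets = [10, 7, 3]

# A throwaway callback: giving it a name would add nothing.
shifted = list(map(lambda x: x - 3, offsets))


# The def equivalent behaves the same but costs a name and extra lines.
def minus_three(x):
    return x - 3


shifted_with_def = list(map(minus_three, offsets))
print(shifted, shifted_with_def)
```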

abarnert
  • That last function works! Thanks! I am trying to break it down, but can't quite understand what the purpose of the `* len(x)` is? Or why you need to append `""` to the `[first]` ? Edit: And thanks so much for the excellent explanation on `lambda` :D – BruceWayne Jul 25 '18 at 17:42
  • @BruceWayne Your code returns `""` unless the news-source condition holds and `idx==0`, so there are either `len(x)` empty strings, or one non-empty string and `len(x)-1` empty strings. I rewrote it to make that fact more obvious, but it may be more readable using a comprehension even after that rewrite. See the edit and let me know what you think. – abarnert Jul 25 '18 at 17:47
1

First things first: What's your code doing?

Let's read carefully:

lambda x: ["background: rgba(255,0,0,.3)" 
        if('BBC' in x['newsSource'] or 'Wall' in x['newsSource']) and
        idx == 0 else ""
        for idx, v in enumerate(x)]

You're only interested in the first element of x, because you check idx == 0. Remember that the index will be zero only in the first iteration. So if x has 1,000,000 elements, you'll be evaluating 999,999 useless if conditions.

As far as I can understand, the explanation of your algorithm is:

Create a list of the same length as x in which every element is an empty string. If 'BBC' or 'Wall' is present in x['newsSource'], make the first element of this new list the string background: rgba(255,0,0,.3). Return this new list.

That's easy to codify:

def mysterious_function(x):
    new_list = [''] * len(x)

    if 'BBC' in x['newsSource'] or 'Wall' in x['newsSource']:
        new_list[0] = 'background: rgba(255,0,0,.3)'

    return new_list

You can now use the mysterious function in your current code:

styled_df = df.style.apply(mysterious_function, axis=1)

Isn't this better?

(And please give the function a better name)
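As for adding more conditions later: one possible approach, sketched here under assumptions, is to keep the keywords in a tuple and test them with `any()`, so a new source means one more entry rather than one more `or`. The names `KEYWORDS`, `HIGHLIGHT` and `highlight_first_cell` are hypothetical, and plain dicts stand in for DataFrame rows so the sketch runs without pandas.

```python
# Hypothetical names for illustration; none of these come from the question.
KEYWORDS = ("BBC", "Wall")
HIGHLIGHT = "background: rgba(255,0,0,.3)"


def highlight_first_cell(row):
    """Return one CSS string per cell, styling only the first cell."""
    styles = [""] * len(row)
    # True if any keyword appears in the row's news source.
    if any(keyword in row["newsSource"] for keyword in KEYWORDS):
        styles[0] = HIGHLIGHT
    return styles


# Plain dicts stand in for DataFrame rows here.
bbc_row = {"newsSource": "BBC News", "title": "Example headline"}
other_row = {"newsSource": "Reuters", "title": "Another headline"}
print(highlight_first_cell(bbc_row))
print(highlight_first_cell(other_row))
```

With a real DataFrame, it should slot in the same way as before: `df.style.apply(highlight_first_cell, axis=1)`.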

Gabriel