Recursion | Two function names

Question

Below is code from Douglas Crockford's JavaScript: The Good Parts.

For the most part the code makes sense. However, I don't understand this line:

var walk_the_DOM = function walk(node, func) {

as it appears the function is given two names: walk_the_DOM() and walk().

Further down, the function is actually called both ways, so both names do reference it.

Why is this function given two names?

// Define a walk_the_DOM function that visits every
// node of the tree in HTML source order, starting
// from some given node. It invokes a function,
// passing it each node in turn. walk_the_DOM calls
// itself to process each of the child nodes.

var walk_the_DOM = function walk(node, func) {
    func(node);
    node = node.firstChild;
    while (node) {

        // walk() called here

        walk(node, func);
        node = node.nextSibling;
    }
};

// Define a getElementsByAttribute function. It
// takes an attribute name string and an optional
// matching value. It calls walk_the_DOM, passing it a
// function that looks for an attribute name in the
// node. The matching nodes are accumulated in a
// results array.

var getElementsByAttribute = function (att, value) {
    var results = [];

    // walk_the_DOM() called here

    walk_the_DOM(document.body, function (node) {
        var actual = node.nodeType === 1 && node.getAttribute(att);
        if (typeof actual === 'string' &&
                (actual === value || typeof value !== 'string')) {
            results.push(node);
        }
    });

    return results;
};

Dmytro Shevchenko · Accepted Answer · 2012-09-12T12:23:42.930

5

I would imagine this is so that the recursion works safely.

For example, if the function used only the walk_the_DOM name, that variable could be reassigned later or be inaccessible because of scope, so it is not safe to rely on it inside the function itself.

UPDATE:

I've done some research, and here's what I found. First of all, refer to the ECMAScript 5 specification, section 13. There are two ways to define a function: a) using a FunctionDeclaration, and b) using a FunctionExpression.

They look very similar, but are also somewhat different. The following is a FunctionDeclaration:

function f() { };

These two are both FunctionExpression:

var x = function() { };
var y = function f() { };

The interesting part for us is FunctionExpression. In the case of var y = function f() { }, the identifier f is visible only inside the function body. In other words, anywhere outside of { }, typeof f evaluates to "undefined".
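As a minimal sketch of that scoping rule (the variable names `seenInside` and `seenOutside` are made up for illustration, and a standards-compliant engine is assumed):

```javascript
// Named function expression: `f` is bound only inside the function body.
var y = function f() {
    // Inside the body, `f` refers to the function itself.
    return typeof f;
};

var seenInside = y();        // "function"
var seenOutside = typeof f;  // "undefined": `f` never enters the enclosing scope

console.log(seenInside, seenOutside);
```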

Now it's time for some practical examples. Let's say we want to write a recursive function using FunctionDeclaration:

function f1(x) { x > 5 ? console.log("finished") : f1(x + 1) };

Now we want to "copy" the function to another variable and to set f1 to something else:

var f2 = f1;
var f1 = function() { console.log("Kitteh") };

But this doesn't work as expected, because the recursive call inside the body looks up f1, which now refers to the new function:

f2(1); // outputs: Kitteh

Now, if you use Douglas Crockford's way of defining a recursive function:

var f1 = function myself(x) { x > 5 ? console.log("finished") : myself(x + 1) };

This way you can assign the function to any variable as many times as you want. At the same time, the function always calls itself through myself, and not whatever function is currently assigned to the variable f1.
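The two variants can be compared side by side in a sketch like the following. It returns strings instead of logging so the results are easy to check; the names `declared`, `named`, and the alias variables are hypothetical.

```javascript
// Variant 1: recursion through the outer name, which can be rebound later.
function declared(x) {
    return x > 5 ? "finished" : declared(x + 1);
}
var declaredAlias = declared;
declared = function () { return "Kitteh"; };

// Variant 2: recursion through the inner name of a named function expression.
var named = function myself(x) {
    return x > 5 ? "finished" : myself(x + 1);
};
var namedAlias = named;
named = function () { return "Kitteh"; };

var viaOuterName = declaredAlias(1); // "Kitteh": the recursive call hits the replacement
var viaInnerName = namedAlias(1);    // "finished": `myself` still points at the original

console.log(viaOuterName, viaInnerName);
```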

So the answer to the initial question: you define recursive functions in this manner because it is the most flexible and robust way.

edited Sep 12 '12 at 12:23

answered Sep 10 '12 at 18:42

Dmytro Shevchenko


@HiroProtagonist it's basically elaborated in the comments from Crockford already ;) – madflow Sep 10 '12 at 18:45
not exactly author's preference, one way is defined at parse-time and other is defined at run-time: http://stackoverflow.com/a/336868/1431600 – Evandro Silva Sep 10 '12 at 19:07
@EvandroSilva I am aware of the difference. Still, which way to choose is mostly up to the developer. – Dmytro Shevchenko Sep 10 '12 at 19:46
@HiroProtagonist I edited my answer. I hope it makes it clear now. – Dmytro Shevchenko Sep 12 '12 at 12:09
you get the answer for re-search effort but given your research...but once again...why didn't he just use a function declaration `function use_this_name_only(){}` which has visibility from inside and outside itself and use it in both cases? – Sep 13 '12 at 00:11
...note your function declaration is defined at parse time which might be one reason...but I don't think so b.c. the ordering is such that it does not matter.... I still don't see the reason. – Sep 13 '12 at 00:17
@HiroProtagonist *"why didn't he just use a function declaration function use_this_name_only(){}"* – it was possible to do it like this, but doing it with two names and *FunctionExpression* is a best practice, apparently. Because it's more robust, which I showed in my examples. Maybe it's not clear from the examples? – Dmytro Shevchenko Sep 13 '12 at 07:53
I get it...that is actually a special form....that's what confused me...don't view it as combining two types of function creation...but just as a special form or actually I'd call it the third way to define a function....it essentially hides the function declaration 'part' into the inner scope...damn this was the longest question of all time..thanks for your patience. – Sep 13 '12 at 19:52
@HiroProtagonist I don't think it is a third form. It is a `FunctionExpression`. In a `FunctionExpression`, the given name is **always** private to the function - provided a name has been given, of course. It just looks very similar to `FunctionDeclaration`, that's what is confusing. – Dmytro Shevchenko Sep 13 '12 at 20:45
but in a simple function declaration...the name is not private to the function...do you agree with this? – Sep 13 '12 at 21:48
@HiroProtagonist yes, though I'm not sure what you mean by "simple". In a FunctionExpression, the name is private, in FunctionDeclaration it isn't. The specification of the standard, which I linked in my answer, says: *"NOTE The Identifier in a FunctionExpression can be referenced from inside the FunctionExpression's FunctionBody to allow the function to call itself recursively. However, unlike in a FunctionDeclaration, the Identifier in a FunctionExpression cannot be referenced from and does not affect the scope enclosing the FunctionExpression."* – Dmytro Shevchenko Sep 13 '12 at 22:12
Your second sentence is incorrect. In a Function Expression the name is not private - `var someName = function(){}`...someName is not private. Maybe your definition of a function expression is different from mine. – Sep 16 '12 at 00:10
@HiroProtagonist `someName` is not the name of the function. It is the name of the variable that holds a reference to the function. Function itself is nameless, since you skipped the optional name. You can find a definition of `FunctionExpression` here: http://ecma262-5.com/ELS5_HTML.htm#Section_13 – Dmytro Shevchenko Sep 16 '12 at 07:33
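The "does not affect the scope enclosing the FunctionExpression" part of the specification note quoted in the comments can be illustrated with a small sketch. The `factorial` example and the variable names are invented here; they do not come from the book or the thread.

```javascript
// An outer variable that happens to share the function's inner name.
var f = "outer value";

// Inside the body, `f` is the function itself, so the recursion works.
var factorial = function f(n) {
    return n <= 1 ? 1 : n * f(n - 1);
};

var result = factorial(4);  // 24
var outerAfter = f;         // "outer value": the enclosing binding is untouched

console.log(result, outerAfter);
```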

score 1 · Answer 2 · answered Sep 10 '12 at 18:46

Because the walk function is recursive, it needs to be callable both from inside and from outside its own scope:

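var test = function test_recursive(num) {
    // Recurse through the inner name until num reaches 10.
    if (num < 10) {
        return test_recursive(num + 1);
    }
    return num;
};

// Outside the function, only the variable name is available.
var result = test(5); // 10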

answered Sep 10 '12 at 18:46

Evandro Silva


Note that inside the function, if you write just `test` instead of `test_recursive`, it will work. – Alexandre Khoury Sep 10 '12 at 18:53

score 1 · Answer 3 · edited Sep 10 '12 at 23:50

walk_the_DOM is the name you should use to call the function from outside itself.

For the recursion inside the function, they use the name walk, to save bytes and/or to type less.

EDIT: As Shedal pointed out, the variable walk_the_DOM could be reassigned later, in which case a function that recursed through that name would no longer work.
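To see the protection the inner name gives without a browser, here is a rough sketch that runs Crockford's function over plain objects standing in for DOM nodes. The `makeNode` helper, the `traverse` alias, and the node names are invented for this example; only `firstChild` and `nextSibling` mirror the real DOM properties the function uses.

```javascript
// Crockford's function, unchanged: the recursion goes through the inner name `walk`.
var walk_the_DOM = function walk(node, func) {
    func(node);
    node = node.firstChild;
    while (node) {
        walk(node, func);
        node = node.nextSibling;
    }
};

// Hypothetical stand-in for a DOM node; links the children as siblings.
function makeNode(name, children) {
    var node = { name: name, firstChild: null, nextSibling: null };
    (children || []).forEach(function (child, i, all) {
        if (i === 0) { node.firstChild = child; }
        child.nextSibling = all[i + 1] || null;
    });
    return node;
}

var tree = makeNode("body", [
    makeNode("div", [makeNode("span"), makeNode("em")]),
    makeNode("p")
]);

// Keep a second reference, then rebind the outer variable.
var traverse = walk_the_DOM;
walk_the_DOM = null;

var visited = [];
traverse(tree, function (node) { visited.push(node.name); });
console.log(visited.join(" ")); // body div span em p
```

Had the body called walk_the_DOM instead of walk, the first recursive call would have thrown a TypeError once the outer variable was set to null.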

edited Sep 10 '12 at 23:50

answered Sep 10 '12 at 18:46

Alexandre Khoury

