3

I've been coding in scripting languages like Lua recently, and the existence of anonymous inner functions has got me thinking. How can a language implemented in C, like Lua, have inner functions when in C, no matter what you do, you cannot escape the fact that functions must be declared beforehand, at compile time? Does this mean that there is actually a way to achieve inner functions in C, and it's simply a matter of implementing a huge code base to make them possible?

For example

void *block = malloc(sizeof(1) * 1024); // somehow 
// write bytes to this memory address to make it operate
// like an inner function?
// is that even possible?
char (*letterFunct)(int) = ((char (*letterFunct)(int))block;
// somehow trick C into thinking this block is a function?
printf("%c\n", (*letterFunct)(5)); // call it

What key concept am I missing that would explain how languages with advanced features (classes, objects, inner functions, multithreading) can be implemented in languages that lack all of them?

ks1322
Hatefiend
  • 2
    I don't know Lua, but I suspect that it *doesn't* work by generating C code from your Lua code. So the existence (or lack) of C syntactic features is irrelevant. – Oliver Charlesworth Feb 19 '18 at 20:37
  • @OliverCharlesworth Lua was just an example. Surely there are languages written in regular C which have non-C features. How does that exchange work? – Hatefiend Feb 19 '18 at 20:40
  • Not familiar with lua, but just like Python, its function is a sophisticated C structure, and the function body is **not** low level instruction, it's some high level byte codes to be interpreted in a loop. – llllllllll Feb 19 '18 at 20:40
  • Same principle applies. Unless the model is that the user's source code is first converted to C source code, it doesn't matter what syntax C has. – Oliver Charlesworth Feb 19 '18 at 20:41
  • C has all the syntactic elements to make your example work. The big roadblock is OS security, which won't let you mark `block` as executable. – user3386109 Feb 19 '18 at 20:41
  • @liliscent Structures must also be declared beforehand just like functions. How would that work. – Hatefiend Feb 19 '18 at 20:43
  • @Hatefiend All functions belong to the same structure, the difference is the content inside the structure, **not** the structure itself. As a simplified model, you can store all byte codes of a function to a `char*`. – llllllllll Feb 19 '18 at 20:45
  • I'm not sure Einstein's children are all geniuses. The same principle holds for your question :) – Aif Feb 19 '18 at 20:46
  • Note that `((char (*letterFunct)(int))block` is undefined behavior (UB) in C. C does not specify that a `void*` can be converted to a function pointer type, so the rest of the code is moot. – chux - Reinstate Monica Feb 19 '18 at 20:54
  • Note that gcc has an extension, allowing [Nested Functions](https://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html). – ks1322 Feb 19 '18 at 21:19
  • It is not that it is *impossible* to have [nested functions in C](https://stackoverflow.com/questions/2608158/nested-function-in-c). It's just that the language standard doesn't say they should exist. – Bo Persson Feb 19 '18 at 21:19
  • 1
    C is written in assembler. (It's not, but bear with me for a moment.) C has structures. Assembly language does not have structures. How can this be? – Steve Summit Feb 19 '18 at 22:48

5 Answers

5

Just because the compiler / interpreter for a particular language is written in C doesn't mean that that language has to translate into C and then get compiled.

I don't know about Lua, but in the case of Java the code is compiled to Java Byte Code, which (loosely speaking) a Java VM reads and interprets.

The original C compiler was written in assembly, and the original C++ compiler was written in C, so it's possible to write a compiler for a higher-level language in a lower-level language.

dbush
  • I do understand this concept but what about languages executed at the C level. Are they forbidden from having inner functions / non-C features? – Hatefiend Feb 19 '18 at 20:48
  • 2
    @Hatefiend What exactly does "executed at the C level" mean, and which languages do you think fit that definition? – dbush Feb 19 '18 at 20:49
  • To be honest I'm not quite sure. In Lua's source I can go into every aspect of the language and inspect the C files that make everything work. I assume that means that those C files need to figure out a way how to turn my scripting code into valid C code. How that happens with advanced features like inner functions is what my question is about. – Hatefiend Feb 19 '18 at 20:52
  • 1
    C is a "Turing Complete" language which means that you can do anything you want with it, though as with any Turing Complete language, *expressing* that anything might be far from easy. You can write object-oriented C, for example, and you can create lambdas, but usually people who want those things steer towards languages that have native support for such things so expressing that code is easier. – tadman Feb 19 '18 at 20:54
  • Remember at the end of the day C compiles down to machine code just like many other languages, but that doesn't mean all languages are equally good at expressing that same machine code. Each language has limitations, and we choose the language based on the limitations we're prepared to deal with because of the benefits we get. – tadman Feb 19 '18 at 20:55
  • 1
    @Hatefiend A compiler, just like any program, operates on a set of data and generates output based on it. In the case of a compiler, it reads program instructions and outputs a set of machine instructions to do what the user requested. How it does that is not trivial, but it can be done. – dbush Feb 19 '18 at 20:57
  • 1
    @Hatefiend - "I assume that means that those C files need to figure out a way how to turn my scripting code into valid C code" - that's not correct. Lua likely implements a *virtual machine*, but alternatives include an interpreter, or a compiler straight down to native machine code. None of these require turning your script into C. – Oliver Charlesworth Feb 19 '18 at 20:57
  • ... and the original assembler was written with [switches](https://digital.com/wp-content/uploads/1024px-DEC_PDP-11_20_computer_at_the_Computer_History_Museum.jpg). Have to start somewhere. – chux - Reinstate Monica Feb 19 '18 at 20:58
  • @chux, Front panel switches were one place where you could start, but not the only place: https://en.wikipedia.org/wiki/Core_rope_memory https://en.wikipedia.org/wiki/Diode_matrix http://www.computerhistory.org/revolution/mainframe-computers/7/164/578 – Solomon Slow Feb 19 '18 at 22:18
3

Closures and inner functions usually involve passing an extra (hidden) argument to the function that holds the environment of the closure. C has no such hidden arguments, so to implement closures or inner functions in C you need to make them explicit. An "inner function" in C might end up looking something like this:

struct shared_locals {
    // locals of function shared with inner function
};

int inner_function(struct shared_locals *sl, /* other args to inner function */...) {
    // code for the inner function -- shared locals accessed via sl
}

int function(...) {
    struct shared_locals sl;  // the shared locals

    // call inner function directly
    inner_function(&sl, ...);

    // pass inner function as a callback
    func_with_callback(inner_function, &sl);
}

The above kind of code is why 'callbacks' in C code usually involve both a function pointer and an extra void * argument that is passed to the callback.

Chris Dodd
1

To implement inner functions, you need closures. To implement closures you need some more advanced mechanism of allocating local variables than just the stack. C was meant to be a lightweight language, so advanced concepts like closures and garbage collectors were excluded.

C++ is a kind of extension of C that adds many of these advanced concepts, including closures in the form of lambda expressions (since C++11).

Your example with a memory block that you fill with machine code: you can do that, but it will not be portable. It would require cooperation from the operating system (the memory has to be marked executable) and from the compiler. The only portable solution I can think of would be embedding the compiler into every executable, which is again too much.

And this would still be a normal non-inner function. To implement inner functions compiled at runtime you would again need closures.

haael
  • [Objective-C](https://en.wikipedia.org/wiki/Objective-C) is also a rethinking of how C works which implements similar features, but has a completely different philosophy as to how. – tadman Feb 19 '18 at 20:56
  • A stack should be sufficient for implementing closures ;) – dualed Feb 23 '18 at 07:40
1

Lua is a program---just like any other program---you run it, it reads input, it produces output. When you run lua MyProgram.lua, the lua program reads from the file MyProgram.lua, and it writes output to the console. As with many other programs, what it spits out depends on what it read in.

The lua program is written in C.

If your MyProgram.lua file contains print("x") at top level, then when the lua program reads that line it will print x.

Note: It was lua that printed x. It wasn't really MyProgram.lua. MyProgram.lua is just a data file. The lua program reads it in, and it uses the data to decide what it's supposed to do.

When the lua program reads that line, it doesn't "translate" the line into C or into any other language. It just does what the line says to do. It prints x.

There's a name for that: We say that the lua program interprets MyProgram.lua.

Note: I lied. The lua program doesn't really do anything. The lua program is just a data file. When you type lua MyProgram.lua, the computer reads the data into memory, and then it uses the data to decide what it is supposed to do.

When we talk about a computer system, we speak at different levels of abstraction. When we say, "the computer hardware did X," we are speaking about a low level of abstraction. When we say, "MyProgram.lua did Z", we are speaking about a higher level of abstraction. And, when we say that the lua interpreter did something, we are talking about a level somewhere in-between.

In between the hardware and the end user's experience, you can find many levels of abstraction if you look deep enough.

But, back to Lua...

If your MyProgram.lua contains function p() print("y") end at top level, then the Lua program doesn't do anything with that right away. It just remembers what you wanted p() to mean. Then later, if it sees p() at top-level, then it prints y.

You could write the program that does those things (i.e., you could write Lua) in almost any language. Your choice of what language you used to implement lua might affect the internal architecture of your Lua interpreter, but it need not limit the language that your interpreter understands (i.e., the Lua language) in any way.

Solomon Slow
  • Thank you for the write-up. I definitely understand what is meant by an interpreted language now. Still, `The lua program is written in C` is the part that doesn't make sense to me. When the Lua C program parses an inner function, somehow it's programmed in such a way, in the C environment, to handle the abstract idea of nested functions. How? It's that layer that I don't understand. – Hatefiend Feb 20 '18 at 00:14
  • @Hatefiend -- I don't understand the confusion. Functions are an abstraction, and Lua functions are not C functions. – ad absurdum Feb 20 '18 at 00:35
  • At some point, a Lua/Python/etc interpreter needs to understand and control nested Lua/Python/etc functions, and all it has to work with is the regular C environment. How does it do this? Eg. the interpreter reads the next line and the user wants to make a nested function. How does/could the C interpreter handle such a request? – Hatefiend Feb 20 '18 at 01:52
  • @Hatefiend, There is no C interpreter. C is a compiled language: The C compiler reads in a C-language program, and it outputs a _native executable program_. That is, a program that can be interpreted by the computer hardware. But that doesn't matter, because... – Solomon Slow Feb 20 '18 at 14:19
  • 1
    ...A Lua function is just a data object in the Lua interpreter (i.e., in the C program). The Lua interpreter does _not_ translate Lua functions into C functions. When the Lua interpreter sees `function(...) ... end` at top-level, it creates a new function object that records the statements in the function body. Then later, when Lua sees a call to that function, it creates a new _function activation_ (a place to hold the function's arguments and local variables) and it interprets the statements in the body of the function in the same way that it interprets top-level statements. – Solomon Slow Feb 20 '18 at 14:25
  • ...By the way: What you call "nested" functions in Lua really should be called "closures." The `function(...)...end` construct in Lua is an [expression](https://en.wikipedia.org/wiki/Expression_(computer_science)), just like `5` is an expression and `"seven"` is an expression and `{x="seven", f=5}` is an expression. The value of `function(...)...end` is a [closure](https://en.wikipedia.org/wiki/Closure_(computer_programming)) that captures the statements in the body of the function, plus all of the _free variables_ to which those statements refer. – Solomon Slow Feb 20 '18 at 14:39
  • ...For more reading: https://en.wikipedia.org/wiki/Interpreter_(computing) – Solomon Slow Feb 20 '18 at 14:43
1

You're confusing the C source code with the binary executable. The Lua interpreter (the program that reads and runs Lua scripts) is written in C. But after it's compiled, it's not C anymore. It would behave the same if it were written in Fortran (assuming it compiled to the same binary CPU instructions).

There's no such thing as a "running C environment". There are only binary machine instructions. The CPU doesn't know C any more than it knows French.

As for how Lua handles inner functions, the designers of Lua sat down and figured out all of the context they would need to keep track of whenever the interpreter encounters an inner function, and wrote the code to assemble and keep track of that context for as long as the inner function is viable. The inner function is a specifically Lua construct -- it has nothing to do with C, because when the Lua interpreter is running, there is no C anywhere.

Mark Benningfield