Computer Architecture: How do applications communicate with an operating system?

Question

Prelude: This is admittedly a fairly broad question regarding computer architecture, but one that I hear from others and wonder about quite often myself. I also don't think that there is a direct or quick answer to this. However, I was hoping someone well-versed in systems architecture could provide some insight.

Some background: I am primarily a full-stack developer focusing mostly on web technologies and databases. I do have some background in C and tinkering with a good deal of low-level stuff, but that was a very long time ago and was non-academic. As such, I never got very deep into OS architecture, and this is one piece that eludes me. I am aware of various techniques and methods of accomplishing these tasks (especially on a higher level with technologies geared for this purpose), but am lacking a holistic picture/understanding of the low-level logistics of how this happens - particularly on an OS level.

The general question is: how do applications running inside of a "container" actually talk to the running instance of that container? By "container", I mean an instance of running code which is already loaded into memory (examples of such code could be an operating system, a graphics drawing interface, an application server, a driver, etc).

Also, this question applies only to compiled code, and to communication between systems running on the same machine.

For example

Let's say I build a simple library whose purpose is to draw a pixel on a screen. Let's also say this library has one method, drawPixel(int x, int y).

The library itself manages its own drawing context (which could be anything from a raw SVGA buffer to a desktop window). Applications using this API simply link dynamically against the library, and call the drawPixel method, without any awareness of the library's exact actions after the call.

Under the hood, this drawPixel method is supposed to draw to a window on the desktop, creating it if it doesn't exist on the first call.

However, if the setup were really that simple, each calling application would load and run its own copy of drawPixel and its dependencies, so each running application would have its own instance of the entire call chain. If drawPixel were called by 5 different applications, you'd end up with 5 different windows instead of a shared context to one window. (I hope I'm explaining this right)

So, my question is, how does this "sharing" happen in modern operating systems?

Would the code for drawPixel actually be replaced with IPC code? Or would it be regular graphics code, but somehow "loaded" into the OS in a way that there is one universally accessible running instance of it, which other applications call at-will?

Some cases I'm aware of

I know that there are many approaches to this issue, and am aware of a few of them. However, all of these seem to address specific niches and have shortcomings; none appear to be comprehensive enough to explain the incredible capabilities (regarding interconnectedness of OS & app services) of modern application ecosystems.

For example:

In the old (DOS) days, I believe app <-> OS communication was accomplished via system interrupts.
In the UNIX world, this is done via stdin/stdout pipes on the console, and a network protocol in X Windows.
There were IPC platforms like COM+/DCOM/DCOP/DBus on Windows & Linux, but again, these appear to be geared at a specific purpose (building & managing components at scale; predecessors of present-day SOA).

The question

What are some of the other ways that this kind of communication can be facilitated? Or, more specifically, how "is this done" in a traditional sense, especially when it comes to OS APIs?

Some examples of more specific questions:

How does a kernel "load" a device driver on boot, which runs its own code (in an isolated space?) but still talks to the kernel above it, which is currently running in memory? How does this communication happen?
How are windowing subsystems (with the exception of X and Quartz, which use sockets) talked to by applications? I think WIN32 used interrupts (maybe it still does?), but how does the newer stuff work? I'd be very surprised to find out that even in the present day, sophisticated frameworks like WPF or Metro still boil down to calling interrupts. I'm actually not sure that WIN32 APIs are even used by these systems.
What about lower-level graphics subsystems like GDI+ and the Linux Framebuffer?

Note: I think in the case of WIN32 (and possibly GDI+), for example, you get a pointer (handle) to a context, so the concept is effectively "shared memory". But is it as simple as that? It would appear pretty unsafe to just get a raw pointer to a raw resource. Meaning, there are things that protect you from writing arbitrary data to this pointer, so I think it is more complex than that.

(this might be a bit out of context as it's JVM-specific) How do servlets running inside an application server talk to the actual application server? Meaning, how do they load themselves "inside the context" of the currently running server?
Same question for IIS - How exactly is the plumbing set up so that IIS can control and communicate back & forth with a separate process running an ASP.NET application?

Note: I am not sure if this question makes much sense and may admittedly be dumb or poorly-worded. However, I was hoping that my point came across and that someone with a systems background could chime in on the standard "way of doing things" when it comes to these scenarios (if there is such a thing).

Edit: I am not asking for an exhaustive list of IPC methods. There is a specific concept that I am trying to find out about, but I am not familiar with the correct terminology and so am having trouble finding the words to pinpoint it. This is why this question comes with so many examples, to "eliminate" the parts that the question does not target.

It is very broad and every question would find lots of answers and explanations with a simple web search. — Sami Kuhmonen, Jan 09 '16 at 06:44
@SamiKuhmonen I am aware of the high-level descriptions of the "techniques" used to accomplish this without Google. That's why I laid them out in my question. However, the underlying principle is not one I was able to find anywhere. — Ruslan, Jan 09 '16 at 06:49
Each language has its own compiler/runtime-environment which is setup to interact with OS by using system calls of the underlying OS. I am not an expert, but, this question can't be answered here as it is `(too)^n broad`, where n->a very large value. I hope this point is itself sufficient here for you to start searching on Google/web. — Am_I_Helpful, Jan 09 '16 at 06:54
@Am_I_Helpful I'm aware of the first point; my question was more about languages without sophisticated runtimes (or, in the case of those languages, how the actual plumbing is set up in the runtime. Most if not all of these runtimes still come down to C API calls). Also, I think there can be an answer; the reason I provided examples in my post is to try to narrow it down to the specific concept that I'm talking about (which I don't know the name of, hence such a long post) — Ruslan, Jan 09 '16 at 07:03
BTW, my understanding is that you are legally not allowed to understand how Windows work in the details (because it is proprietary software), unless you buy some very expensive source license and sign some NDA. This is why I prefer using free software — Basile Starynkevitch, Jan 09 '16 at 07:47
@BasileStarynkevitch Interesting... Actually, I thought this is something that people learn in any undergrad OS class (which I never took, so I wouldn't know), which is why I was embarrassed to even ask this. But no, you're allowed to understand how Windows works. Mark Russinovich actually built a whole company (WinInternals) around very deep knowledge of undocumented areas of Windows, before Microsoft bought it in 2006 and made him CTO of Azure. — Ruslan, Jan 09 '16 at 07:58
But did WinInternals publish its knowledge? Would it have been allowed to do that? — Basile Starynkevitch, Jan 09 '16 at 08:02
Hm, not sure... I'll have to look into that. I've definitely come across a lot of blogs and articles (even in magazines) that cover undocumented Windows functionality (especially a few years ago when I was debugging a memory leak in IIS), so it is definitely done and is probably safe from a practical angle. But whether it is actually "legal" and something that Microsoft could pursue you for if they wanted - I'm not sure. — Ruslan, Jan 09 '16 at 08:05

Basile Starynkevitch · Accepted Answer · 2017-10-18T14:57:57.453

The question is too broad, but here are some points (related to Linux; the principles should be the same for Windows, but you are probably not permitted to study all of its internals):

The elementary system calls (those listed in syscalls(2)...) are invoked by a single machine instruction (e.g. SYSENTER or SYSCALL on x86) which switches the processor into kernel mode; the system call number and arguments are passed in specific registers, as defined by the ABI. Hence user-space code can be viewed as running in a virtual machine defined by the user-mode instructions plus the system call primitives. BTW, the Linux kernel can load kernel modules to add code (such as device drivers) to itself, and that is also done through system calls.
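As a rough illustration of that boundary, here is a minimal sketch (in Python only for brevity, assuming a Unix-like system). It calls the C library's write() wrapper directly through ctypes. That wrapper is ordinary user-space code whose job is to put the call number and arguments where the ABI expects them and then trap into the kernel. The names libc_write and round_trip are made up for this example.

```python
import ctypes
import os

# Load the C library that is already mapped into this process.
libc = ctypes.CDLL(None, use_errno=True)
libc.write.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_size_t]
libc.write.restype = ctypes.c_ssize_t


def libc_write(fd: int, data: bytes) -> int:
    """Call libc's write(), the user-space wrapper around the write system call."""
    written = libc.write(fd, data, len(data))
    if written < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return written


def round_trip(message: bytes) -> bytes:
    """Push bytes into a kernel pipe buffer and read them back out."""
    read_fd, write_fd = os.pipe()  # pipe(2): the kernel hands back two descriptors
    try:
        written = libc_write(write_fd, message)  # user mode -> kernel mode -> back
        return os.read(read_fd, written)         # read(2) copies the bytes back
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(round_trip(b"hello kernel\n"))
```

The application never touches the pipe buffer itself. It only holds small integers (file descriptors) that the kernel interprets on each call.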

The inter-process communication facilities are built on top of these system calls (and are often wrapped by the standard library in higher-level functions; e.g. getaddrinfo(3) might interact indirectly with some DNS service, see nsswitch.conf(5)). Read Advanced Linux Programming for more details. In practice you'll need several server programs (an idea pushed to its extreme in microkernel approaches), notably (on recent Linux) systemd. Drivers and kernel modules are loaded by specific system calls (e.g. init_module(2)) and afterwards are part of the kernel, so they are usable through other system calls. Play with strace(1) to see the actual system calls made by some Linux program. Some information is provided by the kernel through pseudo file systems (see proc(5)...), which are also accessed through system calls.
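To make the pseudo file system point concrete, here is a small Linux-only sketch. It reads /proc/self/status using nothing but the low-level open/read/close wrappers, so each step corresponds to a system call you could watch with strace(1). The helper names read_pseudo_file and process_status are invented for the example.

```python
import os


def read_pseudo_file(path: str) -> str:
    """Read a whole file using only the thin open/read/close wrappers."""
    fd = os.open(path, os.O_RDONLY)      # open system call
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 4096)    # read system call
            if not chunk:                # empty result means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks).decode("utf-8", errors="replace")
    finally:
        os.close(fd)                     # close system call


def process_status() -> dict:
    """Parse the 'Key: value' lines the kernel generates for this process."""
    fields = {}
    for line in read_pseudo_file("/proc/self/status").splitlines():
        key, _, value = line.partition(":")
        fields[key] = value.strip()
    return fields


if __name__ == "__main__":
    status = process_status()
    print(status["Name"], status["Pid"])
```

No file with that content exists on disk. The kernel produces the text when the read arrives, which is why the same file API can double as a query interface to the kernel.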

Every communication from a user program to the kernel goes through system calls, and IPC between processes is itself implemented with system calls. Sometimes the kernel makes an upcall into user code (on Linux, with signals).
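A signal-based upcall can be sketched as follows (a minimal example, assuming a Unix-like system and that it runs in the main thread). The process registers a handler, asks the kernel to send it SIGUSR1 via kill(2), and the kernel then causes the handler to run. The function name receive_upcall is hypothetical.

```python
import os
import signal
import time


def receive_upcall(timeout: float = 1.0) -> list:
    """Send SIGUSR1 to ourselves and collect what the handler observed."""
    received = []

    def handler(signum, frame):
        # Runs as a consequence of the kernel delivering the signal.
        received.append(signum)

    previous = signal.signal(signal.SIGUSR1, handler)
    try:
        os.kill(os.getpid(), signal.SIGUSR1)  # kill(2): ask the kernel to signal us
        deadline = time.monotonic() + timeout
        while not received and time.monotonic() < deadline:
            time.sleep(0.01)                  # give the handler a chance to run
        return received
    finally:
        # Restore whatever handler was installed before the experiment.
        signal.signal(
            signal.SIGUSR1,
            previous if previous is not None else signal.SIG_DFL,
        )


if __name__ == "__main__":
    print(receive_upcall())
```

The direction is reversed compared with a system call: the program does not call the kernel here, the kernel interrupts the program and diverts it into code the program registered earlier.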

The Linux framebuffer (and the physical keyboard & mouse) is generally accessed by a single server only, the X11 or Wayland server, and other desktop applications communicate with that server using the usual IPC facilities (sockets).
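Applied to the drawPixel example from the question, the single-server idea might look like the toy sketch below. It is not how X11 or Wayland actually encode requests. The wire format (two 16-bit integers, one acknowledgement byte), the class PixelServer and the functions draw_pixel, recv_exact and demo are all assumptions made for illustration. One server owns the "framebuffer", and the library function that applications call is only a stub that sends a request over a Unix-domain socket.

```python
import os
import socket
import struct
import tempfile
import threading

WIDTH, HEIGHT = 8, 4
REQUEST = struct.Struct("!HH")  # hypothetical wire format: x and y as 16-bit ints


def recv_exact(conn, size):
    """Read exactly `size` bytes, or return b"" if the peer disconnected."""
    buf = b""
    while len(buf) < size:
        chunk = conn.recv(size - len(buf))
        if not chunk:
            return b""
        buf += chunk
    return buf


class PixelServer:
    """Toy display server: the only code that ever touches the framebuffer."""

    def __init__(self, path):
        self.framebuffer = [[0] * WIDTH for _ in range(HEIGHT)]
        self.listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.listener.bind(path)
        self.listener.listen()
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        while True:  # for simplicity, clients are served one at a time
            conn, _ = self.listener.accept()
            with conn:
                self._handle(conn)

    def _handle(self, conn):
        while True:
            data = recv_exact(conn, REQUEST.size)
            if not data:
                return
            x, y = REQUEST.unpack(data)
            ok = x < WIDTH and y < HEIGHT  # the server validates every request
            if ok:
                self.framebuffer[y][x] = 1
            conn.sendall(b"\x01" if ok else b"\x00")


def draw_pixel(conn, x, y):
    """Client-side stub: looks like a local call, but only sends a message."""
    conn.sendall(REQUEST.pack(x, y))
    return conn.recv(1) == b"\x01"


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pixels.sock")
        server = PixelServer(path)
        results = []
        for x, y in [(1, 1), (2, 3), (99, 0)]:  # each pass plays a separate app
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as app:
                app.connect(path)
                results.append(draw_pixel(app, x, y))
        return results, server.framebuffer


if __name__ == "__main__":
    print(demo())
```

Every "application" links the same stub, yet all of them end up drawing into the one buffer held by the server process. The server can also reject bad requests, which is the protection a raw shared pointer would not give you.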

Read also a good book on operating systems, e.g. the freely downloadable Operating Systems: Three Easy Pieces.

For Windows, MacOSX and Android, it is very similar. However, since Windows (etc.) is proprietary software, you might not be able to know all the details (and you might not be allowed to reverse-engineer them). In contrast, Linux is free software, so you can study its source code.

My advice would be to understand in detail how Linux works (this would take several years) and study some relevant source code (which is possible for free software). If you need a deep understanding of Windows, you might need to buy a source code license for it (probably millions of dollars) and sign an NDA. I don't know Windows at all, but AFAIK it is only defined by a huge API in C. Rumor has it that the Windows kernel is microkernel-like, but Microsoft has an economic interest in hiding ugly implementation details.
