How system calls are handled in Linux on ARM machine

Question

I have some doubts about system calls in Linux on ARM processors.

On ARM, system calls are handled in SWI mode. My question is: do we perform all of the required work in SWI mode, or is only part of it done there before we move to some process context? As I understand it, some system calls can take significant time, and performing that work in SWI mode does not seem like a good idea.

Also, how do we return to the calling user process? That is, for a non-blocking system call, how do we notify the user that the task requested through the system call has completed?

The `SWI` is also called `SVC`. Related: [Mode SVC handler starts in](http://stackoverflow.com/questions/9044258/which-mode-does-the-svc-handler-start-in), [Linux ARM system call](http://stackoverflow.com/questions/11257186/linux-system-call), [System call in ARM](http://stackoverflow.com/questions/12946958/system-call-in-arm), [Linux process context and SVC](http://stackoverflow.com/questions/23406171/linux-process-context-and-svc-call-in-arm). Finally, I cover stacks, etc. in [ARM Linux exception stacks](http://stackoverflow.com/questions/22928904/linux-kernel-arm-exception-stack-init). — artless noise, Oct 15 '14 at 16:37
At least the final link gives the answer that Linux always switches to supervisor mode. Here, the active task has a kernel stack and `thread_info` anchored by the `r13/sp` register. On a context switch, all registers are loaded into the CPU atomically via `ldm`. For the Linux kernel, the `SWI` handler already runs in *supervisor mode* and the code is in [entry-common.S](https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/arch/arm/kernel/entry-common.S#n352) — artless noise, Oct 15 '14 at 16:46
Thanks for your reply. My question is whether we handle a system call in some process context or in SVC context. Also, please let me know how we return to the calling function of the user process once the system call has finished. — user2806748, Oct 16 '14 at 06:20
The `swi` in the user process causes an exception. On every exception, Linux saves context on the *supervisor mode* stack. A context switch may result, in which case the supervisor stack changes. The new supervisor stack has the new user space registers saved upon it. Also anchored in the supervisor stack is a `thread_info` which provides **MM** (memory management) info to update the MMU. This is answered in the other questions, if you read them. — artless noise, Oct 16 '14 at 14:26

score 1 · Answer 1 · answered Oct 17 '14 at 01:41

I think you're missing two concepts.

1. CPU privilege modes and the use of `swi` are both implementation details of system calls
2. Non-blocking system calls don't work that way

Sure, under Linux, we use swi instructions and maintain privilege separation to implement system calls, but this doesn't reflect ARM systems in general. When you talk about Linux specifically, I think it makes more sense to refer to concepts like kernel vs user mode.
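To make the "implementation detail" point concrete, here is a minimal sketch of invoking a system call by number from user space. It assumes a Linux system with glibc or a compatible C library; `raw_getpid` is a hypothetical helper name. On 32-bit ARM EABI, the library typically places the call number in `r7` and executes `svc 0`, but none of that is visible at the C level.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel for our PID through the generic
 * syscall() entry point instead of the getpid() wrapper. How the trap
 * into the kernel happens is left entirely to the C library.
 */
static long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

Calling `raw_getpid()` should give the same value as `getpid()`. The same source compiles unchanged on other architectures, where the trap instruction is different.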

The Linux kernel has been preemptive for a long time now. If your system call is taking too long and exceeds the time quantum allocated to that process/thread, the scheduler will just kick in and do a context switch. Likewise, if your system call just consists of waiting for an event (like for I/O), it'll just get switched out until there's data available.

Taking this into account, you don't usually have to worry about whether your system call takes too long. However, if you're spending a significant amount of time in a system call that's doing something other than waiting for an event, chances are that you're doing something in the kernel that should be done in user mode.

When the function handling the system call returns a value, it usually goes back to some sort of glue logic which restores the user context and allows the original user mode program to keep running.

Non-blocking system calls are something almost completely different. The system call handling function usually will check if it can return data at that very instant without waiting. If it can, it'll return whatever is available. It can also tell the user "I don't have any data at the moment but check back later" or "that's all, there's no more data to be read". The idea is they return basically instantly and don't block.
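The three outcomes described above can be sketched with a pipe. This is an illustration under assumptions, not kernel code: it assumes a POSIX system, and `set_nonblocking` and `try_read` are hypothetical helper names, as are the return codes `-1` and `-2`.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode.
 * Returns 0 on success, -1 on failure. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Hypothetical helper: make one read attempt that never waits.
 * Returns  >0  number of bytes placed in buf,
 *           0  end of data (the writer closed its end),
 *          -1  nothing available right now, check back later,
 *          -2  a genuine error.
 */
static long try_read(int fd, char *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n >= 0)
        return (long)n;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return -1;
    return -2;
}
```

With a pipe whose read end has been passed to `set_nonblocking`, `try_read` reports `-1` while the pipe is empty, the byte count once something has been written, and `0` after the write end is closed and the data is drained. At no point does the caller receive made-up data.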

Finally, on your last question, I suspect you're missing the point of a system call.

You should never have to know when a task is 'completed' by a system call. If a system call doesn't return an error, you, as the process, have to assume it succeeded. The rest is an implementation detail of the kernel. In the case of non-blocking system calls, the return value will tell you what to expect.
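As a rough sketch of what "doesn't return an error" looks like from the process side: on Linux the C library wrapper normally returns `-1` and sets `errno` when the kernel reports a failure, so the return value is the only completion signal the caller needs. `read_status` is a hypothetical helper name used for illustration.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: attempt a one-byte read() and report the outcome.
 * Returns 0 if the call succeeded, otherwise the errno value set by the
 * C library wrapper.
 */
static int read_status(int fd)
{
    char byte;
    if (read(fd, &byte, 1) == -1)
        return errno;
    return 0;
}
```

Passing an invalid descriptor such as `-1` yields `EBADF`, while a descriptor with data waiting yields `0`. In both cases the answer is available as soon as `read()` returns.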

If you can provide an example for the last question, I may be able to explain in more detail.

Thanks a lot for the detailed description. Let me state my doubts more clearly. — user2806748, Oct 18 '14 at 06:36
Let me state my doubts more clearly. Suppose I call a system call such as "read" to read file data in user space. How can I know whether it is a blocking or a non-blocking call? Also, regarding your statement "Non-blocking system calls are ....": how is the actual data returned in the case of a non-blocking call? If the call returns instantly and might not return the actual data, then we could make decisions in user space based on incorrect data, which is not expected. So can you please explain how both blocking and non-blocking calls work and how they return data to the user space process? — user2806748, Oct 18 '14 at 10:36
Also, as I understand it, system calls are not handled in any process context, so there should be nothing like a quantum allocation. Please correct me if I am wrong. If system calls are handled in a process context, which process is it? Do we have some dedicated process to handle system calls in the Linux kernel? — user2806748, Oct 18 '14 at 10:37
System calls *are* handled in the process context. You can't run a system call handler *and* the user space code under the *same thread* concurrently. Whether a `read()` call blocks depends on how you set the options of the descriptor that's passed to it. By default, `read()` will block unless you specifically tell it not to. Furthermore, a non-blocking system call will *not* return invalid data. It will either return some error code along the lines of "no data at the moment, check back later" or partial data - i.e. "here's some data but there may be more coming". — tangrs, Oct 19 '14 at 00:18
Have a look at the [C10K problem](http://www.kegel.com/c10k.html#strategies) for one of the main applications of non-blocking system calls. You can also have a look at [this SO answer](http://stackoverflow.com/a/5616129/268025) for a more concrete example. — tangrs, Oct 19 '14 at 00:20
I believe that user2806748 would benefit from a book recommendation... I get the feeling that going back to first principles would help. — nonsensickle, Feb 27 '15 at 06:08
