
First of all, I don't know much about modern CPUs and operating systems, so I will frame my question around the Intel 8085 processor. Please imagine that there is an operating system that can run on the Intel 8085.

Suppose we have this assembly code:

MVI A,16
MVI B,16

ADD B

HLT

This code is very simple. When it runs, it loads the number 16 into registers A and B of the Intel 8085 processor, and then adds the value of register B to register A, leaving the sum (32) in A.
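To make the intended behaviour concrete, here is a minimal C sketch that models only the two registers and the carry flag. It assumes the assembler reads `16` as decimal; the names `cpu8085`, `mvi`, `add_b` and `run_example` are made up for this illustration and are not part of any real emulator.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the two 8085 registers used above, plus the carry flag. */
struct cpu8085 {
    uint8_t a;
    uint8_t b;
    bool carry;
};

/* MVI r,data: load an immediate byte into a register. */
void mvi(uint8_t *reg, uint8_t value)
{
    *reg = value;
}

/* ADD B: A = A + B; carry is set when the 8-bit sum overflows. */
void add_b(struct cpu8085 *cpu)
{
    unsigned sum = (unsigned)cpu->a + cpu->b;
    cpu->carry = sum > 0xFF;
    cpu->a = (uint8_t)sum;
}

/* Runs MVI A,16 / MVI B,16 / ADD B and returns the final accumulator. */
uint8_t run_example(void)
{
    struct cpu8085 cpu = {0, 0, false};
    mvi(&cpu.a, 16);
    mvi(&cpu.b, 16);
    add_b(&cpu);
    return cpu.a;
}
```

Calling `run_example()` gives 32, which is what should be in A when the processor reaches `HLT`.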

Of course, when we try to run this code in our operating system, most likely nothing will happen.

What I want to ask is: how can I run code that does not contain any system calls (or anything else operating-system specific) on the operating system (bypassing the operating system)? And I don't want the operating system to crash while doing this.

suiciyom
  • 2
    You can of course run code that does not contain any system calls (although typically you do have an exit system call at the end, but some operating systems allow you to simply return). On architectures that support "proper" operating systems, you normally have privilege levels, so your applications can not crash the OS, but you are limited in what instructions you can use. Applications of course do not bypass the OS but unless you try to do naughty things the OS does not interfere. `MVI A, 16` would work just fine and load the `16` into `A`. – Jester Oct 04 '22 at 22:45
  • @Jester Thanks for your comment. So how do I send the code to the processor? – suiciyom Oct 04 '22 at 22:58
  • 1
    You put it into an executable file that your OS supports. – Jester Oct 04 '22 at 22:59
  • 1
    *And I don't want the operating system to crash while doing this.* - Then you'll have to know which areas of memory the OS reserves, and not overwrite them. That will depend on the OS. Since this is 8085, not 80286 or later, the CPU doesn't have a "protected mode" the OS can use to *stop* user-space from messing up the OS while it runs directly on the CPU. – Peter Cordes Oct 05 '22 at 00:23
  • 2
    I don't think `HLT` would be allowed in user mode, so that either would crash the program with illegal instruction or halt the program just like a system call, or crash/halt the operating system (though you haven't said which OS you're bypassing). – Erik Eidt Oct 05 '22 at 00:25
  • 1
    @ErikEidt: 8085 doesn't have modes. It's a minor extension of 8080. Assuming it's like 8086's `hlt`, it just waits for the next interrupt, so no big deal if you haven't also disabled interrupts. (But yes, it's a privileged instruction on x86; user-space is expected to return control to the OS instead of putting the whole CPU to sleep itself.) – Peter Cordes Oct 05 '22 at 03:55

2 Answers


What I want to ask is: how can I run code that does not contain any system calls (or anything else operating-system specific) on the operating system (bypassing the operating system)? And I don't want the operating system to crash while doing this.

Simple computers (such as computers using the 8085 CPU)

On such computers, the operating system is simply a set of functions that are called - either using the call operation or using specific system call operations (on 8086 CPUs: int x).

If you call int 10h (as an example) on an MS-DOS computer, a function (the so-called interrupt handler) is called. This function accesses the graphics card using the out instruction and by writing to the video RAM (which is written like any normal RAM; on the 8085, the corresponding instruction would be named stax d).

Your program can "simply" do the same steps the operating system (the interrupt handler) would do instead of calling int 10h.

However, it is not really "simple": I think on an MS-DOS computer, writing some text to the screen might take about 300 instructions (depending on whether you want to support cursor movement, scrolling, line break handling, etc.).
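As a rough idea of the simplest part of that job, here is a C sketch that writes text into a text-mode screen buffer. It assumes an 80x25 colour text mode where each cell is a character byte followed by an attribute byte. On a real MS-DOS PC that buffer normally starts at segment B800h, but here an ordinary array stands in for it so the code can run anywhere. The name `put_text` is made up for this example, and cursor movement, scrolling and line breaks are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

enum { COLS = 80, ROWS = 25 };

/*
 * Writes text starting at (row, col) into a text-mode buffer laid out as
 * [character, attribute] byte pairs. Stops at the end of the screen.
 * Returns the number of characters written.
 */
size_t put_text(uint8_t *vram, int row, int col, const char *text, uint8_t attr)
{
    size_t written = 0;

    if (row < 0 || row >= ROWS || col < 0 || col >= COLS)
        return 0;

    size_t cell = (size_t)row * COLS + (size_t)col;
    while (text[written] != '\0' && cell < (size_t)COLS * ROWS) {
        vram[2 * cell] = (uint8_t)text[written];  /* character byte */
        vram[2 * cell + 1] = attr;                /* colour attribute byte */
        cell++;
        written++;
    }
    return written;
}
```

With a buffer declared as `uint8_t vram[COLS * ROWS * 2]`, a call such as `put_text(vram, 1, 2, "Hi", 0x07)` fills two cells on the second row. Everything a real interrupt handler adds on top of this is where the other few hundred instructions go.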

Modern desktop computers (running a modern OS)

Simple answer: You can't.

Modern CPUs have security and protection features:

Such CPUs contain a register that records whether application code, an interrupt handler or the OS is currently running.

When the application is running, the CPU knows that the application is running; when the application performs a system call (for example using the int x instruction), the CPU knows that the OS is now running.

The protection features of the CPU (such as the memory management unit) do not allow you to access the hardware (such as the display) when the application is running.

For this reason, you must do a system call if you want to access the hardware. (Because the OS is allowed to access the hardware.)
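The idea can be sketched as a toy model in C. This is not how any real CPU is implemented; the mode field, the fake display port and the names `toy_cpu`, `port_write` and `sys_display` are all invented for illustration. It only shows that the same hardware access is refused in application mode and allowed once a system call has switched the mode.

```c
#include <stdbool.h>
#include <stdint.h>

enum cpu_mode { MODE_USER, MODE_KERNEL };

struct toy_cpu {
    enum cpu_mode mode;     /* stands in for the CPU's privilege register */
    uint8_t display_port;   /* stands in for a hardware register */
};

/* Hardware access: only succeeds while the CPU is in kernel mode. */
bool port_write(struct toy_cpu *cpu, uint8_t value)
{
    if (cpu->mode != MODE_KERNEL)
        return false;       /* a real CPU would raise a fault here */
    cpu->display_port = value;
    return true;
}

/* System call: enter kernel mode, do the work, return to user mode. */
bool sys_display(struct toy_cpu *cpu, uint8_t value)
{
    cpu->mode = MODE_KERNEL;
    bool ok = port_write(cpu, value);
    cpu->mode = MODE_USER;
    return ok;
}
```

Starting from `MODE_USER`, a direct `port_write` fails and leaves the port untouched, while `sys_display` succeeds and hands control back in user mode.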

Martin Rosenau

Clearly you are running an operating system, so why not try it yourself?

Processors only know how to run their instruction set and there are countless programming languages you can use to generate those instructions, including assembly language.

so.c

int fun ( int );

int main ( void )
{
    return(fun(5));
}

fun.c

int fun ( int x )
{
    return(x+3);
}

./so.elf ; echo $?
8

so

objdump -d fun.o

fun.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <fun>:
   0:   8d 47 03                lea    0x3(%rdi),%eax
   3:   c3                      retq   

So now let's use assembly language:

.globl fun
fun:
lea 0x7(%rdi),%eax
retq

./so.elf;  echo $?
12

But now let us move it up a level:

int main ( int argc, char *argv[] )
{
    return(argc+7);
}


0000000000001040 <main>:
    1040:   8d 47 07                lea    0x7(%rdi),%eax
    1043:   c3                      retq   

Switching architectures, because I feel like it...

.globl _start
_start:
adds r0,#4
bx lr

and I get a seg fault (same with x86 doing a retq). As pointed out in the comments, some operating systems will let you just return, but you probably have to make an exit system call. Note that I did not crash the operating system; I just crashed that program.

.globl _start
_start:
adds r0,#4
mov r7,#1
swi #0

./fun.elf; echo $?
4
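For comparison, the same exit system call can be made from C on Linux without going through the C library's `exit()`. The sketch below is Linux-specific and assumes glibc's `syscall()` wrapper is available. It forks a child that ends with a raw `SYS_exit` and reports the status the parent sees, which is the value `echo $?` would print. The function name `child_exit_status` is made up for this example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child that terminates with a raw exit system call.
 * Returns the child's exit status as seen by the parent, or -1 on error.
 */
int child_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        syscall(SYS_exit, code);  /* the C counterpart of mov r7,#1 / swi #0 */
        _exit(127);               /* only reached if the system call failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

`child_exit_status(4)` reports 4, matching the assembly version above.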

The real problem is not assembly language but getting the program into the OS and running it. Normally you do that by creating a binary file in a format supported by the operating system (including various linker-specific things such as memory space, entry point, etc). Otherwise you have to try to hack the operating system to shove the code into memory and then convince the operating system to run it.
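To give a feel for what "a file format supported by the operating system" means, here is a small C sketch of the first checks a loader could make on a 64-bit little-endian ELF image like the ones above. It only looks at the magic bytes, the class and byte-order fields, and the entry point field at offset 24 of the ELF64 header. The function names `is_elf64_le` and `elf64_entry` are made up for this example, and a real loader checks far more than this.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the image starts with a plausible 64-bit little-endian ELF header. */
bool is_elf64_le(const uint8_t *image, size_t size)
{
    return size >= 64 &&                             /* ELF64 header is 64 bytes */
           image[0] == 0x7F && image[1] == 'E' &&
           image[2] == 'L'  && image[3] == 'F' &&    /* magic number */
           image[4] == 2 &&                          /* EI_CLASS: 64-bit */
           image[5] == 1;                            /* EI_DATA: little-endian */
}

/* Reads e_entry, the 8-byte little-endian entry address at offset 24. */
uint64_t elf64_entry(const uint8_t *image)
{
    uint64_t entry = 0;
    for (int i = 7; i >= 0; i--)
        entry = (entry << 8) | image[24 + i];
    return entry;
}
```

`elf64_entry` should only be called after `is_elf64_le` has accepted the image, since it reads up to byte 31 without checking the size itself.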

You could create or choose an operating system for your 8085 that allows a simple return as a way to exit a program, or modify the operating system to allow that. Then you can just perform a return from a subroutine/function call, and the program is just a few pure instructions with no system calls.

There is no magic to assembly language, just more freedom. The processor can only run its instructions; it does not know how to run C or C++, or Rust, or Python, etc. Just its own instructions. And assembly language is just one programming language you can use. Most of the problem is not the program but the file format, the operating system's rules, and how to exit cleanly.

halfer
old_timer
  • Indeed, in Linux there is no valid return address anywhere in a fresh process; registers hold garbage (actually zero), the stack pointer points at `argc` as documented in the System V ABI. (e.g. x86-64 aka AMD64 System V psABI). [Nasm segmentation fault on RET in \_start](https://stackoverflow.com/q/19760002) / [What happens if there is no exit system call in an assembly program?](https://stackoverflow.com/q/49674026) – Peter Cordes Oct 05 '22 at 03:59
  • A few OSes do start a fresh process such that a function-return instruction will trigger an exit system call, notably MS-DOS for x86 where the stack holds the address of an `int 20h` instruction, or equivalent code that eventually runs that, I forget. – Peter Cordes Oct 05 '22 at 04:01
  • 2
    @PeterCordes: Flat-format .COM files (not MZ executables with a header) start with `ss:sp` pointing at a zero word on the stack. (Usually at `sp` = FFFEh, the end of the 64 KiB segment, but not always so if less memory is available. Eg when using LH and no big UMB is available.) Because they also start with `cs:ip` = PSP:100h the `retn` instruction will jump to `cs:0` which is the start of the PSP. And PSPs start with an `int 20h` instruction in order to support this use case to terminate on a return. – ecm Oct 15 '22 at 16:33