5

I'm a mid-level (abstraction) programmer, and some months ago I started to wonder whether I should reduce or increase abstraction (I've chosen to reduce).

Now, I think I've done most of the "research" about what I need, but a few questions still remain.

Right now, while I'm "doing effectively nothing", I'm just reinforcing my C skills (I bought K&R's "The C Programming Language"), and I'm thinking of studying operating systems (like MINIX) once I feel comfortable, just for learning purposes. But I have an idea stuck in my mind, and I don't really know if I should care.

In theory (I think, not sure), the higher level languages cannot refer to the hardware directly (like registers, memory locations, etc...) so the "perfect language" for the base would be assembly.

I already studied assembly some time ago, just to see what it was like (I stopped in the middle of the book, "Assembly Language Step by Step" for Linux, because the debugger it used was outdated), but from what I read, I didn't like the language much.

So the question is simple: can an operating system (bootloader/kernel) be programmed without touching a single line of assembly, and still be effective?

Even if it can, it will not be "cross-architecture", will it? (i386/ARM/MIPS, etc.)

Thanks for your support

SOMN
  • 1
    With just C, how would one first change out of [Real Mode](http://en.wikipedia.org/wiki/Real_mode) on an x86? What about making a [BIOS interrupt](http://en.wikipedia.org/wiki/BIOS_interrupt_call) call? –  Aug 16 '12 at 07:17
  • I don't know. I studied the memory models when I read Jeff Duntemann's book, but I don't really know how to apply them. – SOMN Aug 16 '12 at 07:22
  • Such operations are outside the scope of the C language specification / stdlib. Ergo .. –  Aug 16 '12 at 07:23
  • Put it this way - the processor always starts up by fetching its first instruction from some hardware-defined address in some hardware-defined mode. Data/stack segment registers, stack pointer etc. are probably pointing to illegal/non-existent memory and need to be initialized with valid values. No peripheral chips are working - there is no memory management, no interrupts, no timers, no nothing but boot code. Realistically, it's gonna be assembler. – Martin James Aug 16 '12 at 09:10
  • @MartinJames Thanks for your answer. In fact, it makes a lot of sense that way ;) – SOMN Aug 17 '12 at 09:49

3 Answers

4

You can do a significant amount of the work without assembly. Linux or NetBSD doesn't have to be completely rewritten or patched for each of the many targets it runs on. Most of the code is portable; then there are abstraction layers, and below the abstraction layer you find a target-specific layer. Even within the target-specific layers, most of the code is not asm.

I want to dispel the mistaken idea that you need asm in order to program registers or memory, for a device driver for example; you do not use asm for such things. You use asm for 1) instructions that a processor has that you cannot produce using a high-level language, or 2) places where the code generated by a high-level language is too slow. For example, on ARM, enabling or disabling interrupts requires a specific instruction for accessing the processor state registers, so asm is required, but programming the interrupt controller is all done in the high-level language. An example of the second point: in C libraries you often find that memcpy and other heavily used library functions are hand-coded asm because it is dramatically faster.
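To illustrate the point about programming device registers from a high-level language, here is a minimal sketch in C. The register layout, the names (`irq_ctrl_regs`, `irq_enable_source`, and so on) and the address in the comment are all made up for illustration; a real interrupt controller's layout comes from its datasheet. So that the sketch can run on an ordinary hosted system, the "device" is just a static struct standing in for the memory-mapped registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register block for an imaginary interrupt controller.
 * The layout and names are assumptions made up for this sketch. */
typedef struct {
    volatile uint32_t enable;   /* bit n set = interrupt source n enabled */
    volatile uint32_t pending;  /* bit n set = source n is waiting */
} irq_ctrl_regs;

/* Stand-in for the device so the sketch runs on a hosted system.
 * On bare metal this would instead be a fixed address from the datasheet,
 * e.g. ((irq_ctrl_regs *)0x40001000u), which is a made-up value here. */
static irq_ctrl_regs fake_device;
#define IRQ_CTRL (&fake_device)

/* Plain C read-modify-write; volatile keeps the compiler from
 * caching or dropping the register accesses. */
void irq_enable_source(unsigned n)
{
    assert(n < 32);
    IRQ_CTRL->enable |= UINT32_C(1) << n;
}

void irq_disable_source(unsigned n)
{
    assert(n < 32);
    IRQ_CTRL->enable &= ~(UINT32_C(1) << n);
}

/* Returns 1 if source n is enabled, 0 otherwise. */
int irq_source_enabled(unsigned n)
{
    assert(n < 32);
    return (IRQ_CTRL->enable >> n) & 1u;
}
```

Nothing here needs asm: the only target-specific part is where `IRQ_CTRL` points. Turning interrupts on or off in the processor itself is the kind of thing that would still need a special instruction.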

Although you certainly CAN write anything you want in asm, you typically find that a high-level language is used to access the "hardware directly (like registers, memory locations, etc...)". You should continue to reinforce your C skills, not just with the K&R book but also by wandering through the various C standards; you might find it disturbing how many "implementation-defined" items there are, like bitfields, how variable sizes are promoted, etc. Just because a program you wrote 10 years ago keeps compiling and working with one specific brand of compiler (msvc, gcc, etc.) doesn't mean the code is clean and portable and will keep working. Unfortunately, gcc has taught many people very bad programming habits, which shock them when they find out, a decade or so down the road, that they didn't know the language and have to redo how they solve problems in it.
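As a small illustration of how promotion can surprise you, here is a sketch (the function names are made up for this example). In C, a `uint8_t` operand is promoted to `int` before `~` is applied, and the width of `int` is implementation-defined, so the "obvious" comparison does not compare 8-bit values at all.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption stated explicitly (C11 static_assert): int is wider than
 * uint8_t, so a uint8_t operand is promoted to int before ~ is applied. */
static_assert(sizeof(int) > sizeof(uint8_t), "int must be wider than uint8_t");

/* Naive version: ~x is evaluated as an int with all the high bits set,
 * so it can never equal a value in the range 0..255. */
int naive_is_inverse(uint8_t x, uint8_t y)
{
    return ~x == y;
}

/* Fixed version: convert the result back to 8 bits before comparing. */
int is_inverse(uint8_t x, uint8_t y)
{
    return (uint8_t)~x == y;
}
```

With `x = 0xF0` and `y = 0x0F`, the naive version reports "not inverse" while the fixed one reports "inverse". Some compilers warn about the naive comparison, but only at higher warning levels.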

old_timer
  • I also recommend you study some asm, NOT X86! Start with something else: arm, thumb, msp430, avr, and some others. Do mips later, only because mips has a different way of doing a few things that is perfectly fine but different from the rest of the world, so it's better to learn one way and not have to spend your life translating from the corner case to the norm. – old_timer Aug 16 '12 at 19:55
  • 2
    Why not x86? I only have x86 hardware. Is qemu an efficient solution? What do you mean about gcc? That it often changes things inside languages, making them different? Is there any free resource (or any good, well-known book) for learning, e.g., ARM asm? What's the main difference between those architectures you mentioned? Wikipedia says the main difference is between RISC and CISC, but there are many RISC and many CISC architectures. – SOMN Aug 17 '12 at 00:29
  • "NOT X86!" is overstating a bit. It's definitely quirky, and its complexity does make it kinda suck to learn for your first assembly language. But it's also far more accessible than anything else on the planet right now. Experimenting with it requires only an assembler ([free](http://yasm.tortall.net/)), an existing OS (which you already own), and an x86 processor (which you already own -- often even in convenient dev-machine form -- unless you're dirt poor, masochistic, or a hipster). If you're playing with operating systems, a VM ([also free](https://www.virtualbox.org/)) might help too. – cHao Apr 02 '14 at 16:50
  • Tools and simulators are just as available for non-x86 as for x86, and you don't get into VM problems where it tries to limit you to what you are running on. Looking at the kinds of things that folks get hung up on when learning at this level, the x86 limitations and quirks and, more importantly, trying to run on hardware rather than a simulator lead to a high percentage of failure and giving up. One less person exposed to this world forever, simply because they chose poorly where to start. – old_timer Apr 02 '14 at 17:57
  • Not sure what VM problems you're talking about; i haven't run into any yet. I just know that an emulator, even a perfect one, is just not the same for me. It never gives me that thrill that i get when something i made really runs for the first time on a real machine. That feeling is my motivation; without it, i lose interest *very* quickly. And in order to get it with ARM, i'd have to either buy more stuff or risk bricking my phone. Whereas with x86, i can just run a program -- or for an OS, reboot -- whenever i want. – cHao Apr 04 '14 at 21:33
  • I am sorry that you don't understand; if you were to learn a handful more instruction sets, it would be painfully obvious. It would cost you nothing but a handful of evenings... or maybe $50 in hardware if you must... – old_timer Apr 05 '14 at 01:39
  • @dwelch: I *understand*; RISC is more straightforward. Easier to code, even. But that alone doesn't make it inherently easier to learn. For me, real-world applicability is a *huge* factor. I've written asm for 8088, x86, 6809, 6502, 65816, and a couple of others, and straight binary for 6809 and 8088. They were easy to learn because they were immediately useful; i had real computers using those processors. OTOH, although i farted around with ARM and MIPS, i never got past the farting-around stage, because the real-world applicability wasn't there -- they were not useful to me on my hardware. – cHao Apr 07 '14 at 22:09
2

You have answered your question yourself in "the higher level languages cannot refer to the hardware directly".

Whether you want it or not, at some point you will have to deal with assembly/machine code if you want to make an OS.

Interrupt and exception handlers will have to have some assembly code in them. So will the scheduler (if not directly, then indirectly). And the system call mechanism. And the bootloader.

Alexey Frunze
  • So, when people around the internet ask whether some language is suitable for making an OS (C++, VB, Java, C#, etc.), they must also use assembly for the lower level? – SOMN Aug 16 '12 at 07:26
  • 2
    @Claudiop Yes. Everything needs a "bootstrap". It *technically* doesn't have to be Assembly (as one could hypothetically generate the machine instructions by other means), but that is by far the "easiest"/accepted route .. –  Aug 16 '12 at 07:26
  • "assembly/machine code": is this an "xor" or an "and"? If assembly is "ugly", I can't really imagine machine code... – SOMN Aug 16 '12 at 07:30
  • 2
    No, machine code is beautiful. It consists only of numbers, no "xor", no "and", nothing like that. And numbers are beautiful. Unless you're thinking of [ugly numbers](http://stackoverflow.com/questions/4600048/nth-ugly-number). Just kidding. :) You should read up on assembly language and machine code. – Alexey Frunze Aug 16 '12 at 07:41
  • Machine code, and the assembly that is normally used to generate it, is almost always assumed. When you say C/C++ or other languages, there is asm involved. Not Java, you say, because it is a VM? The VM is written in some language targeted at the machine, meaning asm to machine code, or machine code directly. Compiling directly to machine code is the exception, not the rule, as it is much harder to debug. – old_timer Aug 16 '12 at 19:36
  • @AlexeyFrunze That's not what I meant. I meant "assembly and machine code, or assembly xor (exclusive or) machine code?" OK, I already understood what you guys tried to teach me. Thank you for that. PS: Is there a way (now, mid-2012) to make the Insight debugger work? – SOMN Aug 17 '12 at 02:16
  • I know nothing about that debugger. Perhaps that's a good candidate for a different question. – Alexey Frunze Aug 17 '12 at 06:48
0

What I've learned from reading websites and books is that: a) many programmers dislike assembly language, for the reasons we all know; b) the main programming language for OSes seems to be C, and even C++; c) assembly language can be used to "speed up code" after profiling your C or C++ source code (the language doesn't matter, in fact).

So, the combination of a mid-level language and a low-level language is in some cases inevitable. For example, there is no point in speeding up code that waits on user input. If it matters to build the shortest and fastest code for one specific range of computers (AMD, Intel, ARM, Digital Alpha, ...), then you should use assembler. My opinion...

Agguro