Portability of C code for different memory addressing schemes

Question

If I understand correctly, the DCPU-16 specification for 0x10c describes a 16-bit address space where each offset addresses a 16-bit word, instead of a byte as in most other memory architectures. This has some curious consequences, e.g. I imagine that sizeof(char) and sizeof(short) would both evaluate to 1.

Is it feasible to keep C code portable between such different memory addressing schemes? What would be the gotchas to keep in mind?

edit: perhaps I should have given a more specific example. Let's say you have some networking code that deals with byte streams. Do you throw away half of your memory by putting only one byte at each address so that the code can stay the same, or do you generalize everything with bitshifts to deal with N bytes per offset?

edit2: The answers seem to focus on the issue of data type sizes, which wasn't the point - I shouldn't even have mentioned it. The question is about how to cope with losing the ability to address any byte in memory with a pointer. Is it reasonable to expect code to be agnostic about this?

So long as you don't do things like assume CHAR_BIT is always 8 then there isn't a huge issue. — Flexo, Apr 11 '12 at 14:42
So long as you follow the C standard and do not make assumptions about things that the standard says are variable or result in undefined or unspecified or implementation-defined behavior or results, you should be fine. — Alexey Frunze, Apr 11 '12 at 15:29

unwind · Answer 1 · 2012-04-11T14:48:45.610

It's totally feasible. Roughly speaking, C's basic integer data types have sizes that uphold:

sizeof (char) <= sizeof (short) <= sizeof (int) <= sizeof (long)

The above is not exactly what the spec says, but it's close.

As pointed out by awoodland in a comment, you'd also expect a C compiler for the DCPU-16 to have CHAR_BIT == 16.

Bonus for not assuming that the DCPU-16 would have sizeof (char) == 2, that's a common fallacy.
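
For the byte-stream case raised in the question, one way to stay agnostic about CHAR_BIT is to treat each unsigned char as a container for exactly one octet and to mask explicitly. The sketch below is only an illustration under that assumption; the helper names put_u32_be and get_u32_be are made up for this example. On a compiler with CHAR_BIT == 16 it would leave the upper half of every char unused, which is the memory cost the question mentions.

```c
#include <assert.h>
#include <stddef.h>

/* Store a 32-bit value as four octets, most significant octet first.
   Each buffer element carries one octet in its low 8 bits, so the
   layout is the same whether CHAR_BIT is 8 or 16. */
static void put_u32_be(unsigned char *buf, unsigned long value)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)((value >> 24) & 0xFFu);
    buf[1] = (unsigned char)((value >> 16) & 0xFFu);
    buf[2] = (unsigned char)((value >> 8) & 0xFFu);
    buf[3] = (unsigned char)(value & 0xFFu);
}

/* Rebuild the value, masking each element so that any bits above
   the low 8 (possible when CHAR_BIT > 8) are ignored. */
static unsigned long get_u32_be(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((unsigned long)(buf[0] & 0xFFu) << 24)
         | ((unsigned long)(buf[1] & 0xFFu) << 16)
         | ((unsigned long)(buf[2] & 0xFFu) << 8)
         |  (unsigned long)(buf[3] & 0xFFu);
}
```

Nothing here depends on sizeof(short) or on the value of CHAR_BIT, only on the guarantees that unsigned char has at least 8 bits and unsigned long at least 32.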

edited Apr 11 '12 at 14:48

answered Apr 11 '12 at 14:43


It should be mentioned that `sizeof(char)` is **always** 1. – BlueRaja - Danny Pflughoeft Apr 11 '12 at 15:13

Brett Hale · Answer 2 · 2012-04-12T10:45:40.883

When you say, 'losing the ability to address a byte', I assume you mean 'bit-octet', rather than 'char'. Portable code should only assume CHAR_BIT >= 8. In practice, architectures that don't have byte addressing often define CHAR_BIT == 8, and let the compiler generate instructions for accessing the byte.

I actually disagree with the answers suggesting CHAR_BIT == 16 as a good choice. I'd prefer CHAR_BIT == 8, with sizeof(short) == 2. In that case the compiler can handle the shifting and masking needed for byte access, just as it does for many RISC architectures.

I imagine Notch will revise and clarify the DCPU-16 spec further; there are already requests for an interrupt mechanism, and further instructions. It's an aesthetic backdrop for a game, so I doubt there will be an official ABI spec any time soon. That said, someone will be working on it!

Edit:

Consider an array of char in C. The compiler packs 2 bytes into each native 16-bit word of DCPU memory. So to access, say, the 10th element (index 9), we fetch word number [9 / 2] = 4 and extract byte number [9 % 2] = 1 from it.

Let 'X' be the start address of the array, and 'I' be the index:

SET J, I
SHR J, 1    ; J = I / 2
ADD J, X    ; J holds word address
SET A, [J]  ; A holds word
AND I, 0x1  ; I = I % 2 {0 or 1}
MUL I, 8    ; I = {0 or 8} ; could use: SHL I, 3
SHR A, I    ; right shift by I bits for hi or lo byte.

The low half of register A now holds the 'byte'. A is a 16-bit register, so the top half can simply be ignored. Alternatively, the top half can be zeroed:

AND A, 0xff ; mask lo byte.

This is not optimized, but it conveys the idea.
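
The same index arithmetic can be sketched in C. This is only an illustration of the packing scheme described above, not what any actual DCPU-16 compiler emits; the names get_octet and set_octet are hypothetical, and uint_least16_t stands in for a native 16-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read octet number `index` from an array that packs two octets
   per 16-bit word. Even indices select the low half of a word and
   odd indices the high half, matching the assembly above. */
static unsigned get_octet(const uint_least16_t *words, size_t index)
{
    assert(words != NULL);
    unsigned word = words[index / 2] & 0xFFFFu;   /* word number = index / 2 */
    unsigned shift = (unsigned)(index % 2) * 8u;  /* 0 or 8 bits */
    return (word >> shift) & 0xFFu;               /* mask the low byte */
}

/* Write octet number `index`, leaving the other half of the word intact. */
static void set_octet(uint_least16_t *words, size_t index, unsigned value)
{
    assert(words != NULL);
    unsigned shift = (unsigned)(index % 2) * 8u;
    unsigned mask = 0xFFu << shift;
    unsigned word = words[index / 2] & 0xFFFFu;
    word = (word & ~mask) | ((value & 0xFFu) << shift);
    words[index / 2] = (uint_least16_t)(word & 0xFFFFu);
}
```

With index 9, both functions touch word 4 and use a shift of 8, the same values as in the worked example.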

So it is possible for C compilers to emulate byte-addressable memory? Interesting! — Wim Coenen, Apr 12 '12 at 08:29
...but wouldn't that halve the available memory if you stick to a 16-bit address space? Or would the compiler add bits to the addresses - in this case, one extra bit to address the low or high byte? — Wim Coenen, Apr 12 '12 at 08:43
I think I already understood that. I meant that the *largest possible* pointer `0xFFFF` in the emulated byte-addressable memory would be translated to the word address `0x7FFF` (followed by a bitshift). That leaves half of the memory non-addressable. Unless you use 17-bit pointers. — Wim Coenen, Apr 12 '12 at 14:39
@WimCoenen - yes, you're right. And not being able to address all 8-bit bytes in memory makes DCPU-16 a poor candidate for a 'reasonable' C environment. A 16-bit 'char' makes simple byte processing code waste already limited memory. OTOH, if it's just a matter of porting C code, then hacking 0x10c sort of loses its appeal. — Brett Hale, Apr 12 '12 at 18:14

score 0 · Answer 3 · answered Apr 11 '12 at 18:00

The chain of inequalities goes more like this:

1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)

sizeof(short) can be 1, and as a matter of fact you may even want sizeof(int) to be 1 as well (I didn't read the spec, but I'm supposing the natural data type is 16 bits). These sizes are chosen by the compiler.

For practicality, the compiler may want to make long larger than int, even if that requires some extra work from the compiler (such as implementing addition, multiplication, etc. in software).

This isn't a memory addressing issue, but rather a granularity question.
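
To make the granularity point concrete: sizeof counts chars, not octets, so the storage width in bits is the sizeof result times CHAR_BIT. A minimal sketch, where storage_bits is a name invented for this example:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of bits of storage occupied by an object whose sizeof
   result is `size_in_chars`. */
static size_t storage_bits(size_t size_in_chars)
{
    assert(size_in_chars > 0);
    return size_in_chars * CHAR_BIT;
}
```

On a typical host with CHAR_BIT == 8 and sizeof(short) == 2, storage_bits(sizeof(short)) is 16. On a hypothetical DCPU-16 compiler with CHAR_BIT == 16 and sizeof(short) == 1 it would also be 16, which is why code written in terms of CHAR_BIT ports more easily than code that assumes 8-bit bytes.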

score 0 · Answer 4 · answered Apr 11 '12 at 20:08

Yes, it is entirely possible to port C code.

For data transfer, it would be advisable either to pack the bits (or use compression) or to send 16-bit bytes.

Because the CPU will communicate almost entirely with in-game devices that will likely also be 16-bit, this should be no real problem.

BTW, I agree that CHAR_BIT should be 16: (IIRC) each char must be addressable, so making CHAR_BIT == 8 will REQUIRE sizeof(char*) == 2, which will make everything else overcomplicated.
