3

I am writing a marshaling layer to automatically convert values between different domains. When it comes to floating point values this potentially means converting values from one floating point format to another. However, it seems that almost every modern system is using IEEE754, so I'm wondering whether it's actually worth generalising to allow other formats, or just manage marshaling between different IEEE754 formats.

Does anyone know of any commonly used floating point formats other than IEEE754 that I should consider (perhaps on ARM processors or mainframes)? If so, a reference to the format specification would be extremely helpful.

M. Webb
  • The most "universal" common format for all systems is currently plain text. And not even plain 7-bit ASCII, but the union of ASCII and [EBCDIC](https://en.wikipedia.org/wiki/EBCDIC). And not even that is *truly* "universal", but it's close. – Some programmer dude Jun 24 '18 at 08:42
  • http://www.quadibloc.com/comp/cp0201.htm – n. m. could be an AI Jun 24 '18 at 08:48
  • Thanks Some programmer dude. I probably should have made clear that the marshaling layer is for binary interoperability. While text is fine as a transfer medium it lacks the efficiencies attained by binary formats. It also needs to work with existing use cases (eg. accessing data from existing network packets). So it's specifically binary floating point formats used by processors that I'm concerned with. – M. Webb Jun 24 '18 at 08:48
  • Thanks n.m. I had seen that link. It's very informative but doesn't give me any insight into which of those many formats are actually still in use. I don't think there are too many PDP-10s crunching away out there. It will save me a lot of effort if I don't have to generalise to account for ANY possible floating point format. – M. Webb Jun 24 '18 at 08:52
  • The link is for studying different formats. Don't implement any of them just yet, solve problems as they appear ;) If a customer asks you to add support for a certain format, do so, but not in advance. See also [YAGNI](https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it). Non-IEEE hardware is likely to be made/planned today, google [reduced precision](https://www.google.com/search?q=reduced+precision). Study it but defer support until you have a specific requirement. – n. m. could be an AI Jun 24 '18 at 17:29

3 Answers

2

Virtually all relatively modern (within the last 15 years) general purpose computers use IEEE 754. In the very unlikely event that you need to support a system which uses a non-IEEE 754 floating point format, there will probably be a library available to convert to/from IEEE 754.

Some non-ancient systems which did not natively use IEEE 754 were the Cray SV1 (1998-2003) and IBM System 360, 370, and 390 prior to Generation 5 (ended 2002). IBM implemented IEEE 754 emulation around 2001 in a software release for prior S/390 hardware.
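To give a sense of what such a conversion involves, here is a minimal sketch that decodes the 32-bit hexadecimal floating point format of the pre-IEEE IBM mainframes into a native `double`. The function name `ibm_hfp32_to_double` is made up for illustration, and the input is assumed to already be a host-order integer holding the raw bits (1 sign bit, 7-bit excess-64 base-16 exponent, 24-bit fraction).

```cpp
#include <cmath>
#include <cstdint>

// Decode a 32-bit IBM hexadecimal float (raw bits in a host-order integer)
// into a native double. Hypothetical helper, not part of any library.
// Layout: 1 sign bit, 7-bit excess-64 exponent (radix 16), 24-bit fraction.
double ibm_hfp32_to_double(std::uint32_t bits) {
    const bool negative = (bits >> 31) != 0;
    const int exponent = static_cast<int>((bits >> 24) & 0x7F) - 64;
    const std::uint32_t fraction = bits & 0x00FFFFFFu;

    // value = 0.fraction * 16^exponent = fraction * 2^(4 * exponent - 24)
    const double magnitude =
        std::ldexp(static_cast<double>(fraction), 4 * exponent - 24);
    return negative ? -magnitude : magnitude;
}
```

Every value of this format fits in a `double` exactly, so the decode direction is lossless. Going the other way (or targeting `float`) would need rounding and range handling that this sketch does not attempt.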

John Zwinck
  • 239,568
  • 38
  • 324
  • 436
1

As of now, what systems do you actually want this to work on? If you come across one down the line that doesn't use IEEE754 (which, as @JohnZwinck says, is vanishingly unlikely) then you should be able to code for that then.

To put it another way, what you are designing here is, in effect, a communications protocol and you obviously seek to make a sensible choice for how you will represent a floating point number (both single precision and double precision, I guess) in the bytes that travel between domains.

I think @SomeProgrammerDude was trying to imply that representing these as text strings (while they are in transit) might offer the most portability, and if so I would agree, but it's obviously not the most efficient way to do it.

So, if you do decide to plump for IEEE754 as your interchange format (as I would) then the worst that can happen is that you might need to find a way to convert these to and from the native format used on some antique architecture that you are almost certainly never going to encounter, and if that does happen then that problem would not be difficult to solve.

Also, floats and doubles can be big-endian or little-endian, so you need to decide what you're going to use in your byte stream and convert when marshalling if necessary. Little-endian is much more common these days so I'd go with that.
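As a rough illustration of that last step, the sketch below writes a `float` into a little-endian byte stream and reads it back. The names `encode_f32_le` and `decode_f32_le` are made up, and the code assumes the native `float` is IEEE 754 binary32 and that the host stores floats and 32-bit integers in the same byte order.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "sketch assumes the native float is IEEE 754 binary32");

// Serialise a float as 4 little-endian bytes (hypothetical helper).
// The shifts work on the integer value, so the result does not depend on
// the host's integer byte order.
std::array<std::uint8_t, 4> encode_f32_le(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined bit copy
    return {static_cast<std::uint8_t>(bits & 0xFF),
            static_cast<std::uint8_t>((bits >> 8) & 0xFF),
            static_cast<std::uint8_t>((bits >> 16) & 0xFF),
            static_cast<std::uint8_t>((bits >> 24) & 0xFF)};
}

// Rebuild a float from 4 little-endian bytes (hypothetical helper).
float decode_f32_le(const std::array<std::uint8_t, 4>& bytes) {
    const std::uint32_t bits = static_cast<std::uint32_t>(bytes[0]) |
                               (static_cast<std::uint32_t>(bytes[1]) << 8) |
                               (static_cast<std::uint32_t>(bytes[2]) << 16) |
                               (static_cast<std::uint32_t>(bytes[3]) << 24);
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

For example, `encode_f32_le(1.0f)` yields the bytes `00 00 80 3F`, since the binary32 bit pattern of 1.0 is `0x3F800000`.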

Paul Sanders
  • Thanks for the input. The advice is good, but the assumption is wrong. The layer I am implementing is so that the programmer does not need to handle all the possible conversions directly. So specifying the formats used in the applicable domains (eg. network and local), the type layer would automatically marshal between local and domain-specific types. I have this working fine for signed/unsigned integer types and bool, but obviously floating point opens up a whole can of worms. The consensus seems to be that I can worry about the variants of IEEE754 for now and implement others as they arise. – M. Webb Jun 25 '18 at 06:32
  • @MWebb I don't think I assumed anything did I? I was talking about the byte stream that travels between domains after marshalling. The API would of course be as consumer-friendly as possible and is unaffected by this. And yes, I agree with the consensus, use IEEE754 'on the wire', little endian. – Paul Sanders Jun 25 '18 at 08:11
  • Perhaps it was my assumption about your assumption that was wrong? My point was that I am not defining the wire format, so I have no control over the communication protocol. I simply have to come up with a way to avoid having to manually manage all the different data formats in existing protocols and data formats from different architectures. In which circumstances the marshaling layer is used is not part of the problem domain, which is why I need to be as general as possible without accounting for every possible format. – M. Webb Jun 25 '18 at 10:17
  • OK, thanks, perhaps that explains the confusion. Does IEEE754 come into it at all then? If I understand you correctly, you need to convert floats and doubles passed (in native format) across the API(s) you present to consumer into whatever format(s) are used by the communications protocol(s) you need to support. – Paul Sanders Jun 25 '18 at 11:27
  • That's pretty much it, but more general. It could be a comms protocol, or a file format (eg. image format). It's to solve the general problem of dealing with data in different formats from different architectures. Normally we have to manually marshal data as we access it, or write a wrapper to abstract it. The idea is to have a header only (C++) template class library that allows you to specify a domain (eg, Foreign: big-endian, ones-complement, IEEE754binary32, ...) so that,... – M. Webb Jun 26 '18 at 11:43
  • ... say you have a structure like `struct Data { Foreign::UInt32 address; Foreign::Float32 value; };` and an instance `data`, you could simply write `float value = data.value;`, and the marshaling would happen automatically (including endian adjustment). I ask about IEEE754 because to pull it off I need to marshal between possible FP formats. If no others are used in practice then the marshaling code can be much simpler. Having to cater for any possible FP format requires a much more general solution. – M. Webb Jun 26 '18 at 11:44
  • OK, I see what you're getting at. Just out of curiosity, what floating point format(s) travel across the network? – Paul Sanders Jun 26 '18 at 15:03
  • I haven't seen many comms specs that include floating point across the wire, it's mostly in file formats. RPC implementations seem to vary but usually allow IEEE754. It seems that an IEEE754 float would be the best choice if you were going to standardise, you'd just have to choose an endianness. It's most likely to work on just about any modern platform barring some less-than-perfect implementations, which could be worked around without too much effort. – M. Webb Jun 26 '18 at 17:12
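The wrapper idea discussed in these comments might look roughly like the following sketch. It is not the OP's library: `Endian`, `Float32Field` and the contents of `Foreign` are invented for illustration, only the `Float32` member of the example struct is modelled, and the code assumes the native `float` is IEEE 754 binary32 with the same byte order as native 32-bit integers.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "sketch assumes the native float is IEEE 754 binary32");

enum class Endian { Little, Big };

// Hypothetical field type: keeps a binary32 value in the foreign byte order
// and converts to/from the native float on access.
template <Endian Order>
class Float32Field {
public:
    Float32Field() = default;

    // Assigning a native float marshals it into the foreign layout.
    Float32Field& operator=(float value) {
        std::uint32_t bits;
        std::memcpy(&bits, &value, sizeof bits);
        for (std::size_t i = 0; i < 4; ++i) {
            const std::size_t shift = 8 * (Order == Endian::Little ? i : 3 - i);
            raw_[i] = static_cast<std::uint8_t>(bits >> shift);
        }
        return *this;
    }

    // Reading as a native float marshals back out of the foreign layout.
    operator float() const {
        std::uint32_t bits = 0;
        for (std::size_t i = 0; i < 4; ++i) {
            const std::size_t shift = 8 * (Order == Endian::Little ? i : 3 - i);
            bits |= static_cast<std::uint32_t>(raw_[i]) << shift;
        }
        float value;
        std::memcpy(&value, &bits, sizeof value);
        return value;
    }

    const std::uint8_t* data() const { return raw_; }

private:
    std::uint8_t raw_[4] = {};
};

// Hypothetical domain description: big-endian, IEEE 754 binary32.
struct Foreign {
    using Float32 = Float32Field<Endian::Big>;
};

struct Data {
    Foreign::Float32 value;
};
```

With this in place, `Data d; d.value = 1.0f; float v = d.value;` stores the bytes `3F 80 00 00` inside `d` and hands back the native value on read. Supporting a non-IEEE foreign format would mean replacing the bit copy with a real format conversion, which is where the generalisation cost lies.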
0

Does anyone know of any commonly used floating point formats other than IEEE754 that I should consider ...?

  1. CCSI uses a variation on binary32 for select processors.

it seems that almost every modern system is using IEEE754,

  1. Yes, but... various implementations fudge the particulars of edge values such as subnormals, negative zero (in Visual Studio), infinity and not-a-number.

This second issue is the more dangerous one, because it is hard to tell whether a given implementation has fully implemented IEEE754. See `__STDC_IEC_559__`.

OP has "I am writing a marshaling layer". It is in this code that troubles are likely to remain for edge cases. Also, IEEE754 does not specify endianness, so that marshaling issue remains. Recall that integer endianness may not match FP endianness.
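A marshaling layer can at least probe these two points on the host. The sketch below uses `std::numeric_limits<T>::is_iec559`, the C++ counterpart of the C macro `__STDC_IEC_559__`, and compares the in-memory byte order of a 32-bit integer with that of a `float`. The function and enum names are made up, and the float probe assumes a binary32 layout. Note that `is_iec559` only reports what the implementation claims, so it does not rule out the edge-case deviations mentioned above.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// What the implementation claims about its native float and double.
constexpr bool native_floats_claim_ieee754() {
    return std::numeric_limits<float>::is_iec559 &&
           std::numeric_limits<double>::is_iec559;
}

enum class ByteOrder { Little, Big, Other };

// Classify 4 bytes against the expected most-significant-first sequence.
inline ByteOrder classify(const unsigned char (&b)[4],
                          const unsigned char (&msb_first)[4]) {
    bool big = true, little = true;
    for (int i = 0; i < 4; ++i) {
        big = big && (b[i] == msb_first[i]);
        little = little && (b[i] == msb_first[3 - i]);
    }
    return big ? ByteOrder::Big : little ? ByteOrder::Little : ByteOrder::Other;
}

// Byte order the host uses for 32-bit integers.
inline ByteOrder integer_byte_order() {
    const std::uint32_t probe = 0x01020304u;
    unsigned char bytes[4];
    std::memcpy(bytes, &probe, sizeof bytes);
    const unsigned char expected[4] = {0x01, 0x02, 0x03, 0x04};
    return classify(bytes, expected);
}

// Byte order the host uses for float, assuming binary32:
// -118.625f has the bit pattern 0xC2ED4000.
inline ByteOrder float_byte_order() {
    static_assert(sizeof(float) == 4, "probe assumes a 32-bit float");
    const float probe = -118.625f;
    unsigned char bytes[4];
    std::memcpy(bytes, &probe, sizeof bytes);
    const unsigned char expected[4] = {0xC2, 0xED, 0x40, 0x00};
    return classify(bytes, expected);
}
```

If `integer_byte_order()` and `float_byte_order()` disagree, or either returns `ByteOrder::Other`, a bit copy through an integer is not a safe way to marshal floats on that host.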

chux - Reinstate Monica