
I'm working on very old legacy code and I'm porting it from 32 to 64 bit.

One of the things I'm struggling with is MFC serialization. One of the differences between 32 and 64 bit is the size of pointer-sized data. This means, for example, that if for some reason I have serialized the size of a CArray like

ar << m_array.GetSize();

the data differs between the 32-bit and 64-bit platforms because GetSize returns an INT_PTR. To keep the serialized data fully compatible between 32-bit and 64-bit builds of the same application, I forced the data type when storing and did the same when reading (I'm pretty sure 32 bits are enough for this data).

store

ar << (int)m_array.GetSize();

reading

int iNumSize = 0;
ar >> iNumSize ;

In other words, the application serializes this data as an int, no matter whether it is compiled as 32 or 64 bit.

Now I have a doubt about the serialization of the CArray type itself; to serialize a CArray, the code uses the built-in CArchive serialization

//defined as CArray m_arrayVertex; on .h
m_arrayVertex.Serialize(ar);

and this Serialize is defined in the MFC file afxtemp.h with this template

template<class TYPE, class ARG_TYPE>
void CArray<TYPE, ARG_TYPE>::Serialize(CArchive& ar)
{
    ASSERT_VALID(this);

    CObject::Serialize(ar);
    if (ar.IsStoring())
    {
        ar.WriteCount(m_nSize);
    }
    else
    {
        DWORD_PTR nOldSize = ar.ReadCount();
        SetSize(nOldSize, -1);
    }
    SerializeElements<TYPE>(ar, m_pData, m_nSize);
}

where (afx.h)

// special functions for reading and writing (16-bit compatible) counts
DWORD_PTR ReadCount();
void WriteCount(DWORD_PTR dwCount);

Here is my question: ReadCount and WriteCount use DWORD_PTR, which has a different size on each platform. Is this kind of serialization compatible between 32 and 64 bit, or does the size change mean the serialized data only works on the platform that wrote it?

I mean, can the data be read by both the 32-bit and the 64-bit application without errors? The comment says it also works for "16 bit", and I haven't found any details about this serialization.

If this doesn't work, is there a workaround to serialize the CArray so that the data is fully compatible with both the 32-bit and the 64-bit app?

Edit: Both answers are good. I simply accepted the first one that came in. Many thanks to both; I hope this helps someone else!

Andrew Truckle
GiordiX
  • Compilation/Compatibility depends on the platform, but no, the implementation is not "universal" if you like. MFC's source is open (in C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.34.31933\atlmfc\src\mfc in latest Visual Studio builds), so you can check all you need by yourself. Here it is online https://github.com/pixelspark/corespark/blob/master/Libraries/atlmfc/src/mfc/arccore.cpp#L632 – Simon Mourier Jan 10 '23 at 11:23
  • Thanks, pretty useful. I have read the code many times, but I'm not sure how it works if the data was, for example, written in 64 bit and read in 32 bit. If I understand correctly, this is not universal, but as long as the value is representable in 4 bytes (32 bit), the same file should be readable by both the 32-bit and the 64-bit app? – GiordiX Jan 10 '23 at 12:47
  • I always choose variable types that do not change between bit editions. – Andrew Truckle Jan 10 '23 at 12:49
  • @Andrew what do you mean? If I made a CArray, should it be readable by both platforms if userStruct_Int is a struct with only int values inside? – GiordiX Jan 10 '23 at 12:51
  • I mean I use types like `WORD`, `DWORD` etc. `int` will change between bit editions but types like `WORD` etc stay the same. Not that I have not used `CArray`. But I have used `CObArray`. – Andrew Truckle Jan 10 '23 at 12:52
  • Serializing an `int` is always a bad idea if you intend on supporting both a 32 / 64 build of your app to read the data as `int` has a different size. This is why in the answers provided it all uses variables like I already stated. So if you are going to cast, cast to a variable that stays constant between the different bit editions. – Andrew Truckle Jan 10 '23 at 21:22
  • @AndrewTruckle OTOH `int` is always 32 bits on the Windows platform. – Jabberwocky Jan 11 '23 at 07:38
  • @Jabberwocky in my experience it is not. 32-bit and 64-bit int are different. I have had CArchive break because of this, hence the need to use constant var types. – Andrew Truckle Jan 11 '23 at 08:00
  • @AndrewTruckle strange, just tried `printf("%zu\n", sizeof(int));`. It prints `4` even with x64. Using Visual Studio 2022. – Jabberwocky Jan 11 '23 at 08:09
  • @Jabberwocky Dunno. Maybe user was running windows on Mac. Or maybe a CArchive issue. Not to worry. – Andrew Truckle Jan 11 '23 at 08:17
  • With MFC, I'm pretty sure that only pointers change size; int remains 4 bytes. If you need an 8-byte int, you should use int64. That is not strictly related to the question, though; it's hard to change 15-year-old legacy code to use only some specific data types. – GiordiX Jan 11 '23 at 10:39
  • @AndrewTruckle `int` and `long` are 32 bits wide in MSVC, regardless of the target architecture. The OS that runs the code doesn't make a difference; the compiler has already made that decision. More information [here](https://stackoverflow.com/q/384502). – IInspectable Jan 12 '23 at 00:37
  • @IInspectable OK, maybe it was these INT_PTR variable types then. Odd. But thanks for info. – Andrew Truckle Jan 12 '23 at 02:08

2 Answers


As you have written, ReadCount returns a DWORD_PTR, which is either 32 or 64 bits wide depending on whether the code has been compiled as 32-bit or 64-bit code.

As long as the actual object count fits into 32 bits, there is no problem with interoperability between files that have been written by a 32-bit or a 64-bit program.

On the other hand, if your 64-bit code serializes a CArray that has more than 4294967295 elements (which is unlikely to happen anyway), then you will run into trouble if you want to deserialize this file from a 32-bit program. But in a 32-bit program a CArray cannot store more than 4294967295 elements anyway.

Long story short: you don't need to do anything special, just serialize/deserialize your data.

Andrew Truckle
Jabberwocky
  • Not sure if I understand correctly. Upgrading is not a problem, but what if we serialize (write) our data in a 64-bit build and then deserialize (read) it back in a 32-bit build? We don't have to worry about this either? – Tom Tom Jan 11 '23 at 07:26
  • @TomTom no, this is not a problem at all. Read also IInspectable's answer which is far more detailed. – Jabberwocky Jan 11 '23 at 07:28

Storage and retrieval of the item count for CArray instantiations are implemented in CArchive::WriteCount and CArchive::ReadCount, respectively.

They write and read a 16-bit (WORD), 32-bit (DWORD), or 64-bit (on 64-bit platforms, DWORD_PTR) value to or from the stream. Writing uses the following algorithm:

  • If the item count is less than 0xFFFF, write the item count as a 16-bit WORD value
  • Otherwise, dump an "invalid value" marker ((WORD)0xFFFF) into the stream, followed by
    • 32-bit: The item count as a 32-bit value (DWORD)
    • 64-bit: If the item count is less than 0xFFFF'FFFF, write the item count as a 32-bit DWORD value
      • Otherwise, dump an "invalid value" marker ((DWORD)0xFFFFFFFF) into the stream, followed by the item count as a 64-bit value (DWORD_PTR)
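The writing rules above can be sketched in portable C++ without MFC. This is only an illustration, not the MFC source: `put_le` and `write_count` are hypothetical names, the `is64` flag stands in for the `_WIN64` build switch, and little-endian byte order is assumed (as on x86/x64 Windows).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: append the low `bytes` bytes of `value`,
// least significant byte first (little-endian is an assumption here).
void put_le(std::vector<std::uint8_t>& out, std::uint64_t value, int bytes)
{
    for (int i = 0; i < bytes; ++i)
        out.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
}

// Sketch of the count-writing algorithm described above.
// `is64` emulates a 64-bit build; a 32-bit build never emits the DWORD marker.
std::vector<std::uint8_t> write_count(std::uint64_t count, bool is64)
{
    std::vector<std::uint8_t> out;
    if (count < 0xFFFF) {
        put_le(out, count, 2);            // plain 16-bit count
        return out;
    }
    put_le(out, 0xFFFF, 2);               // 16-bit "invalid value" marker
    if (!is64 || count < 0xFFFFFFFFull) {
        put_le(out, count, 4);            // 32-bit count
        return out;
    }
    put_le(out, 0xFFFFFFFFull, 4);        // 32-bit marker, 64-bit builds only
    put_le(out, count, 8);                // full 64-bit count
    return out;
}
```

Calling `write_count(n, false)` and `write_count(n, true)` with the same `n` below 0xFFFF'FFFF yields the same bytes, which is the property the question is asking about.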

The stream layout is summarized in the following table depending on the item count in the CArray (where ❌ denotes a value that's not present in the stream):

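| Item count n                   | WORD   | DWORD       | DWORD_PTR |
|--------------------------------|--------|-------------|-----------|
| n < 0xFFFF                     | n      | ❌          | ❌        |
| 0xFFFF <= n < 0xFFFF'FFFF      | 0xFFFF | n           | ❌        |
| n == 0xFFFF'FFFF (32-bit only) | 0xFFFF | 0xFFFF'FFFF | ❌        |
| 0xFFFF'FFFF <= n (64-bit only) | 0xFFFF | 0xFFFF'FFFF | n         |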

When deserializing the stream the code reads the item count value, checks to see if it matches the "invalid value" marker, and continues with larger values if a marker was found.
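The reading side can be sketched the same way. Again, this is a hedged illustration rather than the MFC implementation: `get_le` and `read_count` are hypothetical names, `is64` emulates the `_WIN64` build switch, and the stream is assumed to be little-endian.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: read `bytes` little-endian bytes starting at `pos`
// and advance `pos`. `at()` throws if the stream is truncated.
std::uint64_t get_le(const std::vector<std::uint8_t>& in, std::size_t& pos, int bytes)
{
    std::uint64_t value = 0;
    for (int i = 0; i < bytes; ++i)
        value |= static_cast<std::uint64_t>(in.at(pos++)) << (8 * i);
    return value;
}

// Sketch of the count-reading algorithm: widen only when a marker is seen.
// A 32-bit reader (`is64 == false`) always treats the DWORD as the count.
std::uint64_t read_count(const std::vector<std::uint8_t>& in, std::size_t& pos, bool is64)
{
    std::uint64_t count = get_le(in, pos, 2);
    if (count != 0xFFFF)
        return count;                     // 16-bit count, no marker
    count = get_le(in, pos, 4);
    if (!is64 || count != 0xFFFFFFFFull)
        return count;                     // 32-bit count
    return get_le(in, pos, 8);            // 64-bit count after both markers
}
```

Feeding the bytes `FF FF 70 11 01 00` to this sketch gives 70000 for both settings of `is64`, with six bytes consumed in each case.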

This works across bitnesses as long as the CArray holds no more than 0xFFFF'FFFE values. For 32-bit platforms this is always true; you cannot have a CArray that uses up the entire address space.

When serializing from a 64-bit process you just need to make sure that there aren't any more than 0xFFFF'FFFE items in the array.
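One way to enforce that limit is a small guard that runs before storing. The sketch below is an assumption about how such a check might look; `kMaxPortableCount` and `is_portable_count` are hypothetical names and not part of MFC.

```cpp
#include <cstdint>

// Largest item count that 32-bit and 64-bit readers decode identically
// (0xFFFF'FFFF itself doubles as the DWORD marker in 64-bit builds).
constexpr std::uint64_t kMaxPortableCount = 0xFFFFFFFEull;

// Hypothetical guard: true if a container with `count` items can be
// serialized from a 64-bit build and still be read by a 32-bit build.
constexpr bool is_portable_count(std::uint64_t count)
{
    return count <= kMaxPortableCount;
}
```

In a 64-bit MFC build, the value passed in would presumably be the array's `GetSize()` result converted to `std::uint64_t`, checked on the storing path before calling `Serialize`.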


Summary:

For CArrays with less than 0xFFFF'FFFF (4294967295) items, the serialized stream is byte-for-byte identical regardless of whether it was created on a 32-bit platform or a 64-bit platform.

There's the odd corner case of a CArray with exactly 0xFFFF'FFFF items on a 32-bit platform¹. If that were to be streamed out and read back in on a 64-bit platform, the size field in the stream would be mistaken for the "invalid value" marker, with catastrophic consequences. Luckily, that is not something we need to worry about. 32-bit processes cannot allocate containers that are a multiple of available address space in size.

That covers the scenario where a stream serialized on a 32-bit platform is consumed on a 64-bit platform. Everything works as designed, in practice.

On to the other direction then: A stream created on a 64-bit platform to be deserialized on a 32-bit platform. The only relevant disagreement here is containers larger than what a 32-bit program could even represent. The 64-bit serializer will drop an "invalid value" marker (DWORD) followed by the actual item count (DWORD_PTR)². The 32-bit deserializer will assume that the marker (0xFFFF'FFFF) is the true item count, and fail the subsequent memory allocation without ever looking at the actual item count. Things are torn down from there using whatever exception handling is in place, before any data corruption can happen³.

This is not a novel error mode, unique to cross-bitness interoperability, though. A CArray serialized on a 32-bit platform can fail to be deserialized on a 32-bit platform just as well, if the process runs out of resources. This can happen far earlier than running out of memory, since CArrays need contiguous memory.


¹ Row 3 in the table above.
² Row 4 in the table above.
³ This is assuming there's no catch(...) up the call stack that just keeps ignoring.

IInspectable