
I have two buffers (example sizes):

char c[512];
QChar q[256];

Assume 'c' contains a multibyte character string (UTF-8). I need to convert it to a QChar sequence and place it in 'q'. A good example of what I need is the MultiByteToWideChar function.
IMPORTANT: this operation shall not involve any explicit or implicit memory allocations, except for additional allocations on the stack, maybe. Please, do not answer if you are not sure what the above means.

Pavele
  • Welcome to StackOverflow! Your question raises a question for me (and, I guess, for others too): "Why no allocations?" This seriously limits the use of anything in Qt, because most Qt classes use PIMPL. – Martin Hennings Apr 05 '19 at 09:53
  • Also, do you know the codec of your multibyte string? (I guess so because of the fixed array sizes in relation 2:1) – Martin Hennings Apr 05 '19 at 09:54
  • Hi Martin. A logical question. The point here is memory fragmentation and utilization in high-frame-rate operations, so PIMPL is a bit of a burden in this particular case. About the codecs: Qt copes with that matter (for example when creating a QString from a QByteArray or const char*); I do not want anything else, just external placeholders. The Windows API provides functions that convert wide-char strings to multibyte strings and vice versa. Why doesn't Qt? (Or does it?) – Pavele Apr 05 '19 at 11:33
  • I beg to differ - Qt interprets const char * as UTF-8 except if you explicitly tell it otherwise. Your char array can hold 512 characters. In UTF-8 that will be up to 512 QChars. – Martin Hennings Apr 05 '19 at 11:45
  • By "multibyte" you probably mean the current ANSI code page in Windows, don't you? (Most of those are in fact _single_-byte. :) ) It looks like not everybody is aware of the term, so it may be better to clarify it in the question. – max630 Apr 07 '19 at 06:58

1 Answer


QChar contains a ushort as its only member, so its size is sizeof(ushort).

In a QString context it represents a UTF-16 code unit, not a full code point: a character outside the Basic Multilingual Plane takes two QChars (a surrogate pair).

So it's all about encoding here.

If you know your char const * is UTF-16 data in the same endianness / byte order as your system, simply copy it:

memcpy(q, c, sizeof(c)); // 512 bytes fill exactly 256 QChars
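If the byte order of the data is known but may differ from the host's, the code units can be assembled byte by byte instead, still without allocating. A minimal sketch, assuming little-endian input; copyUtf16LE is a hypothetical helper, not a Qt function, and it writes char16_t as a stand-in for the 16-bit value held by QChar:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of Qt: assembles UTF-16 code units from
// little-endian bytes, independent of the host byte order.
// Returns the number of code units written; stops when `out` is full.
inline std::size_t copyUtf16LE(const char *in, std::size_t inBytes,
                               char16_t *out, std::size_t outCap)
{
    assert(in != nullptr || inBytes == 0);
    assert(out != nullptr || outCap == 0);
    std::size_t n = inBytes / 2; // a trailing odd byte is ignored
    if (n > outCap)
        n = outCap;
    for (std::size_t i = 0; i < n; ++i) {
        const unsigned lo = static_cast<unsigned char>(in[2 * i]);
        const unsigned hi = static_cast<unsigned char>(in[2 * i + 1]);
        out[i] = static_cast<char16_t>(lo | (hi << 8));
    }
    return n;
}
```

For big-endian input, the roles of the two bytes would be swapped.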

If you want to initialize a QString with your const char * data, you could just interpret it as UTF-16 using QString::fromRawData():

QString strFromData = QString::fromRawData(reinterpret_cast<const QChar*>(c), 256);
// where 256 is sizeof(c) / sizeof(QChar), i.e. the number of QChars, not bytes

Then you don't even need the QChar q[256] array.

If you know your data is UTF-8, you should use QString::fromUtf8() and then simply access its inner memory with QString::constData().

With QString and UTF-8, I don't know of any method that completely prevents heap allocations. But the approach above should allocate only twice: once for the PIMPL of QString and once for the UTF-16 string data.


If your input data is encoded as UTF-8, the answer is no: you cannot convert it using Qt without heap allocations.

Proof: Looking at the source code of qtbase/src/corelib/codecs/qutfcodec.cpp we see that all functions for encoding / decoding create new QString / QByteArray instances. No function operates on two arrays as in your question.
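If doing this one step outside Qt is acceptable, the decoding can be written by hand so that it operates on two caller-owned arrays. The following is a minimal sketch, not Qt API: utf8ToUtf16 is a hypothetical name, the output type is char16_t as a stand-in for the 16-bit value held by QChar, and the choice to reject malformed input rather than substitute a replacement character is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of Qt: decodes UTF-8 from `in` into UTF-16
// code units in `out` without allocating. Returns the number of code units
// written, or -1 if the input is malformed or `out` is too small.
inline long utf8ToUtf16(const char *in, std::size_t inLen,
                        char16_t *out, std::size_t outCap)
{
    assert(in != nullptr || inLen == 0);
    assert(out != nullptr || outCap == 0);
    std::size_t o = 0;
    for (std::size_t i = 0; i < inLen; ) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp;   // code point being assembled
        std::size_t extra;  // number of continuation bytes expected
        std::uint32_t min;  // smallest code point valid for this length
        if (b0 < 0x80)                { cp = b0;        extra = 0; min = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; min = 0x10000; }
        else return -1;                     // invalid lead byte
        if (inLen - i <= extra) return -1;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) return -1;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        i += extra + 1;
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return -1;  // overlong form, out of range, or encoded surrogate
        if (cp < 0x10000) {
            if (o + 1 > outCap) return -1;
            out[o++] = static_cast<char16_t>(cp);
        } else {
            if (o + 2 > outCap) return -1;
            cp -= 0x10000;  // split into a surrogate pair
            out[o++] = static_cast<char16_t>(0xD800 + (cp >> 10));
            out[o++] = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return static_cast<long>(o);
}
```

In a Qt build, the output parameter could presumably be a QChar array instead, with each element constructed from the 16-bit value. Note that 512 bytes of plain ASCII decode to 512 code units, so with the buffer sizes from the question the sketch would return -1 rather than overflow 'q'.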

Martin Hennings
  • Thanks, Martin. That would be straightforward. What about UTF-8? I guess a good example of what I need could be MultiByteToWideChar (https://learn.microsoft.com/en-us/windows/desktop/api/stringapiset/nf-stringapiset-multibytetowidechar) – Pavele Apr 05 '19 at 12:04
  • Thanks, Martin. I guess that means "no, there is no way of doing that". – Pavele Apr 08 '19 at 09:22
  • Even when using `MultiByteToWideChar` you need to specify the encoding. – Martin Hennings Apr 08 '19 at 11:09
  • I believe that either your constraints are too tight or Qt isn't the right tool for the job. You could probably do the conversion with Qt without allocating the strings on the heap, e.g. with `QTextStream`, but you'd need a `QTextCodec` and that would be created on the heap again. – Martin Hennings Apr 08 '19 at 11:26