
I want to create a file in PHP with the following conditions:

  1. Generate a file with a combination of String values received from Input Boxes and Hex values side by side
  2. Hex value should be UTF-16 (LE) (BOM).

More information:

  • I want to receive values from the user through input boxes, for example name, last name, age and score
  • The values of age and score should be converted to Hex UTF-16 (BOM) and stored, together with the String values (name and...), in a file. The extension doesn't matter (for example, TXT).

Let me give an example:

Score = 64
Name = Jack
Last Name = Saliv
Age = 21
By default, this file is generated as follows:

64JackSaliv21
And the Hex value becomes this value:
36344A61636B53616C69763231
But I want to get the following value:

䃿JackSaliカ
And the Hex value is:
FFFEFF404A00610063006B00530061006C00690076FFFEFF15


In the following topics, I didn't get the results I wanted!

Chris Haas
Adam Luper
  • Maybe I’m not thinking correctly, but wouldn’t the string `64` be represented as `36 00 34 00` in UTF-16 LE? – Chris Haas May 14 '23 at 17:09
  • @ChrisHaas, Yes, the UTF-16 (LE) value of 64 is 36 00 34 00, but before that, the value of 64 must be converted to Hex, and then the Hex value of 64, which is 40, should be saved in the final file. – Adam Luper May 14 '23 at 20:54
  • So you want to store the number 64, not the string 64. UTF is about encoding characters, not numbers. So `40` isn’t the UTF (LE or any) encoding of 64, it is just the decimal number 64 converted to base 16. I just want to be clear. The only UTF stuff would be the names, then. Right? – Chris Haas May 14 '23 at 21:55
  • The string that you say that you _want_ is complete nonsense. It is two one-byte integers stored in binary, each preceded by a nonsensical UTF16LE BOM, with a UTF16LE string in the middle of them. Please explain why you think you want this string. – Sammitch May 14 '23 at 22:15
  • @ChrisHaas, Yes to some extent, eventually I want to approach something like this answer: https://stackoverflow.com/a/76192895/21197857 – Adam Luper May 14 '23 at 22:19
  • Can you show what you've tried before to solve this issue? – Blue Robin May 15 '23 at 14:00

1 Answer

Reading through your Delphi questions, along with your comments, I really don't fully understand what all this is supposed to do. However, given the input data along with the output, I can at least get pretty close, and match exactly with some additional corrections.

Basically, you can use mb_convert_encoding to get UTF-16LE, and then you can get the bytes as hex strings using unpack. To convert the decimal numbers to hex strings, just use dechex.

From your sample, I'm not certain why the decimal number 64, when converted to the hex string 40, is written as FF 40 and not just 40 or 40 00, but the $finalStringWithNumbersPadded variable has the FF prepended to each number. If the age or score were greater than 255 I'm not really sure what to do, since FF is 1111 1111 and completely full.

Also, I'm assuming you have a typo in your expected bytes: the last character of the last name should be 76 00 and not just 76.

Hopefully the code speaks for itself:

$score = 64;
$firstName = 'Jack';
$lastName = 'Saliv';
$age = 21;

$expected = 'FFFEFF404A00610063006B00530061006C00690076FFFEFF15';
$expectedCorrected = 'FFFEFF404A00610063006B00530061006C0069007600FFFEFF15';

// Convert to UTF-16LE and get hex, https://stackoverflow.com/a/16080439/231316
$nameAsUtf16LE = unpack('H*', mb_convert_encoding($firstName.$lastName, 'UTF-16LE', 'UTF-8'));

// UTF-16LE BOM
$utf16LeBom = 'FFFE';

// Convert to hex strings
$scoreAsHex = dechex($score);
$ageAsHex = dechex($age);

$finalString = sprintf(
    '%1$s%2$s%3$s',
    $utf16LeBom.$scoreAsHex,
    strtoupper($nameAsUtf16LE[1]), // This is 1-based, not 0-based
    $utf16LeBom.$ageAsHex,
);

$numberPadding = 'FF';
$finalStringWithNumbersPadded = sprintf(
    '%1$s%2$s%3$s',
    $utf16LeBom.$numberPadding.$scoreAsHex,
    strtoupper($nameAsUtf16LE[1]), // This is 1-based, not 0-based
    $utf16LeBom.$numberPadding.$ageAsHex,
);

echo 'Calculated         : '.$finalString;
echo PHP_EOL;
echo 'Padded             : '.$finalStringWithNumbersPadded;
echo PHP_EOL;
echo 'Expected           : '.$expected;
echo PHP_EOL;
echo 'Expected Corrected : '.$expectedCorrected;

assert($finalStringWithNumbersPadded === $expectedCorrected);

/* Output:
Calculated         : FFFE404A00610063006B00530061006C0069007600FFFE15
Padded             : FFFEFF404A00610063006B00530061006C0069007600FFFEFF15
Expected           : FFFEFF404A00610063006B00530061006C00690076FFFEFF15
Expected Corrected : FFFEFF404A00610063006B00530061006C0069007600FFFEFF15
*/

Demo: https://3v4l.org/XTdjA#v8.2.6
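
As a rough alternative sketch, the same bytes can be assembled directly as binary instead of going through a hex string first. This assumes, as in the sample, that each number fits in a single byte (0 to 255) and is preceded by the `FF FE` BOM plus an extra `FF` byte. The function name `buildRecord` is hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper: builds the record as raw bytes rather than a hex string.
// Assumes $score and $age each fit in one byte (0-255), as in the sample data.
function buildRecord(int $score, string $firstName, string $lastName, int $age): string
{
    $bom = "\xFF\xFE";   // UTF-16LE byte order mark
    $padding = "\xFF";   // the extra FF byte seen in the expected output

    return $bom . $padding . chr($score)
        . mb_convert_encoding($firstName . $lastName, 'UTF-16LE', 'UTF-8')
        . $bom . $padding . chr($age);
}

$record = buildRecord(64, 'Jack', 'Saliv', 21);

// bin2hex() is only used here to inspect the bytes visually.
echo strtoupper(bin2hex($record)), PHP_EOL;
// FFFEFF404A00610063006B00530061006C0069007600FFFEFF15
```

The result could be passed straight to file_put_contents without a pack step, since it is already binary.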

Edit

To write a string of hex characters to disk as binary data you can use pack:

$bindata = pack('H*', 'FFFEFF404A00610063006B00530061006C0069007600FFFEFF15');
file_put_contents('testing.data', $bindata);
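
To check that the file really holds the bytes rather than the hex text, one option is to read it back and compare. The sketch below uses a temporary file purely for illustration; the variable names are assumptions, not part of the original code.

```php
<?php
// Write the hex string as binary, then read it back to confirm the bytes on disk.
$hex = 'FFFEFF404A00610063006B00530061006C0069007600FFFEFF15';

// A temporary path is used here purely for illustration.
$path = tempnam(sys_get_temp_dir(), 'rec');
file_put_contents($path, pack('H*', $hex));

$bytesOnDisk = file_get_contents($path);
$roundTripped = strtoupper(bin2hex($bytesOnDisk));

// Two hex digits make one byte, so the file is half as long as the hex string.
echo strlen($bytesOnDisk), ' bytes', PHP_EOL;
echo $roundTripped, PHP_EOL;

unlink($path);
```

If the hex text had been saved as-is, the file would be 52 bytes long instead of 26, which is an easy way to spot the "saved as a String" problem.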
Chris Haas
  • Thank you, this is exactly what I wanted. Unfortunately I am not able to give feedback (you need at least 15 reputation to cast a vote), but I am definitely very happy that there are professional users like you on this site. – Adam Luper May 16 '23 at 06:33
  • Is it possible to give an explanation about these values and what they do? strtoupper($nameAsUtf16LE[1]) I mean this value [1]. And what is the use of this '%1$s%2$s%3$s' value? Do these codes need special resources, software or hardware to run on the host? (Do they put pressure on the host?) – Adam Luper May 16 '23 at 06:42
  • Can I add more values to the sprintf command? For example, postal code or ID number (same type as score and age) and address and description (same type as firstName and lastName)? If I want to add these values, what would the code look like? – Adam Luper May 16 '23 at 06:50
  • I'm a weird person that always uses the long-form for [`printf`](https://www.php.net/manual/en/function.sprintf.php) commands. In the string `%1$s`, the `1$` means "the first argument provided after the format string" and the `s` means to interpret it as a string. You are in no way required to use `printf` at all, I just did it because sometimes I think it makes it easier to read instead of concatenating. There's zero difference in this context, however you can also do whatever you want with it: https://3v4l.org/lPrUK – Chris Haas May 16 '23 at 07:18
  • `$nameAsUtf16LE` is a variable created by first converting the provided string to UTF-16LE, and then running the [`unpack`](https://www.php.net/manual/en/function.unpack.php) function on it. `pack` and `unpack` are honestly weird functions that you could spend hours diving into, but just know that `unpack` in this case gets you hex values of the provided string. The result of `unpack` is an array, and for internal reasons there's only one value and it is at index `1`, hence `$nameAsUtf16LE[1]`. The `strtoupper()` is just to force the hex to be upper case to match your samples, `fffe` vs `FFFE`. – Chris Haas May 16 '23 at 07:25
  • In the grand scheme of things I don't think this code would be noticeable at all. If you click the 3v4l link provided you'll see a Performance tab which shows the insignificance. "Do these codes need special resources ... to run on the host" - Just PHP with the [mbstring extension](https://www.php.net/manual/en/book.mbstring.php) which I think is pretty commonly installed by default these days. – Chris Haas May 16 '23 at 07:31
  • Great, what method should I use to store the final value (eg FFFE404A00610063006B00530061006C0069007600FFFE15)? (I want the final value to be saved as Hex in a file, that is, if we open the file with the Hex Editor, this value will display FFFE404A00610063006B00530061006C0069007600FFFE15) I used the method https://stackoverflow.com/a/9973915/21197857, but it did not save the Hex value, but saved the value itself as a String! – Adam Luper May 16 '23 at 11:43
  • @AdamLuper, I've added a sample at the bottom showing how to write to disk. I will note that creating a string representation of the data for visual inspection, then converting that back to binary as we're doing, is a little odd/inefficient. That said, unless you were processing lots of data I don't really think it matters much, and because this is a little strange (no offense) it makes it much easier to visually debug and possibly test. – Chris Haas May 16 '23 at 13:26
  • You taught me many new lessons, I am grateful to you. I mostly wanted to ask this question in PHP to learn, and I learned a lot. I really apologize if the questions I asked were simple and strange; I was very interested to see the results of code with a similar approach on two different platforms (Delphi and PHP), and I didn't imagine that PHP could be so flexible. Of course I think that this code is not very useful for others, or is outdated, or maybe not useful at all for someone else, but what I learned from you was great for me. – Adam Luper May 16 '23 at 14:43
  • I encountered a problem and I want to find a basic solution for it. If the Hex values of Age and Score are single characters (for example, the values 10 to 15, whose Hex becomes A, B, C, D, E, F), how should I put a zero before these single-character values so that they are not mixed with the values next to them? In other words, how do I turn an odd number of digits into an even number? For example, if I'm not mistaken, the value A should be 0A, so that the single-character value becomes two characters (given that Hex values come in pairs -> 00). – Adam Luper May 16 '23 at 17:14
  • You are jumping around between several different encodings and number systems, which is really creating confusion, and you are also (effectively) creating a binary file format. There's a BOM at the beginning and towards the end for unclear reasons, and as we noted, attaching a BOM to numeric data doesn't make any sense. Most/all binary file formats have a well-defined structure: either fixed-length entries, sigils for data interpretation/termination, byte counts for variable-length structures, or byte pointers. Once this file gets "perfected", I'm not really sure how you'd read it back. – Chris Haas May 16 '23 at 18:07
  • Anyway, another way to convert an integer to a hex string is to use sprintf, see: https://stackoverflow.com/a/47789382/231316, but you might also need to convert the endianness: https://stackoverflow.com/a/35100432/231316 – Chris Haas May 16 '23 at 18:10
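
The zero-padding and endianness points raised in the comments above can be sketched roughly as follows. The choice of a 16-bit little-endian integer for values above 255 is an assumption made for illustration only; the question never defines how larger numbers should be stored.

```php
<?php
// dechex() returns the shortest form, so 10 becomes "a" (a single digit).
$unpadded = dechex(10);

// sprintf('%02X') pads to at least two upper-case hex digits instead.
$padded = sprintf('%02X', 10);

// For values above 255 one byte is not enough. pack('v') writes an
// unsigned 16-bit integer in little-endian byte order (low byte first).
// Assumption: 16 bits is wide enough for the numbers being stored.
$wide = strtoupper(bin2hex(pack('v', 300)));  // 300 = 0x012C, stored as 2C 01

echo $unpadded, ' ', $padded, ' ', $wide, PHP_EOL;
// a 0A 2C01
```

Using a fixed width for every number also makes the file easier to read back, since the reader knows how many bytes each field occupies.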