
I am trying to use the superscript digits ¹ to ⁹ in PHP. I am using utf8_decode() for this, and it works fine for ¹, ² and ³, but not for the rest: for ⁴ to ⁹ I get "?" as output. Why does this happen?

<?php 

$val =  utf8_decode("⁴");
echo $val;

?>

I have also tried

json_decode("\u{2074}")

but that does not work either. (Reference: https://wiki.freepascal.org/Unicode_subscripts_and_superscripts)

EDIT: I have now integrated the https://github.com/neitanod/forceutf8 library to force my PHP output into UTF-8. I am trying to encode the string ⁴ directly, but it still does not work: the output is "?".

<?php

require_once "./vendor/forceutf/src/ForceUTF8/Encoding.php";
use ForceUTF8\Encoding;

$val = Encoding::fixUTF8("⁴");
echo $val;

?>

Max
  • You should use the HTML `<sub>` and `<sup>` elements for subscripts and superscripts. Unicode only provides some of these characters for compatibility and does not cover all possibilities (and it is not the recommended way to do it in HTML) – Giacomo Catenazzi Sep 14 '21 at 11:30
  • Huh, what's the point of using `utf8_decode` here? That changes the encoding to ISO-8859-1, but you are still _telling_ the client via your content-type header that the following content is in UTF-8? And what actual _problem_ are you even trying to solve here? `echo "⁴";`, and done, no? Why the juggling with different encodings to begin with? – CBroe Sep 14 '21 at 11:30
  • @CBroe you are right, sorry, that was a mistake; it was not supposed to be there. The problem is that `echo $val` outputs `?`. – Max Sep 14 '21 at 11:32
  • That's because there is no `⁴` in ISO-8859-1... – piet.t Sep 14 '21 at 11:36
  • The answer to what is different about the first three is already (kinda) contained in the page you linked to: _"Superscript one(¹), two (²) and three (³): See [UTF-8_Latin_characters](https://wiki.freepascal.org/Unicode_Latin-1_supplement_characters)"_. These first three are part of the Latin-1 block in Unicode, whereas the others are located in the separate "subscripts and superscripts" block. – CBroe Sep 14 '21 at 11:36
  • @piet.t Oh no, what would be the alternative then? – Max Sep 14 '21 at 11:36
  • What character encoding are you using for your site? Are you _not_ using UTF-8? – CBroe Sep 14 '21 at 11:37
  • @CBroe No, I don't think so. How do I display all the digits from 1 to 9 as superscripts if only the first three are in the Latin-1 block? – Max Sep 14 '21 at 11:37
  • If you are not using UTF-8 (or another Unicode encoding variant), then you will have to use numeric HTML references. https://www.toptal.com/designers/htmlarrows/math/superscript-four/ – CBroe Sep 14 '21 at 11:38
  • @CBroe I was trying `json_decode("\u{2074}")`, but I get an empty string as output. – Max Sep 14 '21 at 11:41
  • The curly braces do not belong there, and you would have to decode a value that is a string _in_ JSON: `json_decode('"\u2074"')`, see https://3v4l.org/iATSs. But that still won't do you any good if your page is not using UTF-8 to begin with. If it were, you could have put `⁴` directly into your (script) files, wherever you need this character. – CBroe Sep 14 '21 at 11:47
  • @CBroe okay, thank you. I shall try adding UTF-8 encoding to my PHP file. – Max Sep 14 '21 at 11:59
  • @CBroe please take a look at my question. I have updated it. – Max Sep 14 '21 at 12:42
  • You probably have to do more than that. If, for example, your PHP script itself is not saved in UTF-8, then you don't have the character `⁴` in there in a proper input encoding to begin with. The tool you tried to use there should not be necessary at all if you properly switch your site to UTF-8; at most it would be needed for external data. https://stackoverflow.com/questions/279170/utf-8-all-the-way-through has some more hints on what to pay attention to. – CBroe Sep 14 '21 at 13:09
  • @CBroe it is so confusing and I am still not able to find a possible way :( – Max Sep 14 '21 at 13:37
  • Well, it is a rather complex topic, so don't expect to be done with the whole thing in five minutes, _if_ you really want to change the encoding of your whole site now. (So far, it appears you could not even properly answer the question of which one it is _currently_ using without guessing? Then you definitely have to look into the matter in more detail.) As already said, just using the numeric HTML references would work in any case; that is independent of the character encoding the page uses. – CBroe Sep 14 '21 at 13:52
  • @CBroe actually I am not using any HTML. The thing is, I am using FPDF to output a PDF file with contents. From https://stackoverflow.com/questions/6334134/fpdf-utf-8-encoding-how-to I have found out that it uses ISO-8859-1 or Windows-1252. – Max Sep 14 '21 at 14:05
  • Well, way to bury the lede... Why did you let us talk about the character encoding of your site all this time, when that's not actually the relevant part? But, as we already covered, `⁴` doesn't even exist in ISO-8859-1, and as far as I can see not in Windows-1252 either. I couldn't tell you if the numeric HTML references would work in that context, so you are probably better off first switching PDF libraries (also mentioned in the thread you linked to) to one that can handle UTF-8. – CBroe Sep 14 '21 at 14:18
  • @CBroe oh no, it is far too late for that now :( I wish there was another way to achieve this. Why wouldn't FPDF use UTF-8?! :( – Max Sep 14 '21 at 14:20
  • FPDF can write HTML, so I'd at least try and use numeric references, and see if that works. – CBroe Sep 14 '21 at 14:21
  • @CBroe apparently they have an alternate way to get superscripts - http://www.fpdf.org/en/script/script61.php. I shall try this. – Max Sep 14 '21 at 14:24
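
The points made in the comments above can be sketched in plain PHP. This is only an illustration, not a tested FPDF solution: the helper names `superscriptCodePoint`, `superscriptFitsLatin1` and `superscriptEntity` are hypothetical and do not come from any library. The sketch shows which superscript digits fit into ISO-8859-1 (only code points up to U+00FF do), how a numeric HTML reference is built as pure ASCII, and the `json_decode` call in the form CBroe suggested.

```php
<?php
declare(strict_types=1);

// Unicode code points of the superscript digits 0-9.
// Only 1, 2 and 3 lie in the Latin-1 range (U+0000..U+00FF).
const SUPERSCRIPT_CODE_POINTS = [
    0x2070, 0x00B9, 0x00B2, 0x00B3, 0x2074,
    0x2075, 0x2076, 0x2077, 0x2078, 0x2079,
];

// Hypothetical helper: look up the code point for a digit 0-9.
function superscriptCodePoint(int $digit): int
{
    if ($digit < 0 || $digit > 9) {
        throw new InvalidArgumentException('Expected a digit from 0 to 9.');
    }
    return SUPERSCRIPT_CODE_POINTS[$digit];
}

// Hypothetical helper: true if the superscript digit exists in ISO-8859-1,
// which maps its 256 byte values directly to U+0000..U+00FF.
function superscriptFitsLatin1(int $digit): bool
{
    return superscriptCodePoint($digit) <= 0xFF;
}

// Hypothetical helper: numeric HTML reference. The result is plain ASCII,
// so it does not depend on the encoding of the page or script file.
function superscriptEntity(int $digit): string
{
    return sprintf('&#x%04X;', superscriptCodePoint($digit));
}

// JSON escape without curly braces, wrapped in a JSON string literal.
// Single quotes keep PHP from interpreting the backslash itself.
$four = json_decode('"\u2074"');

echo bin2hex($four), "\n";                              // e281b4 (UTF-8 bytes of U+2074)
echo superscriptEntity(4), "\n";                        // &#x2074;
echo superscriptFitsLatin1(3) ? 'yes' : 'no', "\n";     // yes
echo superscriptFitsLatin1(4) ? 'yes' : 'no', "\n";     // no
```

The result of `json_decode` is still a UTF-8 string, so it does not help when the output target only understands ISO-8859-1 or Windows-1252. Whether FPDF renders numeric references was not confirmed in this thread, so that part would need to be tried out.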
