I am using Acrobat Professional to create a PDF form that is filled in with Arabic text, and I wrote a PHP script that reads the form fields and inserts them into a MySQL database. The Arabic word محمد ends up as ãÍãÏ. I tried converting the text with utf8_encode() and utf8_decode(), without success. The database collation is utf8_bin. Any solutions?

<?php
error_reporting(E_ALL & ~E_NOTICE);
$servername = "localhost";
$username = "root";
$password = "";
$dbname = "dbTest";

// Default to an empty string so $txtName is always defined.
$txtName = isset($_POST['txtName']) ? $_POST['txtName'] : '';

$link = mysqli_connect($servername, $username, $password, $dbname) or die("Unable to connect");
$qq1 = "set character_set_server='utf8'";
$qq2 = "set names 'utf8'";

//mysqli_query($link, $qq1) or die(mysqli_error($link));
//mysqli_query($link, $qq2) or die(mysqli_error($link));
//$txtName = iconv("UTF-8//TRANSLIT//IGNORE", "Windows-1252//TRANSLIT//IGNORE", $txtName);
//$txtName = iconv("Windows-1256//TRANSLIT//IGNORE", "UTF-8//TRANSLIT//IGNORE", $txtName);

$query = "INSERT INTO  tblTest Values('$txtName')";

mysqli_query($link, $query) or die(mysqli_error($link));

?>

And here is the database:

CREATE TABLE IF NOT EXISTS `tbltest` (`txtName` varchar(244) COLLATE utf8_bin DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;


INSERT INTO `tbltest` (`txtName`) VALUES
('ãÍãÏ Úæäí ãÍãæÏ');
  • If the text you want to handle is UTF-8 encoded then all is fine. Do _not_ try to somehow change that. Instead take care that all components in your processing chain do use UTF-8 encoding: the http server, the php engine, the database with its tables and connection. http://stackoverflow.com/questions/279170/utf-8-all-the-way-through – arkascha May 24 '16 at 05:50
  • @arkascha I have already done this; my code is UTF all the way. – Mohammad Awni Ali May 24 '16 at 05:55
  • Then what is the issue? – arkascha May 24 '16 at 05:57
  • @arkascha The Arabic text محمد, which is UTF-8 encoded, arrives from the PDF in PHP as ???? before I encode it to UTF-8, and then it is ãÍãÏ in the database. – Mohammad Awni Ali May 24 '16 at 05:59
  • Then obviously _not_ all of your tool chain components use UTF-8 all the way through. Something modifies your data. That is what you have to change. Note that I am not talking about your code but the settings, as already written above. – arkascha May 24 '16 at 06:00
  • @arkascha I am using everything as PDF. What reaches the database now is ãÍãÏ Úæäí ãÍãæÏ Úáí, and I need to encode it back to Arabic. – Mohammad Awni Ali May 24 '16 at 06:14
  • No, you don't need to re-encode anything. On the contrary, you should prevent anything from getting changed at all. If that PDF is _really_ using UTF-8, then a UTF-8 string should be sent to the server, should be processed by PHP as UTF-8 and should be stored as UTF-8 inside the database. So it _stays_ UTF-8 encoded all the way through, which currently is _not_ the case with your setup. There is no sense in first losing the correct encoding and then blindly trying to fix that later again. That won't work. Instead you should identify which part breaks the encoding and fix that. – arkascha May 24 '16 at 06:22
  • Don't get this wrong, but I have the impression that you do not really understand what the article I referenced in my first comment actually tells you. You have to understand how the components you use work together and how you have to set things up. So if you say above "my code is UTF all the way"... what do you actually mean by that? What encoding does your HTTP server use? What encoding does your PHP engine use? What encoding does your database connection use? What encoding have your database tables been set up with? Can you answer all that? – arkascha May 24 '16 at 06:25
  • One key piece of information is missing: how do you get the values from the PDF form fields? – Jan Slabon May 24 '16 at 14:02
  • Maybe this is a duplicate of http://stackoverflow.com/questions/12604171/data-encoding-when-submitting-a-pdf-form-using-acroform-technology – Jan Slabon May 24 '16 at 14:17
  • @Setasign I edited the question and added an example; kindly check. And no, it's not a duplicate of the question you linked. – Mohammad Awni Ali May 24 '16 at 14:44

1 Answer

It's still not clear how you submit the data, but the answer is available in the similar question mentioned above.

Until PDF 2.0, the PDF specification neither defines the charset for a standard submit action nor lets you adjust it, so the reader application may choose whatever charset it considers appropriate.

Use a JavaScript action to submit the form data with a specific charset instead of a submit action:

submitForm({
    cURL: "https://www.example.com/your-script.php",
    cSubmitAs: "HTML",
    cCharset: "utf-8"
});
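
The garbled value from the question fits the point above about the reader choosing its own charset. The following is only a diagnostic sketch for Node.js (not Acrobat JavaScript), and it rests on an assumption the question does not confirm: that the form was submitted as Windows-1256 and the bytes were later read as Latin-1, one byte per character. The function name `reinterpretAsWindows1256` is hypothetical, and `TextDecoder` only knows `windows-1256` in Node.js builds with full ICU data.

```javascript
// Diagnostic sketch only. Assumption: each garbled character is one
// Windows-1256 byte that was misread as a Latin-1 character.
function reinterpretAsWindows1256(garbled) {
  // Turn every character back into the single byte it came from.
  const bytes = Uint8Array.from(garbled, (ch) => {
    const code = ch.codePointAt(0);
    if (code > 0xff) {
      throw new RangeError(`not a single-byte character: ${ch}`);
    }
    return code;
  });
  // Decode those bytes with the charset we assume the reader used.
  return new TextDecoder("windows-1256").decode(bytes);
}

console.log(reinterpretAsWindows1256("ãÍãÏ")); // prints "محمد"
```

If this round trip reproduces the original text, the encoding is being changed at submission time, which is what the `cCharset` setting above addresses. It is meant for confirming the cause, not as a repair step to run on every request.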
Jan Slabon