3

I am converting one database to another database through a script.

While running the script I am getting the following error:

SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xBFbangl...' for column 'country' at row 1  

An exception is thrown where it shows that it tries to set country as "\xcf\xbb\xbf\x62\x61\x6e\x67\x6c\x61\x64\x65\x73\x68".

When I var_dump the value, it just shows as "Bangladesh".

Is there something I have to adjust in my php code?

I have tried this, but phpmyadmin is throwing an #1064 error stating a syntax error in the query.

UPDATE

The scripts is a command in Symfony3:

<?php
namespace My\Bundle\Command;

use My\AccommodationBundle\Entity\Visit;
use My\NewAccommodationBundleV2\Entity\Visit as Visit2;
use My\OtherBundle\Entity\Person;
use Symfony\Bundle\FrameworkBundle\Command\ContainerAwareCommand;
use Symfony\Component\Console\Input\InputArgument;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Input\InputOption;
use Symfony\Component\Console\Output\OutputInterface;

class VisitConvertToNewFormatCommand extends ContainerAwareCommand
{
    private $em;

    protected function configure()
    {
        $this
            ->setName('accommodation:visit:convert')
            ->setDescription("Converting old structure to new structure'")
        ;
    }

    protected function getTimeStamp() {
        $currentTime = new \DateTime('now');
        return '['.$currentTime->format('Y-m-d H:i:s').'] ';
    }


    protected function execute(InputInterface $input, OutputInterface $output)
    {
        $output->writeln($this->getTimeStamp().$this->getDescription());

        $this->em = $this->getContainer()->get('doctrine')->getManager();

        $visits = $this->em->getRepository('MyOldAccommodationBundle:Visit')->findAll();

        foreach ($visits as $visit) {

            $visit2 = new Visit2();
            $visit2->setArrivalDate($visit->getArrivalDate());
            $visit2->setArrivalStatus($visit->getArrivalStatus());
            $visit2->setArrivalTime($visit->getArrivalTime());
            $visit2->setBookingType($visit->getBookingType());
            $visit2->setCountry($visit->getCountry()); // <----
            ...

            $this->em->persist($visit2);
            $user = $visit->getUser();

            if ($user != null) {

                $person = new Person();
                ...
                $person->setFirstName(trim(ucfirst(strtolower($user->getFirstName()))));
                $person->setLastName(trim(ucfirst(strtolower($user->getLastName()))));
                $person->setEmail(preg_replace('/\s+/', '', $user->getEmail()));

                ...

                $this->em->persist($person);

            }

        }
    }
}
Community
  • 1
  • 1
xfscrypt
  • 16
  • 5
  • 28
  • 59
  • 1
    phpmyadmin doesn't throw errors. it's a management interface. mysql-the-database spits out the error and phpmyadmin just reports it to you. what charset is that `\x50...` string in? you'd have to either convert to the charset that your table/field is using, or convert the string to what the db's using. you can't just dump random binary text into a text field and expect it to work. any byte sequence that doesn't conform to the field's charset requirements will cause errors. – Marc B Jul 19 '16 at 15:14
  • i am pulling it from a column that is in the utf8_unicode_ci, and what to put it in another column with the same collation. – xfscrypt Jul 19 '16 at 15:19
  • 1
    then you should show how you're doing it. if ANY stage of the process is using a different charset, the unicode chars can/will get corrupted. – Marc B Jul 19 '16 at 15:20
  • How do you transfer the data? Can you show the script you talk about? – JSchirrmacher Jul 19 '16 at 19:39
  • `50617765` is `Pawe`. Do you know what should follow? `e582xx` is the utf8 encoding for some Chinese character. Did you establish the connection as being utf8? Is the column/table declared utf8 or utf8mb4? – Rick James Jul 19 '16 at 20:15
  • I have updated my question by adding the script. Both columns are declared as utf8_unicode_ci. In the config of doctrine (in config.yml), the charset is set yo UTF8. – xfscrypt Jul 20 '16 at 03:25
  • Is the charset set in the doctrine config? sounds like it should be `doctrine: dbal: charset: utf8`. Also, using `utf8mb4` (and `utf8mb4_unicode_ci`) is recommended, its probably not relevent right now, but as you are updating your database structure, now is the time... – mcfedr Jul 28 '16 at 14:43
  • yes the charset is set to utf8. thanks for the comment on utf8mb4. – xfscrypt Jul 29 '16 at 05:24

2 Answers2

2

\x62\x61\x6e\x67\x6c\x61\x64\x65\x73\x68 is Bangladesh; the \xcf\xbb\xbf before it makes no sense. In latin1 it is Ï»¿. It does not validly convert to utf8. CFBB is ϻ (GREEK SMALL LETTER SAN), but then bf is broken utf8.

I suggest it is coming from the source of the data.

var_dump probably showed nothing because of what device you were displaying it on.

Rick James
  • 135,179
  • 13
  • 127
  • 222
0

Ok after days of trying this did the trick:

$country = iconv("UTF-8", "ISO-8859-1//IGNORE", $country);

I have no idea why it went wrong in the first place. if anybody knows, please share.

xfscrypt
  • 16
  • 5
  • 28
  • 59