I have huge problems with encodings. I'm scraping text from some other sites with file_get_contents()
. And the quotes becomes special odd characters or questionmarks. But the strange thing is that some text from different sites ARE utf-8, but the quotes becomes different things when I receive it. When I run utf8_decode()
a quote from one utf-8 text becomes a quote. Bot in another utf-8 text from another site it becomes a questionmark.
Is there any way to fix so all text is looking good when I save it to db.
The charset in database table is latin1_swedish_ci
, and I have tried to change it to utf8_unicode_ci
but did no difference.
Edit:
Have now tried a little bit more. These two works for different texts. This one works for one text:
$source = utf8_encode($source);
And this are working for the others:
$source = mb_convert_encoding($source, 'HTML-ENTITIES', 'utf-8');
But you can't put the string through both. They are not working together. They destroy the other ones for each other.
Printscreen without any encoding (text is in Swedish):
Edit:
FYI: I have now changed the table to utf8_unicode_ci
. However, still not working. Here are all the functions I've tried with:
Actually, if I just leave it like this, most of the texts are outputted with right characters. It's just some where "
becomes ”
.