I'm sorry my title is not better, but I'm not even sure how to categorize this problem. I know this has to do with encoding, but I am not sure how.
I am doing a project for an ESP (email service provider). Their emails are 7-bit encoded, with the UTF-8 character set (which doesn't really make sense to me).
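For context, the two settings are not necessarily contradictory: a 7-bit body can still be declared as UTF-8 if every non-ASCII character is written as an HTML entity, so that only ASCII bytes are sent. Below is a minimal sketch of checking and enforcing that, assuming the mbstring extension is available. The helper names `isSevenBitClean` and `toSevenBitHtml` are made up for illustration and are not part of the ESP's API.

```php
<?php
// Hypothetical helper: true if every byte in the string is plain 7-bit ASCII.
function isSevenBitClean(string $s): bool
{
    return preg_match('/[^\x00-\x7F]/', $s) === 0;
}

// Hypothetical helper: rewrite non-ASCII characters as numeric HTML entities.
function toSevenBitHtml(string $utf8Html): string
{
    // Convmap layout: [first code point, last code point, offset, mask].
    return mb_encode_numericentity($utf8Html, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');
}

$footer = "\u{00A9} 2012";        // copyright sign stored as the UTF-8 bytes C2 A9
$safe   = toSevenBitHtml($footer);

var_dump(isSevenBitClean($footer)); // bool(false)
var_dump(isSevenBitClean($safe));   // bool(true)
echo $safe, "\n";                   // &#169; 2012
```

Whether the ESP expects entities like this is an assumption; the sketch only shows that a 7-bit body and a UTF-8 charset can coexist.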
Exhibit A:
I get the HTML email text via an API. I then use PHP to modify some of the text (via a str_replace), and then post the new HTML back via the API.
Everything works, except that each post introduces strange characters: every time I run the code, it adds another funky character.
Here is the affected section of the email before I make any changes (this is in "view" mode, i.e. how a browser would see it):
Here is the code that produces that Copyright symbol AND the A with the "acute" symbol above it:
© 2012 H
What's strange is that the only way to get rid of that A with the "acute" symbol above it is to delete the copyright symbol...somehow they are related.
Every time I post to the API via PHP, I get some new funky characters, thus:
1st post:
2nd post:
3rd post:
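Growth like this on every post resembles what happens when UTF-8 bytes are read as ISO-8859-1 and then re-encoded as UTF-8 on each round trip. Here is a minimal sketch that reproduces that kind of growth, assuming (nothing above confirms it) that one step in the get/post cycle makes this mistake. `misdecodeRoundTrip` is a hypothetical name, and the mbstring extension is assumed.

```php
<?php
// Hypothetical stand-in for one get/modify/post cycle that wrongly treats
// UTF-8 bytes as ISO-8859-1 and converts them to UTF-8 again.
function misdecodeRoundTrip(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
}

$text = "\u{00A9} 2012"; // copyright sign: UTF-8 bytes C2 A9, 6 characters in total
$lengths = [];

for ($post = 1; $post <= 3; $post++) {
    $text = misdecodeRoundTrip($text);
    // Each pass turns every non-ASCII character into two characters.
    $lengths[] = mb_strlen($text, 'UTF-8');
    echo "post $post: ", bin2hex($text), "\n";
}

print_r($lengths); // 7, 9, 13 characters after posts 1, 2 and 3
```

In this sketch the byte C2 (the first half of the copyright sign) becomes a separate accented A, which would also explain why deleting the copyright symbol removes the stray character.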
It's so strange...this is the only part that is not working! Please help...this is making me crazy! :-)
EDIT:
Here's the relevant PHP:
Get the HTML from an XML response:
$html = (string)$data;
Replace some stuff:
$newHTML = str_replace($oldExpiresString, $newExpiresString, $html);
Put the new HTML into the xml post variables:
$input = ''.$newHTML.'';
URL-encode it:
$formatted = urlencode($input);
Post via curl:
$postVariables = array(
    'type'     => urlencode($type),
    'activity' => urlencode($activity),
    'input'    => urlencode($input)
);
$rawResponseString = post_url($urlBase, $postVariables);
print $rawResponseString;
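One thing worth checking in the steps above is whether the values end up URL-encoded twice. `post_url` is not shown, so this is only a sketch under the assumption that it builds the request body with `http_build_query` (or something equivalent), which already percent-encodes each value. The sample string stands in for the real XML payload.

```php
<?php
$input = "\u{00A9} 2012"; // sample payload standing in for the real XML

// Encoded once: http_build_query percent-encodes the raw value itself.
$once = http_build_query(['input' => $input]);

// Encoded twice: the value is urlencode()d first, then encoded again,
// so every "%" becomes "%25" and every "+" becomes "%2B".
$twice = http_build_query(['input' => urlencode($input)]);

echo $once, "\n";  // input=%C2%A9+2012
echo $twice, "\n"; // input=%25C2%25A9%2B2012
```

If `post_url` instead hands the array straight to cURL's `CURLOPT_POSTFIELDS`, the body is sent as multipart/form-data with the values unencoded, and calling `urlencode` on the values beforehand would leave the percent signs in the data the API receives. Either way, comparing the raw request body against the expected bytes should show which case applies.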