To expand a little bit more on the encoding problem...
Any time you see one character in a source turn into two (or more characters), you should immediately suspect an encoding issue, especially if UTF-8 is involved. Here's why. (I apologize if you already know some of this, but I hope to help some future SO'ers as well.)
All characters are stored in your computer not as characters, but as bytes. Back in the olden days, space and transmission time were much more limited than now, so people tried to save every byte possible, even down to not using a full byte to store a character. Now, because we realize that we need to communicate with the whole world, we've decided it's more important to be able to represent every character in every language. That transition hasn't always been smooth, and that's what you're running up against.
Latin-1 (in various flavors) is an encoding that always uses a single 8-bit byte for a character. Which means it can only have 256 possible characters. Plenty if you only want to write English or Swedish, but not enough to add Russian and Chinese. (background on Latin-1)
UTF-8 encodes the first half of Latin-1 in exactly the same way, which is why you see most of the characters looking the same. But it doesn't always use a single byte for a character -- it can use up to four bytes on one character. (utf-8) As you discovered, it uses 2 bytes for é. But Latin-1 doesn't know that, and is doing its best to display those two bytes.
The trick is to always specify your encoding for byte streams (like info from a file, a URL, or a database), and to make sure that encoding is correct. (Sometimes that's a pain to find out, for sure.) Most modern languages, like Java and PHP do a good job of handling all the translation issues between different encodings, as long as you've correctly specified what you're dealing with.