
I have a non-ASCII character in my HTML form data, and when Flask processes it, I get an error like this:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 2: ordinal not in range(128)

I believe that I have to decode the request form, but I can't find a way to do that.

Here's what I have:

HTML

<body>

    <div id="avisos">
        <form action="/opcion/avisos_cadastrar/resultado" method="post"> <br>

        <fieldset>
        <legend> Aviso </legend>
        <center> <h3> Cadastrar </h3> </center>
            <br>
            Titulo: <input type="text" name="titulo" maxlength="32" autocomplete='off'>
            <textarea name="aviso" id="text-area" rows="10" cols="50" maxlength="512" autocomplete='off'> </textarea>
            <span id="message"></span>
            <br>
            <input type=submit value="enviar">

            <script>
                var area = document.getElementById("text-area");
                var message = document.getElementById("message");
                var maxLength = 512;
                var checkLength = function() {
                    if(area.value.length <= maxLength) {
                        message.innerHTML = (maxLength - area.value.length) + " caracteres restantes.";
                    }
                }
                setInterval(checkLength, 150);
            </script>

        </fieldset>

        </form>
    </div>

</body>

FlaskApp

@app.route("/opcion/avisos_cadastrar/resultado", methods = ['POST'])
def avisos_cadastrar_resultado():
    __titulo = request.form['titulo']
    __aviso  = request.form['aviso']
    query_insert_aviso = " INSERT INTO tjs_stage.avisos (Titulo, Aviso) VALUES ('{} ', '{} ')" .format(__titulo,__aviso)
    cur.execute(query_insert_aviso)
    con.commit()
    return render_template('resultado.html')

I tried to use something like..

__titulo = request.form['titulo'].decode("utf-8")
__aviso  = request.form['aviso'].decode("utf-8")

...and also...

__titulo = request.form['titulo'].decode("raw_unicode_escape")
__aviso  = request.form['aviso'].decode("raw_unicode_escape")

...but it didn't work.

Maybe something is missing in my HTML or in the FlaskApp, but I'm a little bit lost.

Any ideas?

ivanleoncz
  • Does the traceback indicate that your error is in fact coming from the lines where you set the value of `__titulo` and `__aviso`? Seems more likely that it's coming from `cur.execute()`. Also, you need to be careful when executing SQL queries with data from forms. You should escape that data to prevent any potential SQL injection attacks from malicious users. – MS-DDOS Apr 08 '16 at 19:13
  • Nice tip @TylerS, thank you so much. I didn't think about the possibility of SQL injection. I'll improve this. Thank you! About the traceback: it really is generated when the HTML form contains the character Ñ. I'm using Python 2.7.6, which uses ASCII encoding by default. Obviously, Ñ is not a character in the ASCII table. I have to admit that I've never been through this situation. – ivanleoncz Apr 08 '16 at 19:17
  • You're welcome, don't forget to give a little upvote love ;) About the encoding, I don't think this is a general Python issue; I think it has to do with one of the packages you are using in your script. Python plays very nicely with Unicode, but give this a shot. At the top of your script: `import sys`, then `reload(sys)`, and finally `sys.setdefaultencoding('utf-8')`, per [this](http://geekforbrains.com/post/setting-the-default-encoding-in-python) post. If it doesn't work I have another theory... – MS-DDOS Apr 08 '16 at 19:33
  • Ok. The problem is that some people have already mentioned this as a workaround, but not as good practice. Could you share your theory? And please post your initial comment as an answer, and then I can upvote your tip ;). – ivanleoncz Apr 08 '16 at 19:40
  • You can upvote comments as well as answers. After we've found and fixed the problem I'll compile it all together and post a comprehensive answer. This helps other people who have similar problems and keeps them from sifting through lots of comments to find the solution. Did adding the `sys.setdefaultencoding()` work? If so, I'm sure we can find a more permanent solution. – MS-DDOS Apr 08 '16 at 19:45
  • Probably because of my low initial reputation, I can't upvote comments, only answers ;(. And yes, it solved the problem! Now I have to check the charset of the MySQL table, because it stored the Ñ as another strange character. Thank you @TylerS ;) – ivanleoncz Apr 08 '16 at 19:51
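
The "strange character" mentioned in the last comment is typically what appears when UTF-8 bytes are read back through a single-byte charset. The following is only a sketch of that effect in Python 3; it assumes the mismatch is between UTF-8 and cp1252, which the thread does not confirm.

```python
# Assumption: the stored bytes are UTF-8 but are read back as cp1252.
original = "\u00d1"                      # the character Ñ
utf8_bytes = original.encode("utf-8")    # two bytes: 0xC3 0x91
misread = utf8_bytes.decode("cp1252")    # each byte becomes its own character

# Reversing the wrong decoding recovers the original text.
recovered = misread.encode("cp1252").decode("utf-8")
```

If the connection and the table both use UTF-8, this round trip should not be needed.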

1 Answer


Does the traceback indicate that the error is in fact coming from the lines where you set the values of __titulo and __aviso? It seems more likely that it's coming from cur.execute(). Also, you need to be careful when executing SQL queries with data from forms. You should escape that data to prevent potential SQL injection attacks from malicious users.
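
One common way to handle the escaping is to pass the form values as query parameters instead of formatting them into the SQL string. The sketch below is not the asker's actual setup: it uses an in-memory sqlite3 database as a stand-in for the MySQL connection, and the helper name insert_aviso is hypothetical. sqlite3 uses "?" placeholders, while MySQL drivers such as MySQLdb or PyMySQL use "%s".

```python
import sqlite3

# Assumption: sqlite3 stands in for the MySQL `con` / `cur` from the question.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE avisos (Titulo TEXT, Aviso TEXT)")

def insert_aviso(cur, titulo, aviso):
    # The values travel separately from the SQL text, so the driver
    # handles quoting and no str.format() call touches the Unicode data.
    cur.execute("INSERT INTO avisos (Titulo, Aviso) VALUES (?, ?)", (titulo, aviso))

# A non-ASCII title and a body that would break a string-formatted query.
insert_aviso(cur, u"Ni\xf1o", u"x'); DROP TABLE avisos; --")
con.commit()

rows = cur.execute("SELECT Titulo, Aviso FROM avisos").fetchall()
```

Because the query string is never built with .format(), this also avoids one likely source of the UnicodeEncodeError in Python 2, where formatting a byte string with a unicode value triggers an implicit ASCII encode.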

To test this, try changing Python's default encoding to UTF-8, as described here. To do this, add the following lines to the top of your script:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')

However, in general the use of this method is discouraged. There are several more permanent solutions addressed in this Stack Overflow question, but the "real" answer is to upgrade to Python 3, where strings are Unicode by default and the default encoding is UTF-8.
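
As a rough illustration of the Python 3 behaviour, the sketch below uses a hypothetical form value (titulo is not taken from a real request). It shows that encoding to ASCII still fails, while an explicit UTF-8 encode and decode round-trips cleanly.

```python
# Hypothetical form value; in Python 3, str is always Unicode text.
titulo = "Pe\u00f1a"

# Encoding to ASCII fails in the same way as the error in the question.
try:
    titulo.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Choosing the codec explicitly avoids relying on any default encoding.
data = titulo.encode("utf-8")
restored = data.decode("utf-8")
```

This also suggests why the .decode("utf-8") attempts in the question did not help: in Python 2, form values are already unicode objects, and calling .decode() on one first encodes it implicitly with the ASCII codec.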

Hope this helps!

MS-DDOS