In my UserProfile model I have this method:

def repeated_times(self, test, date):
    return self.user.user_test_results.filter(taken_date__month=date.month, djangotest=test).count()

but I am getting

'utf8' codec can't decode byte 0xe4 in position 169: invalid continuation byte

The details are in this screenshot: http://content.screencast.com/users/doniyor/folders/Jing/media/5baf8537-b48f-4194-8b71-384ec880a7b4/2015-10-23_0753.png

What am I missing?

doniyor

2 Answers

Mitteleuropa is the German term for Central Europe. Mitteleuropäische Zeit is Central European Time.

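In any case, 0xe4 (0b11100100) is a UTF-8 lead byte that announces a three-byte sequence, so the two bytes after it must be continuation bytes of the form 0b10xxxxxx. Here it is followed by i (0x69, 0b01101001), which starts with a zero bit and so cannot be a continuation byte. That is what "invalid continuation byte" is complaining about: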

Mitteleurop\xe4ische Zeit

So I'm thinking that the text you have is actually not encoded as UTF-8. In fact, byte 0xe4 is the ä character in ISO-8859-1 (Latin-1) and Windows-1252, so one of those is a likely candidate for the real encoding.
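
To see this for yourself, here is a minimal sketch that decodes the same bytes two ways. The `try_decode` helper is just an illustrative name, and Latin-1 is my assumption about the real encoding, not something the error message confirms:

```python
# The byte string from the error message, before any decoding.
raw = b"Mitteleurop\xe4ische Zeit"


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_utf8 = try_decode(raw, "utf-8")      # fails on byte 0xe4
as_latin1 = try_decode(raw, "latin-1")  # succeeds: 0xe4 is "ä" in ISO-8859-1

print(as_utf8)
print(as_latin1)
```

If the Latin-1 result reads correctly, the string was most likely produced by something using a single-byte Western European encoding rather than UTF-8.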

Now I'm not sure this is actually a database problem (at least with the specific table you're querying). The actual problem appears to be in your input variable which is storing the query:

SELECT    something
    FROM  somewhere
    WHERE some condition
    AND   EXTRACT('month' FROM "djtest_result"."taken_date"
            AT TIME ZONE 'Mitteleurop\xe4ische Zeit') = 10
    ...

So I'd be looking first at whatever piece of code generated that input variable to see if it's the culprit. The timezone may come from the database, or from a config item, or it may be hard-coded. What it isn't is valid UTF-8 encoding.
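
One way to narrow it down, assuming you can get hold of the time zone name as a Python string before it goes into the query: IANA zone names such as Europe/Berlin are plain ASCII, so any non-ASCII character suggests a localized display name has slipped in. `find_non_ascii` below is a hypothetical helper, not part of Django:

```python
def find_non_ascii(text):
    """Return (index, character) pairs for every non-ASCII character in text."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]


# An IANA-style name versus the localized display name from the query.
iana_name = "Europe/Berlin"
localized_name = "Mitteleurop\u00e4ische Zeit"

print(find_non_ascii(iana_name))
print(find_non_ascii(localized_name))
```

An empty list for the name your code is using would point the search somewhere else.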

paxdiablo
  • so `djtest_result.taken_date` is not in utf-8? – doniyor Oct 23 '15 at 06:11
  • @doniyor, that'd be my (relatively educated) guess. – paxdiablo Oct 23 '15 at 06:13
  • but how can a datetimefield not be in utf8, the db itself is in utf8: http://content.screencast.com/users/doniyor/folders/Jing/media/53fe7588-6805-4e2f-ae50-1b77484b25be/2015-10-23_0814.png any ideas? – doniyor Oct 23 '15 at 06:15
  • @doniyor, sorry, misunderstood. It's not the DB content that's badly encoded, it's the `extract ... at time zone` clause. No idea where that timezone is coming from. I'll update answer to clarify. – paxdiablo Oct 23 '15 at 06:19
  • @Jasper, there's an awful lot of text on the net where it seems to be shown *with* the umlaut (and just as much without), but I'll defer to you for now. I've sent a quick email off to some friends in Munich for confirmation :-) – paxdiablo Jul 15 '16 at 13:26

I had to disable USE_TZ, then it worked.
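
For reference, this is roughly what the change looks like in the project's settings module. It is only a sketch of the one setting mentioned here, and the comment reflects my understanding of why it helps:

```python
# settings.py (fragment)
# With USE_TZ disabled, Django works with naive datetimes, so the month
# lookup is no longer wrapped in an AT TIME ZONE conversion.
USE_TZ = False
```

Note that this turns off time zone support for the whole project, so it is more of a workaround than a fix for wherever the bad zone name came from.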

doniyor