I have a class that uploads CSV files of holidays to my FullCalendar. It looks like this:

class UploadVacationsView(APIView):
    def put(self, request, *args, **kwargs):
        try:
            # check file type
            mime = MimeTypes()
            url = urllib.pathname2url(request.FILES['file']._name)
            mime_type = mime.guess_type(url)
            if 'text/csv' not in mime_type:
                raise APIException(code=400, detail='File type must be CSV')
            vacations_list =[]
            csv_file = StringIO(request.data.get('file', None).read().decode('utf-8'))
            user_tz = pytz.timezone(request.user.common_settings.time_zone)
            schedule_file = ScheduleFile.objects.create(user=request.user)
            instance_hebcal = HebcalService()
            events = instance_hebcal.process_csv(csv_file, user_tz)
        ...

In another class, I have a method that processes the CSV file:

class HebcalService(...):
    def process_csv(self, csv_file, user_tz):
        events = []
        csv_input = csv.reader(csv_file.readlines(), dialect=csv.excel)
        curr_row = 1
        start_date = None
        end_date = None
        start_name = None
        holiday_name = ''
        last_event = {'subject': '',
                     'date': '',
                     }

        for row in list(csv_input)[1:]:
            subject, date, time, _, _, _, _ = row[:7]
            curr_row += 1
            row = [unicode(cell.strip(), 'utf-8') for cell in row]

            if 'lighting' in subject and not start_date:
                start_date = user_tz.localize(format_datetime(date, time))
                if date == last_event['date']:
                    start_name = last_event['subject']

Everything works for English holiday names, but when the file contains Hebrew names it raises an error:

Traceback (most recent call last):
  File "/home/stas/work/vacation/vmode/apps/marketplaces/base/api/views.py", line 47, in put
    events = instance_hebcal.process_csv(csv_file, user_tz)
  File "/home/stas/work/vacation/vmode/apps/marketplaces/base/services/hebcal.py", line 106, in process_csv
    for row in list(csv_input)[1:]:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 19-23: ordinal not in range(128)

I've read about converting all strings to unicode, but I don't understand where the default ASCII encoding comes from. How can I handle this and save the holiday_name string from the CSV file?
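
For context, here is a minimal sketch of where the `'ascii'` codec in the traceback comes from. In Python 2, unicode text passed to code that expects byte strings is implicitly encoded with the interpreter's default encoding, which is ASCII. The snippet performs that encoding explicitly, so it behaves the same on Python 2 and 3. `try_ascii` is a hypothetical helper and the holiday names are illustrative samples, not taken from the actual file.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals


def try_ascii(text):
    """Encode unicode text as ASCII; report where it fails, or None on success."""
    try:
        text.encode('ascii')
    except UnicodeEncodeError as exc:
        # Same exception type as in the traceback above.
        return exc.encoding, exc.start, exc.end
    return None


hebrew_result = try_ascii('חנוכה')      # non-ASCII characters: encoding fails
english_result = try_ascii('Chanukah')  # pure ASCII: encoding succeeds
```

English names survive because every character fits in ASCII, which is why the error only shows up with Hebrew rows.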

Ralf
Stanislav
  • Can you check the encoding of your csv file? – schrodingerscatcuriosity Dec 22 '17 at 18:07
  • Check this answer, it may be what you need: https://stackoverflow.com/a/47945317/6005145 – schrodingerscatcuriosity Dec 22 '17 at 18:57
  • Are you still using Python 2.x? Because the `csv` module supports Unicode out of the box in Python 3, but it's a pain to use it with non-English text in Python 2. – lenz Dec 22 '17 at 19:11
  • @guillermo chamorro, I checked with UnicodeDammit and it gave me: [windows-1252], [iso-8859-1], [iso-8859-2]. I thought about using BytesIO, but all my sorting is based on string matching – Stanislav Dec 25 '17 at 09:09
  • @lenz, yes, the work project is on Python 2.7 and I can't change that – Stanislav Dec 25 '17 at 09:11
  • You mean you can do nothing about the Python version? Well, then I suggest (1) you use [unicodecsv](https://pypi.python.org/pypi/unicodecsv) (`pip install unicodecsv`), and (2) you make your code Unicode-aware, i.e. use Unicode strings (`u"..."`) everywhere, otherwise you'll continue to have implicit coercion with potential encoding errors. To ease the pain with the `u"..."` strings, consider using `from __future__ import unicode_literals`. – lenz Dec 25 '17 at 19:30
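
As a standard-library alternative to `unicodecsv`, the sketch below encodes each line to UTF-8 bytes before handing it to `csv.reader` and decodes every cell afterwards. Python 2's `csv` module only handles byte strings, so the unicode lines coming out of the decoded `StringIO` would otherwise be coerced with the ASCII codec. It assumes the upload really is UTF-8, as the view's `.decode('utf-8')` implies; the UnicodeDammit guesses mentioned above suggest that is worth verifying. `read_unicode_rows` is a hypothetical helper, and the sample rows are invented for illustration.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def read_unicode_rows(text_stream, dialect=csv.excel):
    """Yield rows of stripped unicode cells from a stream of unicode lines."""
    if PY2:
        # Python 2: csv.reader wants byte strings, so encode each line
        # explicitly instead of letting it fall back to ASCII.
        encoded_lines = (line.encode('utf-8') for line in text_stream)
        for row in csv.reader(encoded_lines, dialect=dialect):
            yield [cell.decode('utf-8').strip() for cell in row]
    else:
        # Python 3: csv.reader works on text directly.
        for row in csv.reader(text_stream, dialect=dialect):
            yield [cell.strip() for cell in row]


# Invented sample data standing in for the uploaded file.
sample = io.StringIO(
    'Subject,Start Date\n'
    'Candle lighting,12/22/2017\n'
    'חנוכה,12/12/2017\n'
)
rows = list(read_unicode_rows(sample))
header, events = rows[0], rows[1:]
```

With a helper like this, every cell is already unicode by the time it is unpacked, so string comparisons such as `'lighting' in subject` mix only unicode with unicode and no per-row `unicode(cell, 'utf-8')` call is needed.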
