Questions tagged [unidecoder]

`Unidecoder` is a library that provides methods to transliterate Unicode characters to an ASCII approximation.
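
As a rough illustration of what "ASCII approximation" means, here is a minimal Python sketch that uses only the standard library's `unicodedata` module. It is not how Unidecoder or the Python `unidecode` package works (those rely on per-character lookup tables); it only strips combining accents and swaps anything still outside ASCII for a placeholder. The function names `strip_accents` and `to_ascii_approx` are made up for this example.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after NFKD decomposition (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Combining marks have a non-zero canonical combining class.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_ascii_approx(text: str, placeholder: str = "?") -> str:
    """Strip accents, then replace any remaining non-ASCII character."""
    stripped = strip_accents(text)
    return "".join(ch if ord(ch) < 128 else placeholder for ch in stripped)


print(strip_accents("Atlántica Común Guión"))  # Atlantica Comun Guion
print(to_ascii_approx("kožušček"))             # kozuscek
```

Characters with no accent decomposition, such as Cyrillic or Chinese, come out as placeholders with this approach, which is the gap a table-based transliteration library is meant to fill.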

17 questions
8
votes
1 answer

Pandas apply unidecode to several columns

I am trying to convert all the non-ASCII elements of two pandas Series from a pandas DataFrame to ASCII. Simply applying the function to the relevant columns doesn't work. Python only shows an attribute error stating that 'series'…
S.K.
  • 365
  • 1
  • 3
  • 17
3
votes
2 answers

using foreign language in django slug field is not working

This question might get a bit long; I will try to explain pretty much everything that's going on. Below is my Heading model, which fills the slug field itself from whatever the title is: class Heading(models.Model): category =…
Ashish
  • 409
  • 1
  • 9
  • 24
2
votes
1 answer

How to fix an accented string [python]

I'm using a webapp to retrieve data from results of a game I play. As I'm Brazilian and my language has some Latin accented characters, most of the data I retrieve comes in bad shape for use. Like: Carlos Lopez = Carlos Lã³Pez I searched internet…
Ramon Barros
  • 53
  • 1
  • 9
2
votes
1 answer

How to make Django prepopulated_fields work with Chinese?

The Python package unidecode, which works well for decoding Chinese characters, is included in my project. But when I use it in my Django project, prepopulated_fields doesn't work with Chinese. Version Information: django 1.86,Python…
Kami Wan
  • 724
  • 2
  • 9
  • 23
2
votes
1 answer

Unidecode inconsistent behavior when using pyinstaller

I'm building a script that reads information from a website and manipulates it. The page may contain some special characters like ã, ç, ó, etc. In order to simplify decoding issues, I use unidecode, like this: # coding=utf-8 from unidecode import…
Rodrigo López
  • 4,039
  • 1
  • 19
  • 26
2
votes
2 answers

"\x9D" to UTF-8 in conversion from Windows-1252 to UTF-8

I have created a csv uploader on my rails app, but sometimes I get an error of "\x9D" to UTF-8 in conversion from Windows-1252 to UTF-8 This is the source to my uploader: def self.import(file) CSV.foreach(file.path, headers: true, encoding:…
DevanB
  • 377
  • 1
  • 6
  • 16
1
vote
1 answer

read_excel function changes some characters to unicode ones since pandas package upgrade

I have upgraded the pandas package (new version: 1.4.2) and the xlrd package (new version: 2.0.1). Now, when I read an Excel file with the following command: import pandas as pd pd.read_excel('myfile.xlsx') I get the following warning: UserWarning:…
HelloSmacl
  • 11
  • 1
1
vote
1 answer

How to customize unidecode?

I'm using the unidecode module for replacing UTF-8 characters. However, there are some characters, for example Greek letters and some symbols like Å, which I want to preserve. How can I achieve this? For example, from unidecode import unidecode test_str…
meTchaikovsky
  • 7,478
  • 2
  • 15
  • 34
1
vote
1 answer

Remove accents from a python dataframe

I have a dataframe that looks like: words Atlántica Común Guión and I want to remove all accents from each element. What I'm doing is: from unidecode import unidecode unidecode.unidecode(df['words']) as a result I'm obtaining an…
alelew
  • 173
  • 3
  • 13
0
votes
0 answers

How can I change python code to scrape text with accented characters?

I wrote code to scrape articles from a particular website so that I can put the CSV created by this code into Geneea (a text analysis program). The problem is that I wrote this code using unicode, but I then realized I need to scrape the text with…
0
votes
3 answers

Python / Pandas: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcd in position 133: invalid continuation byte

I'm trying to build a method to import multiple types of CSV or Excel files and standardize them. Everything was running smoothly until a certain CSV showed up and gave me this error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcd in…
aabujamra
  • 4,494
  • 13
  • 51
  • 101
0
votes
2 answers

unidecode a text column from postgres in python

I am new to Python and I want to take a column "user_name" from a postgresql database and remove all the accents from the names. Postgres earlier had a function called unaccent but it doesn't seem to work now. So, I resorted to Python. So far I…
Sravee
  • 113
  • 5
  • 14
0
votes
1 answer

Export unidecode database of ascii equivalents of international characters

How to export data from the unidecode Python module for use in another language? This module converts Unicode characters to Latin (ASCII) characters, roughly preserving phonetic meaning like this: kožušček => kozuscek 北亰 -> Bei Jing Москва ->…
Tometzky
  • 22,573
  • 5
  • 59
  • 73
0
votes
1 answer

How to save non ASCII Characters in Mongo DB

This question has been asked before, but I cannot find an answer to the problem in my context. I am trying to save Aéropostale as a string in MongoDB: name='Aéropostale' obj=Mongo_Object() obj.name=name obj.save() When I save the object, I get the following error:…
jugadengg
  • 99
  • 1
  • 3
  • 15
0
votes
1 answer

Run method during CSV upload

I have a simple CSV uploader below that is going row by row and creating a new record (event). I also have the unidecoder gem in use and would like to call the to_ascii method on a field (the description field) that is in every record being created…
DevanB
  • 377
  • 1
  • 6
  • 16