Questions tagged [chardet]

chardet is a Python module for character encoding detection.

See the PyPI project page.

36 questions
107
votes
20 answers

Python (pip) - RequestsDependencyWarning: urllib3 (1.9.1) or chardet (2.3.0) doesn't match a supported version

I found several pages about this issue, but none of them solved my problem. Even if I run pip show, I get: /usr/local/lib/python2.7/dist-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.9.1) or chardet (2.3.0) doesn't match…
NuX_o
  • 1,173
  • 2
  • 7
  • 9
12
votes
2 answers

Encoding detection in Python, use the chardet library or not?

I'm writing an app that takes massive amounts of text as input, which could be in any character encoding, and I want to save it all in UTF-8. I won't receive, or can't trust, the character encoding that comes defined with the data (if any). I…
Niklas9
  • 8,816
  • 8
  • 37
  • 60
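
A minimal sketch of the conversion this question describes, assuming chardet is installed (pip install chardet). The helper name to_utf8 and the latin-1 fallback are illustrative choices, not something taken from the question or its answers. chardet only returns a guess, so the sketch falls back to a lossy decode when the guess is missing or wrong.

```python
import chardet


def to_utf8(raw, fallback="latin-1"):
    """Re-encode bytes of unknown encoding as UTF-8 (hypothetical helper)."""
    # chardet.detect() returns a dict; its "encoding" value can be None
    # when the detector cannot make a guess (for example, empty input).
    guess = chardet.detect(raw)
    encoding = guess["encoding"] or fallback
    try:
        text = raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # The guess was wrong or names a codec Python does not know:
        # decode with the fallback and replace undecodable bytes.
        text = raw.decode(fallback, errors="replace")
    return text.encode("utf-8")


if __name__ == "__main__":
    sample = "Zażółć gęślą jaźń, a longer sample helps the detector.".encode("utf-8")
    print(to_utf8(sample).decode("utf-8"))
```

Detection is statistical, so short or ambiguous inputs can be misidentified; checking the "confidence" value in the detect() result before trusting the guess is one way to decide when to reject an input instead.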
9
votes
3 answers

Encoding error while parsing RSS with lxml

I want to parse a downloaded RSS feed with lxml, but I don't know how to handle the UnicodeDecodeError. request = urllib2.Request('http://wiadomosci.onet.pl/kraj/rss.xml') response = urllib2.urlopen(request) response = response.read() encd =…
domi
  • 189
  • 1
  • 4
  • 9
9
votes
3 answers

namelist() from ZipFile returns strings with an invalid encoding

The problem is that for some archives or files uploaded to the Python application, ZipFile's namelist() returns badly decoded strings. from zip import ZipFile for name in ZipFile('zipfile.zip').namelist(): print('Listing zip files: %s' %…
Croll
  • 3,631
  • 6
  • 30
  • 63
8
votes
1 answer

Cannot uninstall chardet

I've been trying to uninstall chardet using pip, but I get the following error: "Cannot uninstall 'chardet'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial…
pedrobisp
  • 677
  • 1
  • 7
  • 14
7
votes
4 answers

Using Chardet to find encoding of very large file

I'm trying to use Chardet to deduce the encoding of a very large file (>4 million rows) in tab delimited format. At the moment, my script struggles presumably due to the size of the file. I'd like to narrow it down to loading the first x number of…
jbentley
  • 163
  • 4
  • 13
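
One way to approach the large-file case above is chardet's incremental UniversalDetector, which can be fed a file piece by piece and stopped early. The sketch below assumes chardet is installed; the function name detect_file_encoding and the max_lines cap are hypothetical and not taken from the question.

```python
from chardet.universaldetector import UniversalDetector


def detect_file_encoding(path, max_lines=10000):
    """Guess a file's encoding from at most max_lines lines (hypothetical helper)."""
    detector = UniversalDetector()
    # Open in binary mode: the detector works on bytes, not decoded text.
    with open(path, "rb") as handle:
        for line_number, line in enumerate(handle):
            detector.feed(line)
            # detector.done becomes True once the detector is confident;
            # the line cap bounds the work when it never gets there.
            if detector.done or line_number + 1 >= max_lines:
                break
    # close() finalizes the guess; result is a dict with "encoding"
    # and "confidence" keys.
    detector.close()
    return detector.result
```

Sampling only the start of a file is a trade-off: if the first lines are plain ASCII and non-ASCII bytes appear later, the guess may not hold for the whole file.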
6
votes
1 answer

Pandas cannot load data, csv encoding mystery

I am trying to load a dataset into pandas and cannot seem to get past step 1. I am new, so please forgive me if this is obvious; I have searched previous topics and not found an answer. The data is mostly in Chinese characters, which may be the…
a mark
  • 95
  • 1
  • 9
5
votes
2 answers

I use chardet to test encode , but i got error

import chardet a='haha' print(chardet.detect(a)) TypeError: Expected object of type bytes or bytearray, got: < class 'str'> I just typed the code from a tutorial. I really cannot figure out what went wrong.
kovac
  • 307
  • 4
  • 11
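
The TypeError quoted in this question says that chardet.detect() expects bytes or bytearray, while 'haha' is a str on Python 3. A minimal sketch of the usual pattern, assuming chardet is installed; encoding the string to UTF-8 here is only for demonstration, since in practice the bytes would come from a file or a network response.

```python
import chardet

text = "haha"

# detect() works on raw bytes, so a str has to be encoded first.
# (A str is already decoded text, so there is nothing left to detect;
# this only demonstrates the call.)
raw = text.encode("utf-8")

result = chardet.detect(raw)
print(result)  # a dict that includes "encoding" and "confidence" keys
```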
3
votes
1 answer

Trying to guess the encoding of a file using chardet

I'm writing a program that works with CSV files. These files can have a specific encoding. I'm trying to add a procedure that uses chardet to guess the encoding of the file the user wants to open. I'm trying with the following…
3
votes
0 answers

Chardet detects no encoding

I want some data from a website with the following URL: http://www.iex.nl/Ajax/ChartData/interday.ashx?id=360113249&callback=ChartData I think the data is JSON. Going to the URL in my browser, I can read the data. In Python I have the following…
3
votes
1 answer

In Python, how to begin with chardet module?

I would like to try some code that uses the chardet module. This is the code I have found on the web: import urllib2 import chardet def fetch(url): try: result = urllib2.urlopen(url) rawdata = result.read() encoding =…
user2305415
  • 172
  • 1
  • 3
  • 18
2
votes
0 answers

Python import pdfplumber error " ModuleNotFoundError: No module named 'chardet' "

I encountered an error while importing pdfplumber in Python3, indicating module chardet is missing. However, running pip list from the cmd confirms that the package is installed, version 3.0.4. Anyone had similar experience? Any resolution? Error…
LotusStack
  • 21
  • 2
2
votes
1 answer

chardet apparently wrong on Big5

I'm decoding a large (about a gigabyte) flat file database, which mixes character encodings willy-nilly. The Python module chardet is doing a good job, so far, of identifying the encodings, but I've hit a stumbling block... In [428]:…
SingleNegationElimination
  • 151,563
  • 33
  • 264
  • 304
2
votes
1 answer

How to decode unknown encoded string in python, tried chardet?

I don't know the encoding of a string and I want to decode it. I have tried the chardet Python module, but it didn't work. I know the expected output of the string; is there any way I can decode the string using…
Narcos
  • 23
  • 4
2
votes
0 answers

Exe file says - cannot import name chardet

I'm trying to make an exe file using py2exe. The problem is that when I try to run the created exe file, it reports that it cannot import name chardet. Traceback (most recent call last): File "orsr_parser.py", line 10, in File…
Milano
  • 18,048
  • 37
  • 153
  • 353