Questions tagged [windows-1252]

Windows-1252 (also called CP-1252) is a single-byte character encoding of the Latin alphabet. It is the default character encoding used by text editors in English-language versions of Microsoft Windows. It defines 27 characters not present in the related ISO-8859-1 encoding. Microsoft recommends that developers use a Unicode character encoding instead.

The Windows-1252 code page is used by the Windows operating system to display a number of Latin-script languages. This character set matches the ISO 8859-1 (Latin-1) character set, except that it assigns 27 printable characters to bytes in the range 128-159 (0x80-0x9F), which ISO 8859-1 leaves undefined. The remaining five bytes in that range are unassigned in Windows-1252.
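
The difference from Latin-1 described above can be checked with a minimal Python sketch. It assumes Python's built-in `cp1252` and `latin-1` codecs; note that Python's `latin-1` codec maps bytes 128-159 to C1 control characters instead of rejecting them, and that the variable name `defined` is only illustrative.

```python
# Collect every byte in 0x80-0x9F that Python's cp1252 codec can decode.
defined = []
for byte in range(0x80, 0xA0):
    try:
        defined.append(bytes([byte]).decode("cp1252"))
    except UnicodeDecodeError:
        # Unassigned in this codec (0x81, 0x8D, 0x8F, 0x90, 0x9D).
        pass

print(len(defined))  # 27

# The same byte decodes differently under the two encodings.
euro = b"\x80".decode("cp1252")      # U+20AC EURO SIGN
control = b"\x80".decode("latin-1")  # U+0080, an invisible C1 control
print(repr(euro), repr(control))
```

This is why text that is really Windows-1252 but gets read as Latin-1 tends to lose curly quotes, dashes, and the euro sign.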

The languages covered by CP-1252 include English, Spanish, and various Germanic and Scandinavian languages.

180 questions
129 votes, 3 answers

.NET Core doesn't know about Windows 1252, how to fix?

This program works just fine when compiled for .NET 4 but does not when compiled for .NET Core. I understand the error about encoding not supported but not how to fix it. Public Class Program Public Shared Function Main(ByVal args As String())…
Joshua
  • 40,822
  • 8
  • 72
  • 132
48 votes, 13 answers

Windows-1252 to UTF-8 encoding

I've copied certain files from a Windows machine to a Linux machine. All the files encoded with Windows-1252 need to be converted to UTF-8. The files which are already in UTF-8 should not be changed. I'm planning to use the recode utility for…
Sam
  • 483
  • 1
  • 4
  • 4
45 votes, 5 answers

What is the exact difference between Windows-1252 and ISO-8859-1?

We are hosting PHP apps on a Debian-based LAMP installation. Everything is quite OK – performance-, administrative-, and management-wise. However, being somewhat new developers (we're still in high school) we've run into some problems with the…
user2831360
42 votes, 3 answers

Correctly reading text from Windows-1252(cp1252) file in python

so okay, as the title suggests the problem I have is with correctly reading input from a windows-1252 encoded file in python and inserting said input into SQLAlchemy-MySql table. The current system setup: Windows 7 VM with "Roger Access Control…
Krisjanis Zvaigzne
  • 495
  • 1
  • 6
  • 7
29 votes, 2 answers

Windows -1252 is not supported encoding name

I am working with windows 10 universal App and the ARM CPU to create apps for the Raspberry Pi. I get the following error with encoding: Additional information: 'windows-1252' is not a supported encoding name. For information on defining a custom…
Muhand Jumah
  • 291
  • 1
  • 3
  • 3
29 votes, 3 answers

How to read a file in Java with specific character encoding?

I am trying to read a file in as either UTF-8 or Windows-1252 depending on the output of this method: public Charset getCorrectCharsetToApply() { // Returns a Charset for either UTF-8 or Windows-1252. } So far, I have: String fileName =…
IAmYourFaja
  • 55,468
  • 181
  • 466
  • 756
19 votes, 5 answers

Python - dealing with mixed-encoding files

I have a file which is mostly UTF-8, but some Windows-1252 characters have also found their way in. I created a table to map from the Windows-1252 (cp1252) characters to their Unicode counterparts, and would like to use it to fix the mis-encoded…
Keith Hughitt
  • 4,860
  • 5
  • 49
  • 54
13 votes, 2 answers

How to convert Windows-1252 characters to values in php?

We have several database fields that contain Windows-1252 characters: an example pain— if you’re Those values map to the desired values from this list: http://www.i18nqa.com/debug/utf8-debug.html I've tried various permutations of htmlentites,…
dbcn
  • 641
  • 1
  • 9
  • 20
8 votes, 1 answer

How to read a non-UTF8 encoded csv file?

With the csv crate and the latest Rust version 1.31.0, I would want to read CSV files with ANSI (Windows 1252) encoding, as easily as in UTF-8. Things I have tried (with no luck), after reading the whole file in a Vec: CString OsString Indeed,…
Linker Storm
  • 151
  • 1
  • 9
7 votes, 3 answers

Is Windows 1252 a subset of UTF-8 or not?

I just want to know if windows 1252 is a subset of UTF-8 or not? and what are the differences? Thinking of migrating my DB from windows 1252 to UTF-8, any thoughts, opinions?
samg
  • 311
  • 1
  • 8
  • 21
7 votes, 1 answer

Replacing special characters from different encodings in r

I have a corrupted file where Windows-Special Characters have been replaced by their UTF-8 "equivalents". I tried to write a function that is able to replace the special characters based on this table: utf2win <- function(x){ soll <- c("À", "Á",…
Seb
  • 5,417
  • 7
  • 31
  • 50
7 votes, 2 answers

Java can't see file on file system that contains illegal characters

I am experimenting with an edge case we're seeing in production. We have a business model where clients generate text files and then FTP them to our servers. We ingest those files and process them on our Java backend (running on CentOS machines). …
IAmYourFaja
  • 55,468
  • 181
  • 466
  • 756
7 votes, 2 answers

Character Set Special Characters

Is iso-8859-1 a proper subset of utf-8? What about iso-8859-n? What about windows-1252? If the answer is no to any of the above, what are the disjoint characters? I'm testing some logic that detects charsets and want to write tests to verify the…
Sean Jezewski
  • 145
  • 1
  • 7
6 votes, 1 answer

Converting Unicode to Windows-1252 for vCards

I am trying to write a program in C# that will split a vCard (VCF) file with multiple contacts into individual files for each contact. I understand that the vCard needs to be saved as ANSI (1252) for most mobile phones to read them. However, if I…
GPX
  • 3,506
  • 10
  • 52
  • 69
6 votes, 1 answer

Understanding results of PHP's mb_detect_encoding and mb_check_encoding functions

I'm trying to understand the logic of the two functions mb_detect_encoding and mb_check_encoding, but the documentation is poor. Starting with a very simple test string $string = "\x65\x92"; Which is lowercase 'a' followed by a curly quote mark…
Dom
  • 2,980
  • 2
  • 28
  • 41