Questions tagged [shift-jis]

Shift JIS is a character encoding for the Japanese language

108 questions
7
votes
3 answers

decoding shift-jis: "illegal multibyte sequence"

I'm trying to decode a shift-jis encoded string, like this: string.decode('shift-jis').encode('utf-8') to be able to view it in my program. When I come across 2 shift-jis characters, in hex "0x87 0x54" and "0x87 0x55", I get this…
ben
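
The two byte pairs quoted in this excerpt can be reproduced in a few lines. This is a minimal sketch in Python 3 rather than the asker's own code; it assumes the data actually came from a Windows source, in which case the Microsoft superset `cp932` (which adds the NEC special-character row) may be the codec that was meant. The helper name `try_decode` is hypothetical.

```python
# Bytes from the excerpt: 0x87 0x54 and 0x87 0x55.
data = bytes([0x87, 0x54, 0x87, 0x55])


def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid in `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# The strict codec covers JIS X 0208 only and rejects lead byte 0x87 here.
strict = try_decode(data, "shift_jis")

# cp932 maps the same bytes to the Roman numerals U+2160 and U+2161.
lenient = try_decode(data, "cp932")

print(strict, lenient)
```

If `cp932` decodes cleanly where `shift_jis` fails, that is a hint (not proof) that the file was written with the Windows variant.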
7
votes
1 answer

Node.js mikeal/request module - Garbled non-utf8 website (Shift_JIS)

I am trying to access a non utf-8 website using request module. Response is garbled for this request. var request = require('request'); request('http://www.alc.co.jp/', function (error, response, body) { if (!error && response.statusCode == 200)…
7
votes
2 answers

How to get the length of Japanese characters in Javascript?

I have an ASP Classic page with SHIFT_JIS charset. The meta tag under the page's head section is like this: My page has a text box (txtName) that should only allow 200…
mark uy
6
votes
6 answers

Space-saving character encoding for Japanese?

In my opinion a common problem: character encoding in combination with a bitmap-font. Most multi-language encodings have a huge space between different character types and even a lot of unused code points there. So if I want to use them I waste a…
Constantin
6
votes
1 answer

Trying to read Japanese CSV file in Java

I am trying to read a CSV file with Japanese content, which is downloaded and extracted programmatically. Code to read the CSV String splitBy = ","; BufferedReader br;// = new BufferedReader(new FileReader(pathOfExcel + "\\KEN_ALL.CSV ")); …
6
votes
2 answers

C++ ShiftJIS to UTF8 conversion

I need to convert double-byte characters, in my case Shift-JIS, into something easier to handle, preferably with standard C++. The following question ended up without a workaround: Doublebyte encodings on MSVC (std::codecvt): Lead bytes not…
easysaesch
5
votes
1 answer

Are there correct encodings for the backslash and tilde characters in Shift_JIS?

Or do these two characters simply not exist in Shift_JIS? The first 128 characters in the Shift_JIS character encoding scheme match ASCII except for two: 0x5C is a Yen symbol (¥) instead of a backslash, and 0x7E is an overline (‾) instead of a…
kshetline
5
votes
2 answers

Can Notepad++ auto-detect Shift_JIS encoding?

I work a lot with source code that has Japanese comments, so each time I open a source file, I must do "encoding / character set / ShiftJIS". Can we make Notepad++ auto-detect it? I've tried a lot of options in Settings/Preferences... but didn't find anything…
Luke
5
votes
1 answer

Decode Shift-JIS in Android

How can I decode Shift-JIS (convert it to a string) in Android? I tried something like this, but it doesn't work. encode: String test = "some text"; byte[] bytes = test.getBytes("Shift_JIS"); decode: String decoded = new String(bytes, "Shift_JIS"); I…
Omar Abdan
4
votes
3 answers

UTF-8 support in R on Windows

Since the new feature 'Beta: Use Unicode UTF-8 for worldwide language support' was added in Windows 10, I thought it would be possible for R to switch its locale to UTF-8. However, when I try to change the system locale to UTF-8 with Sys.setlocale(locale =…
tragoat
4
votes
3 answers

Pyparsing - parse jascii text from mixed jascii/ascii text file?

I have a text file with mixed jascii/shift-jis and ascii text. I'm using pyparsing and am unable to tokenize such strings. Here is some example code: from pyparsing import * subrange = r"[\0x%x40-\0x%x7e\0x%x80-\0x%xFC]" shiftJisChars =…
a2535009
4
votes
1 answer

How do I use the SHIFT-JIS encoding in Rust?

According to this GitHub issue, the rust-encoding crate is missing SHIFT-JIS support. What's the best way to decode SHIFT-JIS in Rust in light of this?
Fredrick Brennan
4
votes
2 answers

How to detect the character encoding of a file?

Our application receives files from our users, and those files must be validated to confirm they use an encoding that we support (i.e. UTF-8, Shift-JIS, EUC-JP), and once that file is validated, we would also need to save that file in our system…
Franz See
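
One common way to approach the validation described in this excerpt is to attempt a strict decode with each supported encoding in turn. The sketch below is a heuristic in Python, not the asker's method; the function name `guess_encoding` and the candidate order are assumptions. A successful decode only shows the bytes are valid in that encoding, not that the encoding is the one the author intended, so short or ASCII-only input will always match the first candidate.

```python
# Candidate encodings, tried in order. UTF-8 goes first because random
# legacy-encoded Japanese text is rarely valid UTF-8 by accident.
CANDIDATES = ("utf-8", "euc_jp", "shift_jis")


def guess_encoding(raw, candidates=CANDIDATES):
    """Return the first candidate that decodes `raw` without error, else None."""
    for name in candidates:
        try:
            raw.decode(name)  # strict error handling is the default
        except UnicodeDecodeError:
            continue
        return name
    return None


sample = "こんにちは"
for enc in CANDIDATES:
    print(enc, "->", guess_encoding(sample.encode(enc)))
```

For real uploads a statistical detector is usually more reliable than first-match decoding, since some byte sequences are valid in more than one of these encodings.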
4
votes
1 answer

Japanese COBOL code on IBM mainframe in Shift-JIS; represented after transfer to a PC how?

We have a Japanese client that has source code in COBOL on a mainframe. He claims the code on the mainframe is represented in Shift-JIS2 (and we think we understand that pretty well). When that code is transferred to a PC, what is the most…
Ira Baxter
3
votes
3 answers

When I create a file with Java 8 using the Shift-JIS charset, some chars are substituted with '?'

I have a problem when I create a file using the Shift-JIS charset. This is an example of the text that I want to write into a txt file: 繰戻_日経選挙システム保守2019年1月10日~;[2019年度更新]横浜第1DCコロケ―ション(2ラック) Using the Shift-JIS charset, I find two '?' in the file instead…
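
A quick way to find which characters an encoder cannot represent is to try them one at a time. The sketch below uses Python rather than the asker's Java, and assumes the "~" in the sample is U+FF5E FULLWIDTH TILDE; the helper name `unencodable` is hypothetical. It compares the strict `shift_jis` codec with the Windows variant `cp932`.

```python
def unencodable(text, encoding):
    """Return the characters of `text` that `encoding` cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


# Assumed fragment of the sample text, with the tilde written as an escape
# so the code point under test is unambiguous.
sample = "10日\uff5e"

print("shift_jis:", unencodable(sample, "shift_jis"))
print("cp932:    ", unencodable(sample, "cp932"))
```

Characters reported for one codec but not the other are the usual candidates for '?' substitution. In Java the analogous pair of charset names would be `Shift_JIS` and `windows-31j`, though whether switching is appropriate depends on what the consumer of the file expects.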