Questions tagged [non-ascii-characters]

ASCII stands for 'American Standard Code for Information Interchange'. ASCII is a character-encoding scheme based on the ordering of the English alphabet. Since ASCII only contains definitions for 128 characters, numerous other encoding schemes have been created to include characters from other alphabets and other symbols.
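
As a rough illustration of the 128-character limit described above, the following Python sketch treats any character with a code point of 128 or higher as non-ASCII. The helper names `is_ascii` and `non_ascii_chars` are hypothetical, chosen only for this example; they are not taken from any of the questions listed below.

```python
def is_ascii(text):
    """Return True if every character has a code point in the range 0..127."""
    return all(ord(ch) < 128 for ch in text)


def non_ascii_chars(text):
    """Return the characters of text that fall outside the ASCII range, in order."""
    return [ch for ch in text if ord(ch) > 127]


if __name__ == "__main__":
    sample = "\u00c9ric Cantona"  # "Éric Cantona", written with an escape for clarity
    print(is_ascii(sample))
    print(non_ascii_chars(sample))
```

On Python 3.7 and later, the built-in `str.isascii()` method should give the same answer as the hypothetical `is_ascii` helper here.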

1055 questions
210
votes
10 answers

(grep) Regex to match non-ASCII characters?

On Linux, I have a directory with lots of files. Some of them have non-ASCII characters, but they are all valid UTF-8. One program has a bug that prevents it working with non-ASCII filenames, and I have to find out how many are affected. I was going…
Amandasaurus
  • 58,203
  • 71
  • 188
  • 248
178
votes
9 answers

How do I remove all non-ASCII characters with regex and Notepad++?

I searched a lot, but nowhere is it written how to remove non-ASCII characters from Notepad++. I need to know what command to write in find and replace (with picture it would be great). If I want to make a white-list and bookmark all the ASCII…
Texh
  • 1,823
  • 2
  • 12
  • 8
128
votes
20 answers

Replacing accented characters php

I am trying to replace accented characters with the normal replacements. Below is what I am currently doing. $string = "Éric Cantona"; $strict = strtolower($string); echo "After Lower: ".$strict; $patterns[0] = '/[á|â|à|å|ä]/'; …
Lizard
  • 43,732
  • 39
  • 106
  • 167
123
votes
6 answers

Remove non-ascii character in string

var str="INFO] :谷���新道, ひば���ヶ丘2丁���, ひばりヶ���, 東久留米市 (Higashikurume)"; and I need to remove all non-ASCII characters from the string, so that str only contains "INFO] (Higashikurume)";
Dev
  • 3,410
  • 4
  • 17
  • 16
117
votes
1 answer

SyntaxError of Non-ASCII character

I am trying to parse XML which contains some non-ASCII characters; the code looks like below: from lxml import etree from lxml import objectify content = u' Order date :…
OpenCurious
  • 2,916
  • 5
  • 22
  • 25
82
votes
5 answers

Removing non-ASCII characters from data files

I've got a bunch of csv files that I'm reading into R and including in a package/data folder in .rdata format. Unfortunately the non-ASCII characters in the data fail the check. The tools package has two functions to check for non-ASCII characters…
Maiasaura
  • 32,226
  • 27
  • 104
  • 108
80
votes
9 answers

Why is this LSEP symbol showing up on Chrome and not Firefox or Edge?

So this web page is rendering with these symbols and they are found throughout this website/application but on no other sites. Can anyone tell me What this symbol is? Why it is showing up only in one browser?
Joseph
  • 1,047
  • 1
  • 10
  • 24
77
votes
8 answers

Find non-ASCII characters in varchar columns using SQL Server

How can rows with non-ASCII characters be returned using SQL Server? If you can show how to do it for one column would be great. I am doing something like this now, but it is not working select * from Staging.APARMRE1 as ar where ar.Line like…
Gerhard Weiss
  • 9,343
  • 18
  • 65
  • 67
57
votes
6 answers

Using JavaScript to perform text matches with/without accented characters

I am using an AJAX-based lookup for names that a user searches in a text box. I am making the assumption that all names in the database will be transliterated to European alphabets (i.e. no Cyrillic, Japanese, Chinese). However, the names will still…
Philip
  • 3,689
  • 3
  • 24
  • 35
57
votes
1 answer

How to handle Asian characters in file names in Git on OS X

I'm on US-English OS X 10.6.4 and try to store files with Asian characters in its name in a Git repository. OK, let's create such a file in a Git working tree: $ touch どうもありがとうミスターロボット.txt Git is showing it as octal-escaped UTF-8 form: $ git…
Mot
  • 28,248
  • 23
  • 84
  • 121
55
votes
10 answers

How to fetch a non-ascii url with urlopen?

I need to fetch data from a URL with non-ascii characters but urllib2.urlopen refuses to open the resource and raises: UnicodeEncodeError: 'ascii' codec can't encode character u'\u0131' in position 26: ordinal not in range(128) I know the URL is…
onurmatik
  • 5,105
  • 7
  • 42
  • 67
42
votes
1 answer

Removing unicode \u2026 like characters in a string in python2.7

I have a string in python2.7 like this, This is some \u03c0 text that has to be cleaned\u2026! it\u0027s annoying! How do i convert it to this, This is some text that has to be cleaned! its annoying!
40
votes
3 answers

Finding the Values of the Arrow Keys in Python: Why are they triples?

I am trying to find the values that my local system assigns to the arrow keys, specifically in Python. I am using the following script to do this: import sys,tty,termios class _Getch: def __call__(self): fd =…
Newb
  • 2,810
  • 3
  • 21
  • 35
39
votes
6 answers

How do I write non-ASCII characters using echo?

How do I write non-ASCII characters using echo? Is there an escape sequence, such as \012 or something like that? I want to append ASCII characters to a file using: echo ?? >> file
flybywire
  • 261,858
  • 191
  • 397
  • 503
36
votes
2 answers

Why does wprintf transliterate Russian text in Unicode into Latin on Linux?

Why does the following program #include #include int main() { wprintf(L"Привет, мир!"); } print "Privet, mir!" on Linux? Specifically, why does it transliterate Russian text in Unicode into Latin as opposed to transcoding it…
vitaut
  • 49,672
  • 25
  • 199
  • 336