
I'm parsing some files that contain invisible characters. The files are structured strangely so that I sometimes have to find real information after 9 or 10 invisible characters. Yeah...

Anyway, I have some files that seem to contain invisible characters that my regex doesn't yet know about. Is there some way to pass a character through a function to look up its character code? Since it's invisible, I don't really have much else to go on, ha.

Currently I'm using the following regular expression character class to find invisible characters (found in this question):

public $invisibles='\x00-\x09\x0B\x0C\x0E-\x1F\x7F';
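
For context, here is a minimal sketch of how a character-class fragment like this might be applied. The standalone `$invisibles` variable, the sample string, and the use of `preg_replace` are illustrative assumptions, not part of the original class:

```php
<?php
// Character-class fragment from the question: ASCII control characters
// except \n (0x0A) and \r (0x0D), plus DEL (0x7F).
$invisibles = '\x00-\x09\x0B\x0C\x0E-\x1F\x7F';

// Hypothetical sample input with a NUL, a tab, and a BEL hidden in it.
$sample = "ab\x00c\td\x07e";

// Wrap the fragment in [...] to build a full pattern, then strip the matches.
$clean = preg_replace('/[' . $invisibles . ']/', '', $sample);

echo $clean, "\n";
```

Any byte outside those ranges, such as `\xA0`, passes through this pattern untouched, which is why unknown invisible characters can slip by.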
Chris Barr

1 Answer


Those are control characters. But another genuinely invisible character is \xA0, the non-breaking space (a single byte in Latin-1; in UTF-8 it is encoded as the two bytes \xC2\xA0).

Anyway, to find out which one is bugging you, first isolate it (with substr if you can), and then pass it through ord() to get its byte value (the ASCII code, for characters below 128):

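// Find the first non-word character (anything other than a letter, digit, or underscore).
if (preg_match('/\W/', $str, $match)) {
    print dechex(ord($match[0]));   // byte value of that character, in hex
}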

(dechex is for printing it out as hex)
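
If isolating a single character is awkward, a rough alternative is to scan the whole string and report every byte outside printable ASCII. This is only a sketch: the helper name `find_odd_bytes` and the sample string are made up for illustration, and it works byte by byte, so a multibyte UTF-8 character shows up as several entries:

```php
<?php
// Hypothetical helper: list every byte outside printable ASCII (0x20-0x7E),
// keyed by its offset in the string, with its value as two hex digits.
function find_odd_bytes(string $str): array
{
    $found = [];
    $length = strlen($str);
    for ($i = 0; $i < $length; $i++) {
        $code = ord($str[$i]);              // byte value, 0-255
        if ($code < 0x20 || $code > 0x7E) {
            $found[$i] = sprintf('%02X', $code);
        }
    }
    return $found;
}

// Sample input: a NUL, a non-breaking space (0xA0), and a tab.
$sample = "a\x00b\xA0c\td";
foreach (find_odd_bytes($sample) as $offset => $hex) {
    echo "offset $offset: 0x$hex\n";
}
```

Note that this also reports newlines and carriage returns, since they fall below 0x20; filter those out if they are expected in the files.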

Though really, you should just download a hex editor for such purposes.

mario