I am reading and manipulating data from a series of text files. I noticed that my function for trying to break up some of the data by a space doesn't always work.

For instance, my text file contains `MOST_RECENT_RESULT 100`. I have been using

$pos = strrpos($string, ' ');

and echoing the position of the last space to check that it works. This works for most of the files, but for some the position comes back blank, which I found odd. So I copied the string from my browser, ran it directly in a script, and it returned the correct position. That leads me to believe there is some kind of whitespace character I'm missing when reading the files. I have tried `\r`, `\v` and `\n` to no avail. How do I find out exactly which character is in the string, if that is indeed the actual problem?

Jonnny
  • Do you have access to the text file? Are you on a *nix machine? – Jason McCreary Oct 09 '14 at 18:41
  • @JasonMcCreary Yes I have the files locally. I'm just trying to parse a load of them and collate the info. – Jonnny Oct 09 '14 at 18:42
  • hex or octal dump of the file? – Richard Chambers Oct 09 '14 at 18:42
  • If you have the text in a variable, you can always use the `preg_split` function [Using Preg_Split With Multiple Spaces](http://stackoverflow.com/questions/7961599/using-preg-split-with-multiple-spaces) – Crisoforo Gaspar Oct 09 '14 at 18:42
  • What are you actually parsing here? You say "So I copied the string from my browser and directly ran it in a script" -- are you talking about your web browser? – i alarmed alien Oct 09 '14 at 18:43
  • @RichardChambers I have no idea, I'm afraid – Jonnny Oct 09 '14 at 18:43
  • If you are on *nix, run `od -c thefile.txt`. See what the character is. – Jason McCreary Oct 09 '14 at 18:44
  • @Jonnny, my bad. What I meant to say is: what does a hex dump of a sample file show? That should indicate whether you have any kind of special characters. – Richard Chambers Oct 09 '14 at 18:44
  • @ialarmedalien I have just been printing out the contents of the files as read by PHP to make sure I am getting the relevant parts, using things like `explode` to narrow down the text. I took the string as it appeared in my browser and hard-coded it into a script, to see whether I'd missed something or the "space" wasn't actually a space but two spaces or something else – Jonnny Oct 09 '14 at 18:45
  • @JasonMcCreary it's a Windows machine – Jonnny Oct 09 '14 at 18:46
  • More than likely you will need regular expressions to handle a run of whitespace if you want to break the text into substrings that are tokens of non-whitespace characters. – Richard Chambers Oct 09 '14 at 18:47
  • @RichardChambers I ran `bin2hex` and got the string, how would I now find out what the character would be? – Jonnny Oct 09 '14 at 18:51
  • With the hex dump, you look at the individual bytes and compare them against a hex table for the character set you are using. For instance, if it is a standard ANSI character set with each byte a unique character, you would use an ANSI character hex table. Take a look at this write-up on the UTF-8 character set, http://en.wikipedia.org/wiki/UTF-8, which provides a basic idea. – Richard Chambers Oct 09 '14 at 19:02
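
The hex-dump suggestion in the comments above could be sketched in PHP roughly as follows. This is only a minimal sketch: the helper name `describe_bytes` is hypothetical, and the sample string assumes the culprit is a UTF-8 non-breaking space, which the question does not confirm. It lists each byte with its offset and hex value so that anything other than `20` (an ordinary space) stands out.

```php
<?php
// Hypothetical helper (not from the thread): list every byte of a string as
// "offset  hex  char" so that invisible characters become visible.
function describe_bytes(string $s): array
{
    $rows = [];
    $len = strlen($s);
    for ($i = 0; $i < $len; $i++) {
        $byte = ord($s[$i]);
        // Printable ASCII (0x21-0x7E) is shown as-is; everything else as '.'
        $shown = ($byte > 0x20 && $byte < 0x7F) ? $s[$i] : '.';
        $rows[] = sprintf('%3d  %02X  %s', $i, $byte, $shown);
    }
    return $rows;
}

// Assumed sample: the "space" here is really a UTF-8 non-breaking space (C2 A0).
$sample = "RESULT\xC2\xA0100";
echo implode("\n", describe_bytes($sample)), "\n";
```

When reading the output, a few byte values are worth knowing: `09` is a tab, `C2 A0` is a non-breaking space in UTF-8, a lone `A0` is a non-breaking space in Latin-1 or Windows-1252, and `EF BB BF` at the start of a file is a UTF-8 byte order mark.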

2 Answers

This should work. You can adjust the str_replace() array to include the characters as needed to detect the position of the specific type of white space you're trying to detect:

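  // Normalize common whitespace and control characters to a plain space.
  // "\r\n" is listed first so a Windows line ending becomes one space, not two.
  $string = "MOST_RECENT_RESULT 100";
  $string = str_replace(array("\r\n", "\r", "\n", "\v", "\t", "\0"), " ", $string);
  $pos = strrpos($string, ' ');  // offset of the last space, or false if there is none
  echo $pos;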
AnchovyLegend

I think I've solved it, or at least moved it along:

$new = preg_replace('/[\s\W]+/', ' ', $value);
$position = strrpos($new, ' ');

This converts every run of whitespace or non-word characters to a single literal space.
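
A narrower variant of the same idea, offered only as a sketch, is to split on whitespace alone while also counting Unicode separators such as the non-breaking space. The helper name `last_token` is hypothetical, and the sketch assumes the file contents are valid UTF-8; if the files use another encoding (for example Windows-1252), they would need converting first.

```php
<?php
// Hypothetical helper (not from the thread): return the text after the last
// run of whitespace, treating Unicode separators such as U+00A0 as whitespace.
function last_token(string $line): ?string
{
    // \s covers ASCII whitespace; \p{Z} covers Unicode separator characters.
    // The /u modifier requires valid UTF-8; preg_split() returns false otherwise.
    $parts = preg_split('/[\s\p{Z}]+/u', $line, -1, PREG_SPLIT_NO_EMPTY);
    if ($parts === false || $parts === []) {
        return null; // invalid UTF-8 input, or nothing but whitespace
    }
    return end($parts);
}

// Assumed sample: the separator is a UTF-8 non-breaking space (C2 A0).
var_dump(last_token("MOST_RECENT_RESULT\xC2\xA0100"));
```

Unlike `[\s\W]+`, this leaves punctuation inside a value untouched, so something like `12.5` would survive as one token.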

Jonnny