
When I parse a site and print out the plaintext, the output contains a lot of newline and tab characters that I can't remove with str_replace.

For example, if I parse eBay and look for the end time,

$ebayEndTime = $this->html_simple_dom->find( 'span[class=vi-tm-left]', 0 )->plaintext;

In the printout, it looks like this:

\t\t\t\t\t(Mar 19, 2013\n\t\t\t\t\t15:10:11 PDT)\n\t\t\t

Trying to remove it has no effect:

$search = array('\n', '\t', '\r');
error_log("end time:" .  str_replace( $search, " ", $ebayEndTime));

Still results in:

\t\t\t\t\t(Mar 19, 2013\n\t\t\t\t\t15:10:11 PDT)\n\t\t\t

What do I need to do to remove the newlines/tabs? I've even tried this to be thorough:

$search = array('\n', '\t', '\r', '\\n', '\\t', '\\r', '\\\\n', '\\\t', '\\\r', '\\\\n', '\\\\t', '\\\\r');

I know that Java requires the escape character itself to be escaped, but since this prints to the log file, is the '\' being printed using its HTML code?

Jerry Skidmore
  • I did try nl2br(), since it was something I hadn't already tried, but it didn't work either. If I copy and paste the string with the newlines/tabs into the code and run it through the str_replace call, that does work, so it seems the output is being printed using special characters; otherwise, I believe my str_replace call would have fixed it. – Jerry Skidmore Mar 20 '13 at 00:12
  • This answers the question...so why didn't my str_replace do it - did it actually require regular expressions? http://stackoverflow.com/questions/1703320/php-remove-whitespace-from-within-a-string – Jerry Skidmore Mar 20 '13 at 00:25
  • The problem was that single-quoted strings are treated literally. You need double quotes for escape sequences such as \n and \t to be interpreted. – pguardiario Apr 26 '13 at 04:40
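
To illustrate the quoting point from the last comment, here is a minimal sketch. It assumes the scraped value holds real tab and newline characters, so the sample string is copied from the question and written in double quotes; the variable names $literalSearch, $controlSearch, $unchanged and $cleaned are made up for the example.

```php
<?php
// Sample value copied from the question. Double quotes make PHP turn
// \t and \n into real tab and newline characters (assumed to match
// what the scraper actually returned).
$ebayEndTime = "\t\t\t\t\t(Mar 19, 2013\n\t\t\t\t\t15:10:11 PDT)\n\t\t\t";

// Single quotes: each entry is a literal backslash followed by a letter.
$literalSearch = array('\n', '\t', '\r');

// Double quotes: each entry is one control character.
$controlSearch = array("\n", "\t", "\r");

// The single-quoted search strings never occur in the value,
// so str_replace returns it unchanged.
$unchanged = str_replace($literalSearch, ' ', $ebayEndTime);

// The double-quoted search strings match the control characters.
$cleaned = trim(str_replace($controlSearch, ' ', $ebayEndTime));

var_dump($unchanged === $ebayEndTime); // true: nothing was replaced
var_dump(strlen('\n'), strlen("\n"));  // 2 characters vs. 1 character
echo $cleaned, PHP_EOL;
```

Note that str_replace swaps each control character for its own space, so the cleaned string still contains a run of spaces between the date and the time. Collapsing such runs into a single space is what a pattern like /\s+/ with preg_replace is for.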

1 Answer


How about:

$str = "\t\t\t\t\t(Mar 19, 2013\n\t\t\t\t\t15:10:11 PDT)\n\t\t\t";
echo trim(preg_replace('/\s+/', ' ', $str));
#=>(Mar 19, 2013 15:10:11 PDT)
pguardiario