
I hope everyone is in perfect health!

I tried `str_replace` but had no success at all.

I want to remove every `&amp;quot;` from this whole text as efficiently as possible. I don't want to alter the script itself, but these HTML special characters break it, and there are too many of them to fix by hand. How can I replace all of them with an empty string ("")?

#!/usr/bin/perl
# This script is NOT written or modified by me, I only copy pasted it from the internet.
# It was First originally Written by chudy_fernandez@yahoo.com
# & Have been modified by various persons over the net to fix/add various functions.
# Like For Example modified by member of comstuff.net to satisfy common and dynamic content.
# th30nly @comstuff.net a.k.a invisible_theater , Syaifudin JW , Ucok Karnadi and possibly other people too.
# For more info, http://wiki.squid-cache.org/ConfigExamples/DynamicContent/YouTube
# Syed Jahanzaib / aacable@hotmail.com
# http://aacable.wordpress.com/2012/01/19/youtube-caching-with-squid-2-7-using-storeurl-pl/
#######################
# Special thanks to some indonesian friends who provided some updates,
## UPDATED on 20 January, 2014 / Syed Jahanzaib

#####################
#### REFERENCES ##### http://www2.fh-lausitz.de/launic/comp/misc/squid/projekt_youtube/
#####################
#####################

use IO::File;
$|=1;
STDOUT->autoflush(1);
$debug=1; ## recommended:0
$bypassallrules=0; ## recommended:0
$sucks=""; ## unused
$sucks="sucks" if ($debug>=1);
$timenow="";
$printtimenow=1; ## print timenow: 0|1
my $logfile = '/tmp/storeurl.log';
Shadow The GPT Wizard
Umair Shah

2 Answers

1

You could always do something like this:

Find: &(?:(?:amp|gt|lt|quot);)+
Replace: ''
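
As a minimal sketch, the same find-and-replace could be run in PHP with `preg_replace`. The sample string and the variable names `$encoded` and `$cleaned` are assumptions for illustration, not taken from the asker's script.

```php
<?php
// Hypothetical sample: a line whose quotes were double-encoded as &amp;quot;
$encoded = 'my $logfile = &amp;quot;/tmp/storeurl.log&amp;quot;;';

// Remove every run of &amp; / &gt; / &lt; / &quot; entities, including
// chained forms such as &amp;quot; where only the first part carries the "&".
$cleaned = preg_replace('~&(?:(?:amp|gt|lt|quot);)+~', '', $encoded);

echo $cleaned, PHP_EOL; // my $logfile = /tmp/storeurl.log;
```

Note that this deletes the entities outright, as the question asks, so the quotes they stood for are gone from the result as well.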

Edit: for the extra part (removing whitespace inside links), here is a PHP sample:

$strNewSrc = preg_replace_callback('~http://([^;\r\n]+);~',
     function( $matches )
     {
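         // Captured text between "http://" and the terminating ";"
         $tmp = $matches[1];
         // Remove all whitespace inside the captured part
         $tmp = preg_replace('~\s+~', '', $tmp );
         // PHP concatenates strings with ".", not "+"
         $repl = "http://" . $tmp . ";";
         return $repl;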
     },
     $strSrc
 );
  • Great, it works in Sublime Text :D I also want to remove the extra spaces inside links like `http:// . . . / . $y . .`; a regex for that would be great too, please! – Umair Shah Jul 25 '16 at 18:46
  • Yeah, after you run the above, use a general regex replace with a callback. Find: `http://([^;\r\n]+);` Replace with: `"http://" + RemoveWsp($1) + ";"` –  Jul 25 '16 at 18:58
  • Here is a Pastebin with a clearer example: http://pastebin.com/icBRa9t4 – Umair Shah Jul 25 '16 at 19:01
  • My suggestion above would turn `75.$out=http://www.fileserve.com.SQUIDINTERNAL/ . $1 . ./ . $2 . ;` into `75.$out=http://www.fileserve.com.SQUIDINTERNAL/.$1../.$2.;` However, it requires a callback feature, which is probably not available in text editor apps. –  Jul 25 '16 at 19:06
  • After applying the above regex, my script turned into this. It messed everything up: http://pastebin.com/bhwDrNXD – Umair Shah Jul 25 '16 at 19:12
  • Can you provide the correct regex for the replacement? – Umair Shah Jul 25 '16 at 19:12
  • Added a php code sample for the last part. But, a callback is necessary. It could be done without a callback but is fairly difficult. –  Jul 25 '16 at 19:39
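
To make the callback approach discussed in these comments concrete, here is a small sketch that runs it on the sample line quoted above. The input string is copied from that comment; the variable names are only illustrative.

```php
<?php
// Sample line quoted in the comments above.
$strSrc = '75.$out=http://www.fileserve.com.SQUIDINTERNAL/ . $1 . ./ . $2 . ;';

$strNewSrc = preg_replace_callback(
    '~http://([^;\r\n]+);~',
    function (array $matches): string {
        // Strip all whitespace from the part between "http://" and ";".
        return 'http://' . preg_replace('~\s+~', '', $matches[1]) . ';';
    },
    $strSrc
);

echo $strNewSrc, PHP_EOL; // 75.$out=http://www.fileserve.com.SQUIDINTERNAL/.$1../.$2.;
```

Only the text between `http://` and the next `;` on the same line is touched, so whitespace elsewhere in the script is left alone.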
1

The commented steps here could help you achieve that:

<?php

    // GET & DUMP THE CONTENTS OF THE FILE INTO A VARIABLE: $fileContents
    $fileContents   = file_get_contents(__DIR__ . "/_data_1.txt");

    // CLEAN UP THE TEXT (REMOVING: quot; AND/OR amp;, WITH OR WITHOUT A LEADING &)
    $cleanText      = preg_replace("#&?(?:quot|amp);#i", "", $fileContents);

    // RE-SAVE THE FILE
    file_put_contents(__DIR__ . "/_data_2.txt", $cleanText);
    var_dump($cleanText);
Poiz