-3

I have a large HTML file with JavaScript in it. I need to remove every script that contains the text 'uploadclass' between the script tags. In other words, if a script has 'uploadclass' within it, everything between (and including) the <script> and </script> tags should be deleted.

Nothing I've tried, including ([^(<]+)(uploadclass,?.*</script>), gives me exactly what I want. Any suggestions from the regex experts?

  • My suggestion is to use a proper HTML parser for the job. Also, if this is a security measure, it's not in the least bit secure. There are too many ways to bypass it. – code Jun 09 '23 at 00:22
  • Why would you assume anything to do with security? Please explain why you believe an html parser would help. I simply need to have those pieces of script removed from the html file as they are conflicting with other pieces of script which need to be added elsewhere within the file. – user2588110 Jun 09 '23 at 00:39
  • Regex is a bad choice for this sort of thing (HTML parsing). What tool/language are you working with, and what regex patterns have you tried so far? – CAustin Jun 09 '23 at 01:18
  • I only suggested the possibility; I didn't say that you were implementing it as a security measure, although I do believe that whatever you're doing, there's a better way to do it. Regarding regex, please see https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – code Jun 09 '23 at 01:34
  • https://stackoverflow.com/a/1732454/2588110 was very useful. Mainly to indicate...it depends. https://stackoverflow.com/a/1733489/2588110 has some useful arguments where it depends. I believe my situation falls into that category. – user2588110 Jun 12 '23 at 01:18

2 Answers

0

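You can use the following. The (?s) flag lets . match newlines, and the negative lookahead keeps a match from running past one script's closing tag into a later script, which a plain greedy .* would do in a file with several script blocks.

(?s)<script\b[^>]*>(?:(?!</script\s*>).)*?uploadclass.*?</script\s*>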
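
The question doesn't say which tool or language is in use, so here is a minimal sketch in Python, purely as an assumption. Instead of one pattern that has to find 'uploadclass' and the tag boundaries at the same time, it matches each script block on its own and then decides per block whether to drop it. The function name remove_scripts_containing and the sample markup are made up for illustration. It is still regex-based, so markup such as a commented-out script tag could confuse it.

```python
import re

# Matches one complete script element at a time. The lazy .*? stops at the
# first closing tag, so separate script blocks are never merged into one match.
SCRIPT_BLOCK = re.compile(
    r"<script\b[^>]*>.*?</script\s*>",
    re.DOTALL | re.IGNORECASE,
)


def remove_scripts_containing(html: str, needle: str = "uploadclass") -> str:
    """Delete every script element (tags included) whose text contains needle."""

    def drop_if_match(match: re.Match) -> str:
        block = match.group(0)
        # Replace the whole block with nothing if it mentions the needle;
        # otherwise put the block back unchanged.
        return "" if needle in block else block

    return SCRIPT_BLOCK.sub(drop_if_match, html)


# Hypothetical sample input: three script blocks, only the second one
# mentions uploadclass.
sample = """<html><head>
<script>var keep = 1;</script>
<script type="text/javascript">
  var u = new uploadclass();
</script>
</head><body><script>init();</script></body></html>"""

cleaned = remove_scripts_containing(sample)
print(cleaned)
```

Checking each block separately keeps the pattern simple and makes it easy to change the removal condition later, for example to test for several strings.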
Reilas
0

Just in case anyone else runs into this problem, here is what worked for me:

([^()]+)(uploadclass,?.*</script>)