I have to process user-provided markup for a specific kind of embed, which typically takes the form of a <script> tag with a src attribute. Many different <script> embeds can be used here, and no two look alike. However, to avoid potential XSS attacks, we've deemed it necessary to strip out anything inside the tag while leaving the tag itself and its attributes in place.
<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/3.4.1/jquery.min.js">document.write("vinny say something funny"); //This should be sanitized out</script>
DOMDocument doesn't give us an easy way to alter a node's innerHTML. I have seen a few approaches, but none seem to address keeping the attributes intact if the tag is destroyed. Am I missing something in implementing the best approach, or is there an easier way to go about this?
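To make the goal concrete, here is a minimal sketch of the behaviour I'm after, not a vetted solution. It assumes the tag never needs to be destroyed at all: removing the child nodes of each <script> element should leave its attributes untouched. The function name stripScriptBodies and the wrapper <div> are hypothetical choices for illustration, and the sketch does not validate the src attribute or handle non-Latin-1 input encodings.

```php
<?php
// Hypothetical helper: empties every <script> element but keeps its attributes.
function stripScriptBodies(string $html): string
{
    $doc = new DOMDocument();

    // User markup is rarely perfect, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);

    // Wrap the fragment in a single root element (an assumption of this sketch)
    // and ask libxml not to add <html>/<body> or a doctype around it.
    $doc->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Copy the live node list into an array before touching the tree.
    $scripts = iterator_to_array($doc->getElementsByTagName('script'));

    foreach ($scripts as $script) {
        // Drop the inline body; attributes such as src and type are not children,
        // so they stay on the element.
        while ($script->firstChild !== null) {
            $script->removeChild($script->firstChild);
        }
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }

    return $out;
}
```

Called on the example above, this should return the <script> tag with its type and src attributes still present and the document.write(...) body removed.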