I am trying to index the content of my site, and since there is some JavaScript inside the <body></body>, it gets stored along with the content.
It actually gets everything in between the <body></body>, but I use PHP's strip_tags to remove the HTML tags.
It removes the <script> tags, as they are HTML tags, but the JavaScript code that was between them remains.
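To illustrate, here is a minimal sketch that reproduces what I am seeing. The `$body` string is a made-up stand-in for my real page content, not the actual markup:

```php
<?php
// Hypothetical sample markup standing in for the fetched <body> contents.
$body = '<p>Click to add a new note</p> <script>if (window.ytcsi) {ytcsi.tick("js_head");}</script>';

// strip_tags() drops the tags themselves but keeps the text between them,
// so the script source survives as plain text in the result.
$text = strip_tags($body);

echo $text, PHP_EOL;
```

The `<p>` and `<script>` tags are gone from `$text`, but the `if (window.ytcsi) ...` code is still there next to the visible text.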
How can I remove the JavaScript code as well?
Here is an example of the content with the JavaScript code in it:
"Watch Later Added to Private videos will be skipped if viewers don't have access, but playlist notes are publicly visible. Back to list Added to playlist: Private videos will be skipped if viewers don't have access, but playlist notes are publicly visible. Add an optional note150 Add note Saving note... Note added to: Error adding note: Click to add a new note if (window.ytcsi) {ytcsi.tick("js_head");} yt.pubsub.subscribe('init', yt.www.brandedpage.channels4init.overviewTabInit); yt.pubsub.subscribe('dispose', yt.www.brandedpage.channels4init.overviewTabDispose); yt.setAjaxToken('c4_shelves_ajax', "0qjmgZRNi5AAlV5LrkVIKyY1_VZ8MTM2ODkyNTgzM0AxMzY4ODM5NDMz");"
How can I get it so that the result is just:
"Watch Later Added to Private videos will be skipped if viewers don't have access, but playlist notes are publicly visible. Back to list Added to playlist: Private videos will be skipped if viewers don't have access, but playlist notes are publicly visible. Add an optional note150 Add note Saving note... Note added to: Error adding note: Click to add a new note"