What's a good way to remove script tags in body through the API?


MarcC

I have a client who had a previous site where they were just pasting analytics code (two types) into TinyMCE, so now all of their bodytext fields are kind of polluted with that stuff. What's a good way to remove this--assuming it'd be best just to do it using the PW API? These are script tags bracketed by HTML comments. 


Maybe a regex when you run the search/replace with the API.

I'm no regex guru, but I did find this, which may be a start:

/<script.*>.*<script src='.+gaAddons.js' type='.+'><\/script>/s

Or maybe an HTML DOM parser:

http://simplehtmldom.sourceforge.net/

Also, I'm not sure how many pages there are, but in some cases I have used phpMyAdmin with inline editing, and it's pretty fast to get through a lot of content that way.


Find a regexp that matches comments and script tags and remove them with preg_replace() from all the bodytext fields. If you already imported everything to PW, you could do something like:

foreach($mypages as $p){
    $p->of(false); // turn output formatting off before changing and saving the field
    $p->body = preg_replace($pattern_for_script_tags, '', $p->body);
    $p->body = preg_replace($pattern_for_html_comments, '', $p->body);
    $p->save();
}
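One way to try this safely is to pull the cleanup into a small helper and run it on plain strings before touching any pages. This is only a sketch: the function name cleanBody() and the sample strings are made up for illustration, the two patterns are one possible choice, and the ProcessWire calls appear only in a comment.

```php
<?php
// Hypothetical helper: strips <script> blocks and HTML comments from a string.
function cleanBody(string $html): string
{
    $patterns = [
        '{<script[^>]*>.*?</script>}is', // script blocks, lazy match, may span lines
        '{<!--.*?-->}s',                 // HTML comments, may span lines
    ];
    $cleaned = preg_replace($patterns, '', $html);
    // preg_replace() returns null on error; keep the original text in that case.
    return $cleaned ?? $html;
}

// Made-up sample bodies standing in for $p->body values.
$samples = [
    "<p>Intro</p>\n<!-- analytics -->\n<script type=\"text/javascript\">\nvar a = 1;\n</script>\n<!-- /analytics -->\n<p>Outro</p>",
    "<p>Nothing to remove here.</p>",
];

$changed = 0;
foreach ($samples as $body) {
    $clean = cleanBody($body);
    if ($clean !== $body) {
        $changed++;
        // In ProcessWire, this is where the page would be updated, e.g.:
        // $p->of(false); $p->body = $clean; $p->save();
    }
}
echo "$changed of " . count($samples) . " bodies changed\n";
```

Comparing the cleaned text with the original means only pages that actually contain the unwanted markup need to be saved, and counting the changes first gives a rough dry run.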

$body = preg_replace('{<script[^>]*>.*?</script>}is', '', $body);
$body = preg_replace('{<!--.*?-->}is', '', $body); 

The key here is to change the default "greedy" matching to be "lazy" matching using the .* followed by a question mark: .*?

That ensures that it will match only to the closest closing tag rather than the [default] furthest one. That way it won't wipe out legitimate copy. 

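Also, the "s" modifier at the very end makes the dot match newline characters, so a match can span as many lines as needed. Without it, the pattern would only match opening and closing tags on the same line. The "i" modifier makes the match case-insensitive, so it also catches tags written as <SCRIPT>.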
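
To see the difference, here is a small sketch that runs greedy and lazy versions of the script pattern on a made-up string, and then the lazy pattern with and without the "s" modifier. The sample HTML and variable names are invented for the example.

```php
<?php
// Made-up sample: two script blocks with legitimate copy between them.
$html = '<p>Before</p><script>one();</script><p>Legitimate copy</p><script>two();</script><p>After</p>';

// Greedy: .* runs to the LAST </script>, taking the paragraph in between with it.
$greedy = preg_replace('{<script[^>]*>.*</script>}is', '', $html);

// Lazy: .*? stops at the FIRST </script> after each opening tag.
$lazy = preg_replace('{<script[^>]*>.*?</script>}is', '', $html);

// Without "s" the dot does not match newlines, so a multi-line script survives.
$multiline = "<p>Before</p><script>\none();\n</script><p>After</p>";
$withoutS = preg_replace('{<script[^>]*>.*?</script>}i', '', $multiline);
$withS    = preg_replace('{<script[^>]*>.*?</script>}is', '', $multiline);

echo $greedy, "\n";  // <p>Before</p><p>After</p>
echo $lazy, "\n";    // <p>Before</p><p>Legitimate copy</p><p>After</p>
echo ($withoutS === $multiline ? 'unchanged' : 'changed'), "\n"; // unchanged
echo $withS, "\n";   // <p>Before</p><p>After</p>
```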

