• I am trying to use the following code to strip the HTML tags out of a post, so that I can offer a text-only version of the post via a link.

    Basically, the script expects $document to contain an HTML document. However, being PHP-illiterate, I have no idea how to call the_content(__('(more…)')); effectively. When I simply insert $document = the_content(__('(more…)')); I get the original, non-stripped HTML. I am trying to use this code in a page that also has code to display the original HTML (no problem there), so I don't just want to pass the resulting text into its own file.

    <?php
    // $document should contain an HTML document.
    // This will remove HTML tags, javascript sections
    // and white space. It will also convert some
    // common HTML entities to their text equivalent.

    $search = array ('@<script[^>]*?>.*?</script>@si', // Strip out javascript
    '@<[\/\!]*?[^<>]*?>@si', // Strip out HTML tags
    '@([\r\n])[\s]+@', // Strip out white space
    '@&(quot|#34);@i', // Replace HTML entities
    '@&(amp|#38);@i',
    '@&(lt|#60);@i',
    '@&(gt|#62);@i',
    '@&(nbsp|#160);@i',
    '@&(iexcl|#161);@i',
    '@&(cent|#162);@i',
    '@&(pound|#163);@i',
    '@&(copy|#169);@i');

    $replace = array ('',
    '',
    '\1',
    '"',
    '&',
    '<',
    '>',
    ' ',
    chr(161),
    chr(162),
    chr(163),
    chr(169));

    $text = preg_replace($search, $replace, $document);

    // Convert remaining numeric entities. This replaces the original
    // '@&#(\d+);@e' pattern, since the /e modifier was removed in PHP 7.
    $text = preg_replace_callback('@&#(\d+);@', function ($m) {
        return chr((int) $m[1]);
    }, $text);

    ?>

Viewing 6 replies - 1 through 6 (of 6 total)
  • the_content_rss();

    get_the_content();

    Ah, I should have read the whole post. Mine echoes out the content stripped of HTML. get_the_content() doesn’t echo, so strip_tags(get_the_content()) will strip the HTML from the content.
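
A minimal sketch of that suggestion, assuming it runs inside the WordPress loop. the_content() prints the post and returns nothing, which is why assigning it to $document had no effect, while get_the_content() returns the post body as a string. The fallback sample string is hypothetical and is only there so the snippet also runs outside WordPress.

```php
<?php
// Inside the WordPress loop, get_the_content() returns the post body as a string.
// Outside WordPress, fall back to a hypothetical sample so the sketch still runs.
$document = function_exists('get_the_content')
    ? get_the_content()
    : '<p>Hello <strong>world</strong></p>';

// strip_tags() removes the HTML tags and keeps the text between them.
$text = strip_tags($document);

echo $text;
```

Because the text ends up in $text rather than being echoed straight away, the same template can still call the_content() elsewhere to show the original HTML.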

    Thread Starter joelwalsh

    (@joelwalsh)

    >>> strip_tags(get_the_content()) will strip html from the content.

    Will it also strip scripts and convert HTML entities into the corresponding ASCII characters?

    Thanks.
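
On the entity half of that question: strip_tags() on its own leaves entities such as &amp;amp; as they are. One possible approach, sketched here on a hypothetical sample string and assuming the page is served as UTF-8, is to pass the stripped text through PHP's html_entity_decode().

```php
<?php
// Sample markup standing in for the post content (hypothetical).
$html = '<p>Fish &amp; chips, &quot;fresh&quot; daily</p>';

// strip_tags() removes the tags but leaves entities such as &amp; untouched.
$stripped = strip_tags($html);

// html_entity_decode() converts named and numeric entities to characters.
// ENT_QUOTES also decodes quote entities; UTF-8 is an assumed charset.
$text = html_entity_decode($stripped, ENT_QUOTES, 'UTF-8');

echo $text;
```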

    Thread Starter joelwalsh

    (@joelwalsh)

    OK, so the javascript is still not being stripped, just the <script> tag, even though that code says it removes javascript.

    The script that won’t strip is the “sticky post” script:

    <script type="text/javascript">window.document.getElementById('post-7').parentNode.className += ' adhesive_post';</script>

    It comes out like this:

    window.document.getElementById('post-7').parentNode.className += ' adhesive_post'; [1]
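
That output is what strip_tags() would be expected to produce, assuming it was used by itself: it removes the &lt;script&gt; and &lt;/script&gt; tags but keeps the text between them. A possible workaround, sketched on a hypothetical sample string, is to delete whole script blocks first with the same pattern as in the original code, and only then strip the remaining tags.

```php
<?php
// Sample markup standing in for the post content (hypothetical).
$html = '<p>Sticky post</p>'
      . '<script type="text/javascript">'
      . "window.document.getElementById('post-7').parentNode.className += ' adhesive_post';"
      . '</script>';

// strip_tags() alone drops the tags but keeps the JavaScript source as text.
$tagsOnly = strip_tags($html);

// Remove each <script>...</script> block, contents included, before stripping tags.
$noScripts = preg_replace('@<script[^>]*?>.*?</script>@si', '', $html);
$text = strip_tags($noScripts);

echo $text;
```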

    Thread Starter joelwalsh

    (@joelwalsh)

    Another thought: will this lingering script invalidate my RSS feeds?

  • The topic ‘Quick php coding question–how to call a wordpress tag to be used as a variable’ is closed to new replies.