WireTextTools::markupToText() method

Convert HTML markup to readable text

Like PHP's strip_tags(), but with small improvements to the HTML-to-text conversion that make the resulting text more readable.

As of 3.0.197, the inner content of script, style and object tags is removed along with the tags themselves, rather than just the tags. To revert this behavior, or to clear the content of additional tags, see the clearTags option.
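For comparison, here is what plain strip_tags() does with a script element. This is a minimal sketch in plain PHP that does not call ProcessWire at all; the sample markup is made up for illustration.

```php
<?php
// Sample markup (hypothetical): a paragraph followed by an inline script.
$html = '<p>Hello</p><script>alert(1);</script>';

// strip_tags() removes the tags only, so the script body stays in the text.
$plain = strip_tags($html);

echo $plain; // Helloalert(1);
```

Per the note above, markupToText() in 3.0.197+ clears that script body by default. Passing a different clearTags array is the documented way to change which tags are treated this way; that an empty array restores the older behavior is an assumption, not something this page states.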

This method is newer than the Sanitizer::markupToText() method, more powerful, and has more options. The two methods also perform markup-to-text conversion differently, so you may want to try both to determine which one better suits your needs.
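One way to try both methods side by side is a small helper like the sketch below. The function name compareMarkupToText() is hypothetical. It assumes $sanitizer is the ProcessWire Sanitizer API variable and that $sanitizer->getTextTools() returns the WireTextTools instance; check both against your ProcessWire version.

```php
<?php
// Hypothetical helper: runs the same markup through both converters.
// $sanitizer is assumed to be ProcessWire's Sanitizer instance.
function compareMarkupToText(object $sanitizer, string $html): array {
    return [
        // Older converter on the Sanitizer itself
        'sanitizer' => $sanitizer->markupToText($html),
        // Newer converter on WireTextTools (assumed accessor: getTextTools())
        'textTools' => $sanitizer->getTextTools()->markupToText($html),
    ];
}
```

In a template file this could be called as compareMarkupToText($sanitizer, $page->body), where $page->body is an assumed field name, and the two results printed for review.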

Usage

// basic usage ($str is a string)
$string = $wireTextTools->markupToText($str);

// usage with all arguments ($options is an array, default [])
$string = $wireTextTools->markupToText($str, $options);

Arguments

Name | Type(s) | Description

str | string | String to convert to text

options (optional) | array
  • keepTags (array): Tag names to keep in returned value, e.g. [ "em", "strong" ]. (default=none)
  • clearTags (array): Tags that should also have their content cleared. (default=[ "script", "style", "object" ]) Since 3.0.197
  • splitBlocks (string): String to split paragraph and header elements. (default="\n\n")
  • convertEntities (bool): Convert HTML entities to plain text equivalents? (default=true)
  • listItemPrefix (string): Prefix for converted list item <li> elements. (default='• ')
  • linksToUrls (bool): Convert links to (url) rather than removing? (default=true) Since 3.0.132
  • linksToMarkdown (bool): Convert links to [text](url) rather than removing? (default=false) Since 3.0.197
  • uppercaseHeadlines (bool): Convert headline tags to uppercase? (default=false) Since 3.0.132
  • underlineHeadlines (bool): Underline headlines with "=" or "-"? (default=true) Since 3.0.132
  • collapseSpaces (bool): Collapse redundant spaces to a single space? (default=true) Since 3.0.132
  • replacements (array): Associative array of strings to manually replace. (default=['&nbsp;' => ' '])
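The options above can be combined in a single call. The following is a sketch only: htmlToEmailText() is a hypothetical helper name, and $textTools is assumed to be a WireTextTools instance (for example from $sanitizer->getTextTools() in a template file). The option names and types are taken from the list above; the values chosen are just one possible configuration.

```php
<?php
// Hypothetical helper: converts markup to text for a plain-text email body.
// $textTools is assumed to be a WireTextTools instance.
function htmlToEmailText(object $textTools, string $html): string {
    return $textTools->markupToText($html, [
        'keepTags' => ['strong'],       // leave <strong> tags in the result
        'linksToMarkdown' => true,      // render links as [text](url)
        'uppercaseHeadlines' => true,   // convert headline text to uppercase
        'underlineHeadlines' => false,  // skip the "=" / "-" underlines
        'listItemPrefix' => '- ',       // use "- " rather than the default bullet
    ]);
}
```

Options left out of the array keep the defaults listed above.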

Return value

string

See Also


WireTextTools methods and properties

API reference based on ProcessWire core version 3.0.200
