WireTextTools::markupToText() method

Convert HTML markup to readable text

Like PHP’s strip_tags(), but with a few improvements to HTML-to-text conversion that make the resulting text more readable.

In 3.0.197+, the inner content of script, style and object tags is removed, rather than just the tags themselves. To revert this behavior, or to clear the content of additional tags, see the clearTags option.
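For comparison, the short sketch below uses only PHP’s built-in strip_tags() (no ProcessWire required) to show the behavior that content clearing is meant to avoid: the tags are removed, but the text inside the script tag stays in the result. The sample markup is made up for illustration.

```php
<?php
// Plain PHP, no ProcessWire needed: what strip_tags() does on its own.
$html = '<p>Hello <strong>world</strong></p><script>alert("hi");</script>';

$stripped = strip_tags($html);

// The tags are gone, but the script's inner text is still present.
echo $stripped, PHP_EOL; // Hello worldalert("hi");
```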

This method is newer and more powerful than Sanitizer::markupToText(), and it has more options. The two methods also differ in how they perform markup-to-text conversion, so you may want to try both to determine which one better suits your needs.
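One way to compare the two methods is to run the same markup through both and look at the results side by side. The following is a minimal sketch, assuming it runs inside a ProcessWire template file where the $sanitizer API variable is available, and that the WireTextTools instance is obtained through $sanitizer->getTextTools(). The sample markup is hypothetical.

```php
<?php namespace ProcessWire;

// Assumption: this is a ProcessWire template file, so $sanitizer exists.
$html = '<h2>News</h2><ul><li>First item</li><li>Second item</li></ul>';

// Older conversion, on the Sanitizer class
$fromSanitizer = $sanitizer->markupToText($html);

// Newer conversion, on WireTextTools
$fromTextTools = $sanitizer->getTextTools()->markupToText($html);

// Output both inside <pre> so that line breaks remain visible
echo '<pre>' . $sanitizer->entities($fromSanitizer) . '</pre>';
echo '<pre>' . $sanitizer->entities($fromTextTools) . '</pre>';
```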

Usage

// basic usage
$string = $wireTextTools->markupToText(string $str);

// usage with all arguments
$string = $wireTextTools->markupToText(string $str, array $options = []);

Arguments

Name | Type(s) | Description
str | string | String to convert to text
options (optional) | array | Any of the following options:
  • keepTags (array): Tag names to keep in returned value, e.g. [ "em", "strong" ]. (default=none)
  • clearTags (array): Tags that should also have their content cleared. (default=[ "script", "style", "object" ]) Since 3.0.197
  • splitBlocks (string): String to split paragraph and header elements. (default="\n\n")
  • convertEntities (bool): Convert HTML entities to plain text equivalents? (default=true)
  • listItemPrefix (string): Prefix for converted list item <li> elements. (default='• ')
  • linksToUrls (bool): Convert links to (url) rather than removing? (default=true) Since 3.0.132
  • linksToMarkdown (bool): Convert links to [text](url) rather than removing? (default=false) Since 3.0.197
  • uppercaseHeadlines (bool): Convert headline tags to uppercase? (default=false) Since 3.0.132
  • underlineHeadlines (bool): Underline headlines with "=" or "-"? (default=true) Since 3.0.132
  • collapseSpaces (bool): Collapse redundant spaces to a single space? (default=true) Since 3.0.132
  • replacements (array): Associative array of strings to manually replace. (default=['&nbsp;' => ' '])
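The options above are passed as an associative array in the second argument. The following is a minimal sketch, assuming a ProcessWire template file where $sanitizer is available. The sample markup and the chosen option values are illustrative only, and any option left out keeps its documented default.

```php
<?php namespace ProcessWire;

// Assumption: this is a ProcessWire template file, so $sanitizer exists.
$textTools = $sanitizer->getTextTools();

$html =
    '<h1>Release notes</h1>' .
    '<p>This version is <strong>faster</strong>.</p>' .
    '<ul><li>One</li><li>Two</li></ul>';

$text = $textTools->markupToText($html, [
    'keepTags' => ['em', 'strong'],  // leave these tags in the returned value
    'uppercaseHeadlines' => true,    // convert headline text to uppercase
    'underlineHeadlines' => false,   // no "=" or "-" underline beneath headlines
    'listItemPrefix' => '- ',        // use "- " instead of the default bullet
]);

// Entity-encode the result so the kept tags are shown literally
echo '<pre>' . $sanitizer->entities($text) . '</pre>';
```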

Return value

string

See Also


WireTextTools methods and properties

API reference based on ProcessWire core version 3.0.236
