
[solved] Regex help needed


Robin S

My knowledge of regular expressions is pretty weak - one of the things I need to improve on. If anyone can offer any help for the following scenario it would be much appreciated.

I need to import a heap of content that contains Smarty tags for encoded email addresses. When I import the content I want to convert these tags to regular mailto links. An example tag:

{mailto address="someone@domain.com" encode="javascript" text="link text" subject="The subject line"}

And another:

{mailto address="someone@domain.com" encode="javascript"}

I have got some way with this but am falling down when it comes to optional capturing groups.

The requirements for my regex are:

  • The regex must match {mailto all the way through to } (because I want to ultimately remove the whole tag)
  • The address parameter must exist, and I want to capture its contents: someone@domain.com
  • The text parameter may exist, and I want to capture its contents if it does exist: link text
  • The subject parameter may exist, and I want to capture its contents if it does exist: The subject line

Thanks in advance!
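For reference, the requirements listed above can be sketched as a single pattern, using one required lookahead for address and optional lookaheads for text and subject. This is only a sketch: the function name `parseMailtoTag` is hypothetical, and it assumes attribute values never contain `}` or `"` and that each attribute name appears at most once per tag.

```php
<?php
// Hypothetical helper: match one whole {mailto ...} tag and pull out
// address (required), text and subject (both optional), in any order.
function parseMailtoTag(string $input): ?array {
    $pattern = '/\{mailto\b'
        . '(?=[^}]*\baddress="([^"]*)")'      // required: address
        . '(?:(?=[^}]*\btext="([^"]*)"))?'    // optional: text
        . '(?:(?=[^}]*\bsubject="([^"]*)"))?' // optional: subject
        . '[^}]*\}/';                         // consume the rest of the tag
    if (!preg_match($pattern, $input, $m)) {
        return null; // no {mailto} tag with an address parameter
    }
    return [
        'tag'     => $m[0],
        'address' => $m[1],
        'text'    => $m[2] ?? '', // trailing unmatched groups are not set
        'subject' => $m[3] ?? '',
    ];
}

print_r(parseMailtoTag('{mailto address="someone@domain.com" encode="javascript" text="link text" subject="The subject line"}'));
```

Because the lookaheads do not consume any input, the order of the parameters inside the tag does not matter, and `$m[0]` is always the whole tag.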


I started writing this so that it outputs an anchor tag (<a href="mailto:">), but then I realised I was guessing how "link text" and "The subject line" should be used. So instead I've just written a function that splits the attributes up into an array. For convenience each match also has named keys, e.g. `$matches[0]["attr"]` and `$matches[0]["value"]`, so you can foreach over the result and just use `$match["attr"]` and `$match["value"]`.

function mailtoArray($input) {
	// One match set per attr="value" pair; [^"]* also accepts empty values.
	preg_match_all('/(?<attr>\w+)="(?<value>[^"]*)"/', $input, $matches, PREG_SET_ORDER);
	return $matches;
}
$matches = mailtoArray('{mailto address="someone@domain.com" encode="javascript" text="link text" subject="The subject line"}');
print_r($matches);
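As a rough illustration of the foreach mentioned above, the match sets can be collapsed into a plain name => value array. The function name `mailtoAttributes` is made up for this sketch, and the pattern is repeated inside it only so the snippet runs on its own.

```php
<?php
// Hypothetical helper: turn the PREG_SET_ORDER match sets into
// an associative array keyed by attribute name.
function mailtoAttributes(string $tag): array {
    preg_match_all('/(?<attr>\w+)="(?<value>[^"]*)"/', $tag, $matches, PREG_SET_ORDER);
    $attrs = [];
    foreach ($matches as $match) {
        $attrs[$match['attr']] = $match['value'];
    }
    return $attrs;
}

$attrs = mailtoAttributes('{mailto address="someone@domain.com" encode="javascript" text="link text"}');
echo $attrs['address'], "\n";       // the required parameter
echo $attrs['subject'] ?? '', "\n"; // optional parameters may be missing
```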

 


That's great, thanks!

Just a couple more things and I'll have it sorted:

1. I need the match to be specific to the {mailto...} tag. That regex would currently also match {foo animal="cat"}

2. I need one of the matches to be the entire tag, so I can replace the tag with my reconstructed mailto link.

Any ideas how it could be modified for those two objectives?


$string = '{mailto address="someone@domain.com" encode="javascript" text="link text" subject="The subject line"}';
function mailtoArray($input) {
	// One match set per attr="value" pair; [^"]* also accepts empty values.
	preg_match_all('/(?<attr>\w+)="(?<value>[^"]*)"/', $input, $matches, PREG_SET_ORDER);
	return $matches;
}
// Only parse the attributes when the string contains a {mailto tag.
if (strpos($string, '{mailto') !== false) {
  $matches = mailtoArray($string);
  print_r($matches);
}

I would do it that way. In each match set, [0] is the full text the pattern matched (the whole attr="value" pair), [1] or ['attr'] is capture group 1 and [2] or ['value'] is capture group 2. The entire tag is still in $string, so once you have formatted the anchor tag from the captured values you can do a string replace on it.


If I understand the question correctly, you might be looking for a regex capture collection, but I think support for that is really rare and am pretty sure PHP/PCRE does not have such a thing. So I think you'd have to use a couple of regular expressions here: one to find all the {mailto...} tags, and another to isolate their attributes into independent arrays. That's because presumably the attributes can appear in any order and some may or may not be present. Here's how you might do it:

function findSmartyMailtoTags($markup) {
  $tags = [];
  if(!preg_match_all('/\{mailto(\s+.*?\baddress=".*?")\}/', $markup, $a)) return array();
  foreach($a[0] as $key => $tag) {
    $attrs = [ 'address' => '', 'text' => '', 'subject' => '' ];
    if(!preg_match_all('/\s+(?<name>\w+)="(?<value>[^"]*)"/', $a[1][$key], $b, PREG_SET_ORDER)) continue;
    foreach($b as $attr) {
      $attrs[$attr['name']] = $attr['value'];
    }
    $tags[] = [ 'tag' => $tag, 'attrs' => $attrs ];
  }
  return $tags;
}

If you wanted to do it without a regular expression, you could do this:

function findSmartyMailtoTags($markup) {
  $tags = [];
  foreach(explode('{mailto ', $markup) as $cnt => $line) {
    if(!$cnt || false === ($pos = strpos($line, '"}'))) continue;
    $mailto = $line = substr($line, 0, $pos+1);
    $attrs = [ 'address' => '', 'text' => '', 'subject' => '' ];
    while(strlen(trim($line))) {
      list($name, $line) = explode('="', $line, 2);
      list($value, $line) = explode('"', $line, 2);
      $attrs[trim($name)] = trim($value);
    }
    if(!empty($attrs['address'])) {
      $tags[] = [ 'tag' => "{mailto $mailto}", 'attrs' => $attrs ];
    }
  }
  return $tags;
}

Both versions return exactly the same thing. For markup containing the two tags shown below, the print_r result is:

Array
(
  [0] => Array
    (
      [tag] => {mailto address="someone@domain.com" encode="javascript" text="link text" subject="The subject line"}
      [attrs] => Array
        (
          [address] => someone@domain.com
          [text] => link text
          [subject] => The subject line
          [encode] => javascript
        )
    )
  [1] => Array
    (
      [tag] => {mailto address="hey@hello.com" text="Hi" subject="Welcome friend"}
      [attrs] => Array
        (
          [address] => hey@hello.com
          [text] => Hi
          [subject] => Welcome friend
        )
    )
)

 


My final code for {mailto} tag replacement:

$tags = findSmartyMailtoTags($markup); // Ryan's regex function as shown above

foreach($tags as $tag) {
    $href= 'mailto:' . $tag['attrs']['address'];
    if($tag['attrs']['subject']) $href .= '?subject=' . rawurlencode($tag['attrs']['subject']);
    $link_text = $tag['attrs']['text'] ?: $tag['attrs']['address'];
    $replacement = "<a href='$href'>$link_text</a>";
    $markup = str_replace($tag['tag'], $replacement, $markup);
}
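An alternative sketch, not the code used above, does the find and replace in one pass with preg_replace_callback and escapes the values with htmlspecialchars before they go into the HTML. The function name `replaceMailtoTags` is hypothetical, and it assumes the attribute values are plain text (not already HTML-escaped) and contain no `}` or `"`.

```php
<?php
// Hypothetical variant: replace every {mailto ...} tag in one pass.
function replaceMailtoTags(string $markup): string {
    $result = preg_replace_callback('/\{mailto\b([^}]*)\}/', function (array $m): string {
        // Collect the attr="value" pairs found inside this tag.
        preg_match_all('/(\w+)="([^"]*)"/', $m[1], $pairs, PREG_SET_ORDER);
        $attrs = ['address' => '', 'text' => '', 'subject' => ''];
        foreach ($pairs as $pair) {
            $attrs[$pair[1]] = $pair[2];
        }
        if ($attrs['address'] === '') {
            return $m[0]; // no address: leave the tag untouched
        }
        $href = 'mailto:' . $attrs['address'];
        if ($attrs['subject'] !== '') {
            $href .= '?subject=' . rawurlencode($attrs['subject']);
        }
        $text = $attrs['text'] !== '' ? $attrs['text'] : $attrs['address'];
        return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '">'
            . htmlspecialchars($text, ENT_QUOTES) . '</a>';
    }, $markup);
    return $result ?? $markup; // preg_replace_callback returns null on error
}

echo replaceMailtoTags('Mail {mailto address="someone@domain.com" text="us" subject="Hi there"} today');
```

Tags that are not {mailto}, such as {foo animal="cat"}, are not matched by the outer pattern and stay as they are.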

 

