
preg_match input string issue


I am doing some very simple string parsing on a URL taken from a Textarea Inputfield. It all works fine if I define the link as a string literal, e.g. $str = 'some url', but if the value is taken from the CMS, it doesn't work...


$str = $link->textarea_short; // this doesn't work 
// Expected Output: '<p>https://www.mixcloud.com/some-radio/</p>'

// $str = '<p>https://www.mixcloud.com/some-radio/</p>'; // this works

preg_match('/<p>https*:\/\/www\.mixcloud\.com(.*)<\/p>/', $str, $matches, PREG_OFFSET_CAPTURE);
$str_url = $matches[1][0];
$str_url = str_replace('/', '%2F', $str_url);


This is the algorithm that I am working with:
1. Getting a url
2. Parsing it to extract what I need
3. Replacing some characters.

In both cases, echoing the value shows the expected string, but preg_match() only matches the hard-coded one.

What am I doing wrong?


Have you checked if $link->textarea_short actually contains the HTML you assume? If the structure is slightly different, the regular expression will not match anymore. Besides that, there are some issues with your expression:

  • https* – If you want to match either http or https, use a question mark (https?). The asterisk also matches http, httpss, httpssssssss...
  • (.*) – This matches any number of any characters (other than newlines, unless the s flag is set), including the paragraph end tag (</p>). Depending on your content, this will yield unexpected results:
    • If your textarea includes multiple paragraphs, the regular expression will match everything from the start of the link to the last closing paragraph tag.
    • You can rectify that problem using the U flag (PCRE_UNGREEDY) – quantifiers are greedy by default, and this flag makes them ungreedy. But it may still cause excessive backtracking.
    • A better solution would be to use [^<]*, which matches any characters except the less-than sign (<), so it can't "skip" the closing tag. It will still capture any additional content between the link and the closing tag though.
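
The two fixes above could be combined roughly as follows. This is only a sketch, assuming the field value is a single <p> wrapping a Mixcloud URL; the # delimiter and the sample $str are illustrative choices, not part of the original code.

```php
<?php
// Sample input; in the real template this would come from the CMS field.
$str = '<p>https://www.mixcloud.com/some-radio/</p>';

// https? matches http or https only; [^<]* stops at the first "<",
// so the capture cannot run past the closing </p>.
// Using # as the delimiter avoids escaping every forward slash.
$pattern = '#<p>https?://www\.mixcloud\.com([^<]*)</p>#';

if (preg_match($pattern, $str, $matches) === 1) {
    $path = $matches[1]; // "/some-radio/"
} else {
    $path = null; // no match: inspect the raw value before going further
}
```

Without PREG_OFFSET_CAPTURE, $matches[1] is the captured string itself rather than a [string, offset] pair, which is all that is needed here.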

What are you trying to do in the first place? Looks like some sort of URL encoding, but why match the <p> tags as well? It would probably be easier to use preg_replace_callback with urlencode to find any links and encode them ...
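
A rough sketch of the callback idea, assuming the goal is to replace each Mixcloud URL in a block of markup with its percent-encoded path. The $html sample and the exact pattern are made up for illustration, and rawurlencode is used here in place of urlencode.

```php
<?php
// Hypothetical input with two Mixcloud links in separate paragraphs.
$html = '<p>https://www.mixcloud.com/some-radio/</p>'
      . '<p>http://www.mixcloud.com/other/show/</p>';

// Replace each Mixcloud URL with its percent-encoded path.
// [^<\s]* stops at whitespace or at the next tag.
$encoded = preg_replace_callback(
    '#https?://www\.mixcloud\.com(/[^<\s]*)#',
    function (array $m): string {
        // rawurlencode() turns "/" into "%2F" and escapes other reserved characters.
        return rawurlencode($m[1]);
    },
    $html
);
// $encoded is '<p>%2Fsome-radio%2F</p><p>%2Fother%2Fshow%2F</p>'
```

Because the pattern no longer depends on the surrounding <p> tags, it keeps working if the markup around the link changes.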


Thanks @MoritzLost!
I will fix the issues with my regex now. The $link->textarea_short contains what I expect and it's a string.

What I am trying to do is have the client enter a url from mixcloud, e.g.


and I will render the appropriate mixcloud iframe player. The iframe uses this structure:

<iframe width="100%" height="60" src="https://www.mixcloud.com/widget/iframe/?hide_cover=1&mini=1&feed=%2Ftoddyflores%2Fmatinee-2015-formula-1-grand-prix-mixtape-by-toddy-flores%2F" frameborder="0" ></iframe>

So I want to replace the value of the feed parameter with the parsed channel and track.

That's why:

https://www.mixcloud.com/toddyflores/matinee-2015-formula-1-grand-prix-mixtape-by-toddy-flores/ // becomes
...feed=%2Ftoddyflores%2Fmatinee-2015-formula-1-grand-prix-mixtape-by-toddy-flores%2F...     // this expression
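
The mapping above can also be sketched without a regular expression, using parse_url() and rawurlencode(). The helper name mixcloudWidgetSrc is hypothetical, and the sketch assumes the input is a plain URL with any surrounding markup already removed.

```php
<?php
// Hypothetical helper: builds the widget src from a plain Mixcloud URL.
function mixcloudWidgetSrc(string $url): ?string
{
    $path = parse_url(trim($url), PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return null; // not a usable URL
    }
    // rawurlencode() encodes "/" as "%2F", matching the feed value shown above.
    return 'https://www.mixcloud.com/widget/iframe/?hide_cover=1&mini=1&feed='
        . rawurlencode($path);
}

$src = mixcloudWidgetSrc(
    'https://www.mixcloud.com/toddyflores/matinee-2015-formula-1-grand-prix-mixtape-by-toddy-flores/'
);
// When echoing $src into an HTML attribute, pass it through htmlspecialchars().
```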

Why would preg_match() refuse to work with my variable $str?


If the textarea contains exactly the same as your test string, preg_match will not behave differently. There's probably some additional content in there, maybe some whitespace or even a hidden character from pasting or something like this. I'd try dumping both your test string and the value from the database next to each other and check them for differences. Besides that, the most likely causes are the issues with your regex explained above. I'd also recommend using urlencode or rawurlencode instead of str_replace to encode the URL part.
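
One way to do that comparison, as a sketch: $actual stands in for the value from the CMS, and the trailing non-breaking space is an invented example of a hidden character, not something known to be in the real data. Entity-encoded markup (&lt;p&gt; instead of <p>) is another difference that looks identical when echoed in a browser.

```php
<?php
// $expected is the hard-coded test string; $actual stands in for the CMS value.
// The trailing non-breaking space (UTF-8 bytes C2 A0) is a made-up example.
$expected = '<p>https://www.mixcloud.com/some-radio/</p>';
$actual   = "<p>https://www.mixcloud.com/some-radio/</p>\xC2\xA0";

// Strict comparison and byte lengths show whether the strings really match.
var_dump($expected === $actual);
var_dump(strlen($expected), strlen($actual));

// A hex dump exposes bytes that are invisible when the string is echoed.
$hex = bin2hex($actual);
echo $hex, PHP_EOL;

// Escaping the value shows the literal markup instead of letting the browser render it.
echo htmlspecialchars($actual), PHP_EOL;
```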

What kind of output are you getting with the value from the database? No match at all, or does it match something it shouldn't?
