
URL Shortener


netcarver

Thanks to Ryan, I've managed to knock my next module into usable shape.

The URL Shortener adds a link shortening feature to ProcessWire, so you can host your own short URL service from a PW site now.

You can create as many bins for short links as you need & the module sets up an example bin when you install it. Each bin is a PW page that uses the LinkShortenerHome template. This template allows you to set the length of the shortened links that will reside in it. Shortened links are simply child pages that automatically use the LinkShortener template. As these links are normal PW pages you can manipulate them from the admin page tree just like any other page.

Anytime you create a new short-link page in any of your bins, it will automatically be named with a random string that is not already in use in that bin. You get the chance to review this short string before adding the full URL and saving the page. Once the page is saved any visit to the short link's URL will be redirected to the full URL.
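
The naming step described above can be sketched in plain PHP. This is only an illustration and not the module's actual code: the function name `generateShortName`, the 31-character alphabet (lowercase letters and digits with the look-alikes i, l, o, 0 and 1 left out) and the idea of passing the bin's existing names in as an array are all assumptions.

```php
<?php
// Hypothetical helper: pick a random name of $length characters that is
// not already used in the bin. Not taken from the module's source.
function generateShortName(
    int $length,
    array $existingNames,
    string $alphabet = 'abcdefghjkmnpqrstuvwxyz23456789' // assumed alphabet
): string {
    $max = strlen($alphabet) - 1;
    $taken = array_flip($existingNames); // name => index, for fast lookups

    // Rough guard so the loop cannot spin forever on a full bin.
    if (count($taken) >= pow(strlen($alphabet), $length)) {
        throw new RuntimeException('No unused names left for this length.');
    }

    do {
        $name = '';
        for ($i = 0; $i < $length; $i++) {
            $name .= $alphabet[random_int(0, $max)];
        }
    } while (isset($taken[$name])); // retry on a collision

    return $name;
}

$existing = ['ab3k9z', 'q7m2xp'];
$name = generateShortName(6, $existing);
echo $name, "\n";
```

In a real bin the list of existing names would come from the bin's child pages rather than a hard-coded array.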


Just for inspiration: my version of a URL shortener (older code previously featured on my blog):


require_once('database.class.php');
$db = new database('localhost', '', '', '');
$pass = '123';
$base_url = 'http://go.nico.is/';

function is_url($url) {
if(!preg_match("#^http(s)?://[a-z0-9-_.]+\.[a-z]{2,4}#i", $url)) {
return false;
} else {
return true;
}
}

function url_encrypt($id) {
$codeset = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
$base = strlen($codeset);
$converted = "";

while ($id > 0) {
$converted = substr($codeset, ($id % $base), 1) . $converted;
$id = floor($id/$base);
}

return $converted;
}

function url_decrypt($converted) {
$codeset = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
$base = strlen($codeset);
$c = 0;

for ($i = strlen($converted); $i; $i--) {
$c += strpos($codeset, substr($converted, (-1 * ( $i - strlen($converted) )),1))
* pow($base,$i-1);
}

return $c;
}

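// NOTE: $url is interpolated into the SQL conditions below unescaped;
// escape it (or use prepared statements) before exposing this publicly.
if(!empty($_GET['do']) && isset($_GET['url'], $_GET['pass']) && $_GET['pass'] == $pass) {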
$url = rawurldecode($_GET['url']);

if($_GET['do'] == 'remove') {
if($db->delete('urls', '`url`=\''.$url.'\'')) {
$note = 'true';
} else {
$note = 'false';
}
} elseif($_GET['do'] == 'stats') {
$note = (($db->get_element('urls', 'stats', '`url`=\''.$url.'\'')) ? $db->get_element('urls', 'stats', '`url`=\''.$url.'\'') : '0');
} elseif($_GET['do'] == 'add') {
if(!$db->get_element('urls', 'converted', '`url`=\''.$url.'\'')) {
$id = $db->get_next_id('urls');
$converted = url_encrypt($id); // short code derived from the new row's id

if($db->insert('urls', array('id' => $id, 'url' => $url, 'converted' => $converted, 'stats' => 0))) {
$note = $base_url.$converted;
} else {
$note = 'false';
}
} else {
$note = $base_url.$db->get_element('urls', 'converted', '`url`=\''.$url.'\'');
}
}
} elseif(isset($_GET['converted'])) {
$id = url_decrypt($_GET['converted']);
$url = $db->get_element('urls', 'url', '`id`=\''.$id.'\'');
$db->update('urls', array('stats' => ((int)$db->get_element('urls', 'stats', '`id`=\''.$id.'\'') + 1)), '`id`=\''.$id.'\'');

header('HTTP/1.1 301 Moved Permanently');
header('Location: '.$url);
exit;
}

if(!empty($note)) { // $note is only set when one of the branches above ran
echo $note;
}
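
For anyone who wants to try the ID-to-short-code idea on its own, here is a minimal self-contained sketch of the same base-62 conversion, without the database parts. The names `CODESET`, `base62_encode` and `base62_decode` are made up for this example. Note that this is an encoding, not encryption: anyone can turn a code back into its ID.

```php
<?php
// Same 62-character codeset as in the script above.
const CODESET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Convert a positive integer ID into a base-62 string.
// As in the script above, an ID of 0 yields an empty string.
function base62_encode(int $id): string
{
    $base = strlen(CODESET);
    $out = '';
    while ($id > 0) {
        $out = CODESET[$id % $base] . $out; // prepend the next digit
        $id = intdiv($id, $base);
    }
    return $out;
}

// Convert a base-62 string back into the integer ID.
function base62_decode(string $code): int
{
    $base = strlen(CODESET);
    $id = 0;
    for ($i = 0; $i < strlen($code); $i++) {
        $pos = strpos(CODESET, $code[$i]);
        if ($pos === false) {
            throw new InvalidArgumentException('Invalid character in code.');
        }
        $id = $id * $base + $pos; // shift left one digit, add the new one
    }
    return $id;
}

echo base62_encode(125), "\n"; // 125 = 2*62 + 1, so this prints "21"
echo base62_decode('21'), "\n"; // prints 125
```

Decoding left to right avoids the `pow()` call and keeps everything in integer arithmetic.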

@Ryan,

Is there any way to prevent PW from using all lowercase characters in a page's path? For this module, allowing uppercase characters in the shortened URL would really increase the bin sizes -- or allow shorter links for a given bin capacity. Currently I use a 31 character alphabet for generating the short links giving a maximum of 887,503,681 links using 6 characters. If I could add 19 visually distinct uppercase characters to the alphabet this would increase to 15,625,000,000.
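
The capacity figures come from raising the alphabet size to the power of the link length. A quick check of the arithmetic, assuming 64-bit PHP (the function name `binCapacity` is just for illustration):

```php
<?php
// Number of distinct names of exactly $length characters that can be
// built from an alphabet of $alphabetSize symbols.
// Assumes the result fits in a 64-bit integer.
function binCapacity(int $alphabetSize, int $length): int
{
    return $alphabetSize ** $length;
}

echo number_format(binCapacity(31, 6)), "\n"; // 887,503,681
echo number_format(binCapacity(50, 6)), "\n"; // 15,625,000,000 (31 + 19 characters)
```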


I think it might be a fairly major change to the PW core to make it work with uppercase characters in URLs built on page names (though not 100% certain). But I think there are a lot of good reasons to normalize the case for URLs, so it's a change that might not be ideal for the larger system. Obviously those reasons don't apply to a URL shortener, and I can totally see the benefits for the URL shortener. The best way may be to have some kind of translation system in place where an underscore preceding a page name means the following character is uppercase, or something like that. Then your module's __construct() method could modify the value of $_GET['it'] before PW gets to it, i.e.

if(strpos($_GET['it'], '/r/') === 0) {
  // if URL matches the format you are looking for,
  // replace uppercase character with the lowercase version preceded by an underscore
  $_GET['it'] = strtolower(preg_replace('/([A-Z])/', '_$1', $_GET['it']));
}

Btw, that $_GET['it'] is set by your htaccess file and it's where ProcessWire looks for the URL from Apache's RewriteEngine. It's nothing more than the currently requested URL.
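
To make the underscore translation concrete, here is a small standalone sketch of the mapping in both directions. The function names are made up for this example, and it assumes the short links themselves never contain underscores (otherwise the reverse mapping would be ambiguous).

```php
<?php
// Short link as seen in the URL -> all-lowercase page name.
// Each uppercase letter becomes "_" followed by its lowercase form.
function shortLinkToPageName(string $shortLink): string
{
    return strtolower(preg_replace('/([A-Z])/', '_$1', $shortLink));
}

// Page name -> short link: "_x" becomes "X" again.
function pageNameToShortLink(string $pageName): string
{
    return preg_replace_callback('/_([a-z])/', function (array $m): string {
        return strtoupper($m[1]);
    }, $pageName);
}

echo shortLinkToPageName('aB3xQ'), "\n";   // a_b3x_q
echo pageNameToShortLink('a_b3x_q'), "\n"; // aB3xQ
```

The trade-off is that every uppercase character costs one extra character in the stored page name, but the public short link stays the same length.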

Another thing I want to mention is that there is no lowercase restriction with URL segments. So if you had your LinkShortenerHome.php template allowing URL segments, you could do something like this:

if($input->urlSegment1) {
 // translate uppercase characters to the underscore + lowercase page name
 $name = strtolower(preg_replace('/([A-Z])/', '_$1', $input->urlSegment1));
 $p = $page->child("name=$name");
 if($p->id) $session->redirect($p->full_link);
 else throw new Wire404Exception();
}

Also using URL segments, another alternative is to use your own custom text field to hold the short link (using upper and lower case), and just let the page name take on the page's ID or something that wouldn't collide with your short link names.

Another idea, but maybe for a future version. :) If you wanted to handle short links right from your homepage rather than /r/ (which would further shorten the URL), you could hook in before Page::render when the $page->id is 1 and have your module check if there is an $input->urlSegment1 present.


Thanks for the reply Ryan, lots of good ideas in there. I agree that there is no real benefit for the request other than for my module so absolutely no need to worry about core mods.

I did consider using URL segments for this module in the beginning but assumed, in ignorance, that enabling URL segments on the bin template would preclude me having the links as child pages and, as I definitely wanted to have the links as child pages, I chose not to explore that route. It's also good to know that URL segments can have uppercase characters. I'll probably look at moving over to a URL segment based solution for version 2 of this module. For now I've moved on to another module project. :)


Hi Marty,

I must admit I hadn't considered using the URL shortener for links to the same site so that's an interesting use-case for me. I've just tried this locally and it's not as simple as just changing the type of the full_link field as you then need to go on and setup some of its more advanced options and tell it where in the page tree it can reference pages etc. I will think about this some more and get back to you.


@steve - Cheers - no rush at all. Kind of a side point: I have a YOURLS setup on a short domain that I use through my Twitter account, which gives me short links like: http://stlmv.in/4l. I'm not a fan of 3rd party URL shorteners because there's no guarantee they'll be around in the future. YOURLS just works. Perhaps I'll sponsor a module that can automatically create a shortened URL from your YOURLS install. The API is here if anyone is interested: http://yourls.org/#API. Another side point is that some of my web clients actually print newsletters (crazy, I know) and reference URLs in them. Having shorter URLs makes that easier.

@antti: Oh yeah. I never thought to use it like that! :)

Regards

Marty


Once installed, you'll see a new "Short Links" page under your site root. This is a bin for storing shortlinks - which are simply child pages.
 

post-465-0-86735300-1462360106_thumb.png

If you edit the "Short Links" page you'll see various parameters that control the automatic generation of link names. You should be able to leave these as-is with no problems (unless you need more than 27,000,000,000 links in your bin!). The parameters, if you're curious, are...

post-465-0-08976200-1462360108_thumb.png

You can add as many bins for shortlinks as you like to your site; just create a new page using the template "LinkShortenerHome" and give it a relatively short page name.

Anyway, let's add a short link now. From the page tree just click new next to the "Short Links" page. This creates a new shortlink page (uses the "LinkShortener" template) and automatically creates an unused, random, page name using the parameters we just looked at from the parent bin. Something like this...

post-465-0-94833100-1462360105_thumb.png

If you are not happy with the spelling of the generated shortlink (if it happens to be offensive in your native language for example) then you can change it now and hit the save button which takes us to...
 

post-465-0-50278000-1462360104_thumb.png

...where we just type/paste in the target for the redirect this shortlink will make and then hit the publish button...

post-465-0-14186000-1462360103_thumb.png

... and we can see the short link and a "Go!" link for testing it.
 

post-465-0-02952800-1462360109_thumb.png

That's about all there is to it except to say that it will help out the ProcessRedirects module if it is installed and you have configured the URL Shortener to help generate names for it.

HTH.


I have a quick question (sorry if it was already covered and I missed it).

If I install ProcessWire in a directory on my site, let's just say a folder named 'i', would there be a way to make the path to the short link leave out the 'i' and just use the domain root?

For example:

Crssp.com/i/ujn3gi

vs.

Crssp.com/ujn3gi

Not really a big deal. I'm thinking I will install ProcessWire in a folder; maybe I should even name the directory pw. :)


Hmmm, probably not. However, here is an untested idea you could try if the only thing you want at your site root is the link shortener: set the template of the home page to LinkShortenerHome and use that as the bin for shortlinks.


Have there been any thoughts on how, or if, this could be a publicly available service?

I don't think I would ever set up a URL shortener for public use, with so many out there, but would it be doable?

The idea now is it is more of a backend resource rather than frontend, does that sound right?


  • 2 months later...

Any new activity or use on this one?

Just wondering if anyone is finding good uses for the module, or is using it at all.

I got an install of LessN More running today, it has an easy interface.

My only real use case, now is to shorten links in a Tweet, and it will be novel to have my own.

On the backburner will be toying with the module.

It would be useful to shorten the path for news articles on a new beta site, if I can ever manage to get a beta in the wild.


  • 7 months later...

Hi,
I am trying to get URLs like www.site.com/article instead of www.site.com/parent/page/article. Can I achieve this with the plugin? So far I don't see a way to do it. If not, can anyone advise how to do that, if it's possible?


@xeo, what you want to do is not the intention of the module. In PW the URLs follow the structure of the tree; if you shortened your URLs the way you describe, there would be problems with pages overriding others that have the same name. Imagine that you have a page called "About" in the root of the website (mydomain.com/about), and you write an article with the title "About". You would have a problem because they would share the same URL.

If you want to have short urls for your articles, you can put them under a parent that is on the root, and has a short name (not short title)

-home
--about
--contacts
--articles (name: a)
---article1
---article2
 

This way, your article1 url would be mydomain/a/article1

If you want to have an even shorter url for the articles (avoiding urls like: mydomain/a/my-new-article-with-a-very-very-veeeeeeeeery-long-title), you can create a system that accepts their IDs on the URL (mydomain/a/123):

<?php

// on top of the template file of the "articles" page ("/a/")
// url segments have to be activated on this template

if ($input->urlSegments) { // if there are segments in the URL

    $id = (int) $input->urlSegment1; // get the first segment and turn it into an integer
    $match = $pages->get("/a/")->find("id={$id}")->first(); // page under /a/ with that id; falsy when nothing matches

    if ($match && !$input->urlSegment2) { // if page exists and there's not a second segment
        $session->redirect($match->url); // redirect to the correct page
    } else {
        throw new PageNotFoundException(); // if not, throw a 404
    }
}


Thanks for the explanation. The problem is not the length of the URLs as such. The reason I want those URLs is that pages or articles can be moved from one parent page to another. For example, I have a parent page, say, Poetry, and I publish a child page with American poets' work. In the future I may want to divide Poetry into American, European, etc. poetry pages. Then I have to move my page to American poetry, which is a child of Poetry. In that case my URL will change, and I want to avoid that.
