apeisa

Release: Redirects

This is an edge case, and a nasty side effect of IIS servers allowing spaces in paths. I'm documenting it here so that others can learn from it, and to see whether it's something this module or ProcessWire should handle (although I suspect it comes down to Apache more than anything else).

We've just launched a new site, and our client's SEO guys are adding redirects from the old site.

The old site was on IIS servers, whereas it's now on a Unix host running Apache. One of the redirects is a pdf, residing in a folder that contains a space.

This is what I've attempted:

  • visit the pdf directly (bear in mind, this pdf no longer exists)
    • expected:
      404 on request
    • result:
      home page shown
      pdf 404's
      URL is displayed as example.com/full/path%20with%20spaces/somepdf.pdf
  • visit example.com/full/path%20with%20spaces/
    • expected:
      404 on request
    • result:
      display home page
      full URL is retained

So this leaves me with the problem that ProcessRedirect first checks if a URL returns a 404, and doesn't get one, and thus does not perform a redirect.

I doubt this is ProcessWire's doing, in which case the issue must be with how Apache is handling the request.

Solution: add URLs containing spaces directly to .htaccess with a rewrite rule (blergh!).

I hope this helps someone who may think it's the module's fault (and I hope I'm wrong, and that someone may have a better solution for me!).

Are you sure it does not work as expected? For example, the URL http://processwire.com/skyscrapers/path%20with%20space also just shows the homepage (processwire.com) content, with the requested URL still shown in the address bar. If you check the response header, however, it does correctly set 'Status: HTTP/1.1 404 Not Found'. It's just a matter of how the 404 page ID is configured, I think.
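
For anyone who wants to check the status code without opening the browser's developer tools, here is a minimal Python sketch. The function name http_status is just something I made up for illustration; it only relies on the standard library, which raises an HTTPError for 4xx/5xx responses.

```python
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET request to `url`.

    The body is ignored on purpose: a site may render its home page
    as the 404 body, so only the status line tells the truth.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx; the code is still available here.
        return error.code
```

Calling http_status() on the percent-encoded URL should report the 404 even when the page that comes back looks like the home page.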

You're correct, I didn't check the response header for the page without the file, so it is a 404 - baffling!

It definitely isn't working with ProcessRedirect - as soon as I remove the percent encoding I get the 404 page I expect, and that URL can be used as the from address.
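
To illustrate what I mean, here is a rough Python sketch of the kind of comparison that would treat both spellings as the same address. This is not how ProcessRedirect works internally (I haven't checked its code); normalize_path and find_redirect are hypothetical names, and the redirect table is made up.

```python
from typing import Dict, Optional
from urllib.parse import unquote, urlsplit


def normalize_path(url_or_path: str) -> str:
    """Decode percent-escapes and drop query/fragment and trailing slash."""
    path = urlsplit(url_or_path).path
    return unquote(path).rstrip("/") or "/"


def find_redirect(requested: str, redirects: Dict[str, str]) -> Optional[str]:
    """Return the target for `requested`, comparing decoded paths."""
    wanted = normalize_path(requested)
    for source, target in redirects.items():
        if normalize_path(source) == wanted:
            return target
    return None
```

With a "from" address stored as /full/path with spaces/somepdf.pdf, a request for /full/path%20with%20spaces/somepdf.pdf would then find the same entry.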

2. If I edit any existing redirect I get this error, but the redirect does work.

redirect_from: Error found - please check that it is a valid URL (redirect_from)

I get the same error message. The funny thing is that the redirect actually works: it is stored despite the error message. (If this helps with debugging, I'm redirecting from /frameset.htm to /, on a very, very old site about to be relaunched.)

I have used the Redirects module before, but I don't think I ever noticed this – maybe it's related to 2.4? When installing the module, I get a message saying it might not be compatible with PW > 2.2 …

Thanks, I committed that change to github also.

Hello, 

I uploaded this module to my server, ran "check new module" and installed it... but it doesn't appear under SET UP, and it shows "This page may not be deleted" when I try to uninstall it.  :'(

Any idea how to resolve this?

Thank you in advance.

Nyo.

Just dreaming:

Wouldn't it be nice if there was a way to catch 404 errors and list them each for applying a redirect with this module?

This would spare some time studying PIWIK pages for 404 errors and copying links over to PW, right?

Might be unlikely, but what if a bot decided to hit thousands of random pages on your site, looking maybe for admin access or something? Then your redirects database table would be full of all these entries - could get messy!

Strong point, Adrian!

Yes, the script kiddies, I forgot. You are 100% right with your concern.

If I would build such a script I definitely would need to use a filter, like

- show only links with 5 hits or more (links with fewer hits would bother neither Google nor me, I think, and script kiddies do not repeat their URLs)

- delete a link after 1 week if it had fewer than 5 hits and no target has been set by the admin yet

And I would have a different table for those "candidates"...  and if a target is set, the link moves to the redirects list table.

This would avoid some of the mess, IMHO.

Actually I would use such an option only some weeks after a site relaunch and then switch it off.

(Most of the important 404 errors show up within the first days after a relaunch.)

Showing when the last hit occurred would also be great, so it is easy to decide which links are candidates for deletion.

This would also help to have the list clean and up-to-date.
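
Roughly, the bookkeeping I have in mind could look like the Python sketch below. It is only an illustration of the rules above, not code from any existing module; the names (Candidate, record_404, visible, prune, promote) are invented, and the "candidates" and "redirects" dictionaries stand in for the two database tables.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional

MIN_HITS = 5                  # show only links with 5 hits or more
MAX_AGE = timedelta(weeks=1)  # grace period before pruning


@dataclass
class Candidate:
    url: str
    hits: int
    first_seen: datetime
    last_hit: datetime
    target: Optional[str] = None


def record_404(candidates: Dict[str, Candidate], url: str, now: datetime) -> None:
    """Count a 404 hit in the separate candidates table."""
    entry = candidates.get(url)
    if entry is None:
        candidates[url] = Candidate(url, 1, now, now)
    else:
        entry.hits += 1
        entry.last_hit = now


def visible(candidates: Dict[str, Candidate]) -> List[Candidate]:
    """Candidates worth showing to an editor, most recent hit first."""
    shown = [c for c in candidates.values() if c.hits >= MIN_HITS]
    return sorted(shown, key=lambda c: c.last_hit, reverse=True)


def prune(candidates: Dict[str, Candidate], now: datetime) -> None:
    """Drop entries older than a week with too few hits and no target."""
    for url, c in list(candidates.items()):
        if now - c.first_seen > MAX_AGE and c.hits < MIN_HITS and c.target is None:
            del candidates[url]


def promote(candidates: Dict[str, Candidate], redirects: Dict[str, str],
            url: str, target: str) -> None:
    """Once a target is set, move the link to the real redirects list."""
    candidates.pop(url, None)
    redirects[url] = target
```

A one-off probe from a bot never reaches the 5-hit threshold, so it stays hidden and disappears after a week.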

Why I am writing this at all:

Yes I know how to collect 404 errors and throw correct redirects into the htaccess file for a while.

But I would love to delegate all of this to one of the editors, since it can be a lot of work if there are many links.

I like the idea, ceberlin. Catching, logging and redirecting 404s would make a lovely new module, but I don't see it as a feature for this one.

I have heard about a fellow called Adrian, who creates about 5 modules in a week...

He's probably creating one right now as we speak.... ;)  :P  :)

Well I was actually creating a module as you spoke, but not for this :)

Sounds like a cool idea, but I really have to focus hard on some real work for a while, so it might not be me that takes this on I'm afraid, as much as I'd love to :)

Hi guys :)

I may as well spill the beans now... I'm actually working on an extended version of the Redirects module. Its intention is to be more feature-rich.

That said, I could look into including a 404 looker-upper. Only problem is, I'm not quite sure how to implement it the way @ceberlin described. However, I shall give it a try.

First, let me get the basis of the module up and running. All I've done so far is the front-end page, the page to create a new redirect, and saving. Still a lot to do, in terms of editing, processing, importing, exporting, etc.

(Edit: Oh, and @apeisa, don't worry - you shall be credited in the module :) )

Mike Anthony,

How is your extended version coming?  Will it support wildcard options?  /somepage/* 

I need wildcards, so I'm thinking about implementing it here, but no need to duplicate the effort. :)
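
To be clear about what I mean by a wildcard, here is a small Python sketch of the matching. It assumes that * matches any run of characters, including slashes; that is my assumption, not something the module currently does, and wildcard_to_regex and match_redirect are made-up names.

```python
import re
from typing import List, Optional, Pattern, Tuple


def wildcard_to_regex(pattern: str) -> Pattern[str]:
    """Turn '/somepage/*' into a regex; everything except '*' is literal."""
    literal_parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile("(.*)".join(literal_parts))


def match_redirect(path: str, rules: List[Tuple[str, str]]) -> Optional[str]:
    """Return the target of the first rule whose pattern matches the whole path."""
    for pattern, target in rules:
        if wildcard_to_regex(pattern).fullmatch(path):
            return target
    return None
```

A single rule like ("/somepage/*", "/newpage/") would then cover every URL below /somepage/, while a plain rule such as ("/frameset.htm", "/") still matches exactly.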

Reno, to be honest, I have a hard time finding time to extend this. So if you tackle wildcards etc. without complicating the basic usage, feel free to hijack ownership of this module.

And now I realize it was Mike doing the promises... Same goes for both of you. Don't count on me :)

Antti, Dammit man! Talk to Ryan about how to clone yourself. :P

I have figured that out! But it takes 9 months, and what you get is something small (but cute!) that takes years and years to evolve into a coder.

Hahahahaha!  I figured out that method as well. You're right — too slow.

Hey guys - so sorry, things have been quite hectic on my side of the world. As such, no real time to finish up at the moment. I'll give it a bash for some time each day for the next few days and see how far I get.

I'm really just porting my redirector extension for Bolt into a PW module that can be modified from the back-end. You can learn more about the Bolt extension here.

Reno, you'll probably be using something like this with the new extension: /somepage/{name:all}, or even just /somepage/{path} (a smart wildcard)

Mike Anthony,

Sweet! That looks super cool. A bit more than I need, and longer than I can wait. :)

I extended the module this afternoon and added support for:

The alternate domain is specified in the module config settings. If set, the module will also check the alternate domain for the request.

If it gets back a status code of 200, then it will redirect to the request on that domain. 

That may be a fringe feature, but I need to maintain a legacy.site.com scenario, and this takes care of everything.
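
In case the description is too terse, the logic boils down to something like this Python sketch. It is a simplified illustration rather than the module's actual code: legacy_redirect is a hypothetical name, the status lookup is passed in as a function so the idea can be tried without a network, and using plain http for the alternate domain is an assumption.

```python
from typing import Callable, Optional


def legacy_redirect(path: str, alternate_domain: str,
                    get_status: Callable[[str], int]) -> Optional[str]:
    """Return the URL on the alternate domain if it answers 200, else None.

    `alternate_domain` comes from the module config; an empty value
    disables the check.
    """
    if not alternate_domain:
        return None
    url = "http://" + alternate_domain + path  # scheme is an assumption
    return url if get_status(url) == 200 else None
```

If the function returns a URL, the request is redirected there; if it returns None, the normal 404 handling carries on.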

@apeisa

I may not be the best keeper of this module, as this kind of thing is a bit out of my wheelhouse. Can I just submit a Pull Request?

I could also just release a beta version first for those who want to test it. We have about 80 redirects active here (a lot of which will go away with the wildcards), but I've been testing it on a production site all afternoon. Seems stable. 

Reno:

Indeed - and I believe it's the only module of its kind.

I'll still be releasing the module, as I'm sure there are those who'd like to use it.

Also, I find what you're wanting to do with your site quite interesting. I have a question: why would you want to maintain the URI structure for the legacy site, instead of performing permanent redirects? If it's something that many developers do, I'd be willing to include the functionality in the new module:

Perhaps I would add a field called "map to domain" where you enter a domain ("legacy.site.com"), and it will check that domain for the page being requested. Then, it will (per the choice selected) either load up the page on the new site, or redirect to the legacy site. Sound good?

Edit: Alternatively, and to make things a little simpler, I could make the module ask what kind of redirect needs to be created before showing the form. So it would ask if you want to create a "Standard Redirect" or a "Legacy Site Redirect/Map". Yes?
