Frank Vèssia Posted January 23, 2011
Hello, first I want to say thanks for this great CMS. Now the question: is it possible to change the URL structure, for example from /page-name/ to /page-name.html? And for subpages, /main-page/subpage.html? Thanks
Adam Kiss Posted January 23, 2011
Hello, welcome to the forums. I'll let Ryan answer this one, but I'm just curious: do you have a reason for wanting this, or is it simply a matter of personal preference? Adam
Frank Vèssia (Author) Posted January 23, 2011
The reason is SEO. It's just a small improvement, but I always use this kind of URL structure. If it's not too complicated, it would be nice to have.
Adam Kiss Posted January 23, 2011
Sidenote: I don't think search engines care about your URL schema; it's more about things like keyword density, inbound/outbound links, etc. But Ryan (the creator) should be here in around two hours, so he'll answer your question.
Frank Vèssia (Author) Posted January 23, 2011
As I said, it's a small improvement, like fine tuning. P.S.: keywords are no longer considered by Google.
Adam Kiss Posted January 23, 2011
Could you try this?
1. Create a dedicated function that does something like this: function URLize($url){ $new_url = substr($url, 0, strlen($url) - 1); $new_url .= '.html'; return $new_url; } Then in your templates: echo URLize($that_page->url);
2. Edit line 43 of .htaccess to something like this (I'm not sure it's 100% correct): RewriteRule ^(.*)\.html$ index.php?it=$1 [L,QSA]
I'm about 80% sure there is currently no better solution.
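A slightly more defensive variant of the helper suggested above might look like the following. This is only a sketch: the function name urlize() is hypothetical and not part of ProcessWire, and it assumes page URLs arrive as strings such as "/main-page/subpage/". It leaves the site root and URLs that already end in ".html" alone.

```php
<?php
// Hypothetical helper (not a ProcessWire API).
// Converts a slash-terminated page URL such as "/about/" into "/about.html".
function urlize(string $url): string
{
    // Leave the site root and URLs that already end in ".html" untouched.
    if ($url === '/' || substr($url, -5) === '.html') {
        return $url;
    }
    // rtrim() strips any trailing slashes and does nothing if there are none.
    return rtrim($url, '/') . '.html';
}

echo urlize('/main-page/subpage/'), "\n"; // prints "/main-page/subpage.html"
```

Using rtrim() rather than cutting off the last character means the helper does not eat a real character when the URL has no trailing slash.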
apeisa Posted January 23, 2011
According to this short Q&A on Stack Overflow, http://stackoverflow.com/questions/4558663/add-html-when-rewriting-url-in-htaccess, there might be a very small "bonus" in some scenarios from adding .html. On the other hand, shorter URLs are always better, and URLs without a file extension are also much cleaner for humans.
ryan Posted January 23, 2011
ProcessWire will support .html extensions. You just have to make them your page names, i.e. "about.html" rather than "about", or look for them in urlSegments. I've actually done this before, though for specific pages (like /sitemap.xml), not on a site-wide basis. But I don't see any problem with it conceptually. There is one implementation problem in that PW2 enforces slashes at the end of the URL. However, I think I should be able to make that part optional as part of each template's advanced configuration. Actually, I'd meant to make that optional even before this question came up. So let me work on that part, and I think this will be an easy addition.

As for the value of keywords, I think you guys were talking about two different things. I think Sevarf2 is talking about meta keywords. Meta keywords have never been used by Google, and they were thrown out of most other major search engines about a decade ago. Meta keywords are still valuable if you are running your own spider, perhaps something indexing across multiple platforms on a very large site. But excluding that, there's not much point in using the meta keywords tag. I think it's better to leave them out: a spammy meta keywords tag can still hurt you, even though a meta keywords tag can't help you.

I think Adam was referring to keyword density, like the fact that keywords/phrases have to be present on a page (or in links to it) in order for it to be matched by Google. And the strength of those keywords can relate to where they are placed in the markup (i.e. <title>, anchor text and headlines carry more weight, for starters). And then there are some formulas about density, but they are a matter of fine tuning as well. Roughly 80% of SEO is what happens on other sites, not yours (i.e. who's linking to you), so I usually tell people to just focus on making the site highly accessible with high quality semantic markup and content, focus on making it something that people would want to link to, and leave it at that.

My experience is that the best results come from logical URL structures that are highly readable and contain keywords. I don't think that Google actually ranks based on keywords in your path, but they certainly highlight them in the SERPs, which is worth it right there.

I would be surprised if there was any benefit to using ".html" in your URLs on a new site (as opposed to an old site). At one time, I think there was a benefit, just like there was a benefit to using subdomains over paths. Such benefits don't last because they just exploit a temporary vulnerability in Google's algorithm. Ending with ".html" is kind of a legacy thing, present on sites that have been around a long time. Benefits from ".html" might be a side effect of the age of the sites where it is used, when in fact it's the age that drives the result.

Where I think the real benefit would lie is if you are trying to maintain old URLs that end with ".html". If you've had a page at /page.html for a really long time and it's well indexed in Google, it's to your benefit not to change it. While technically Google should transfer page rank to your new page (via a 301 redirect), it doesn't always work and/or throws you in the sandbox for a little while. Other times, it works as planned, but it's a risk on a high traffic page. So if you are trying to maintain legacy URLs that end with ".html", there is most certainly an SEO benefit to keeping them at ".html".

But on a new site, I think I would avoid using ".html" in your URLs. I don't see any real world evidence that ".html" influences performance, and I don't view dust crawling as being applicable for the type of pages we are talking about. But let's just assume that there was some benefit. Using .html in URLs that aren't actually composed of static files that end with ".html", with the intention of swaying Google, would likely cross the line on their webmaster guidelines (short term benefits lead to long term pain). I would avoid .html in your URLs unless it's literally a static page where the file extension drives the mime type. Otherwise, it's trying to deceive Google a little bit, and that's never a good long term strategy.

All of this is speculation of course. I don't know anyone working at Google, but I do enjoy the subject! If you know something more about this, please keep the conversation going.
Adam Kiss Posted January 23, 2011
I've once again been defeated by the simplicity of PW! @Ryan, is it possible somehow (now or in the future) to remove the trailing slash? @OP: Also, it should be possible to add a hook on page save that automatically appends the '.html' extension to your page name (/url/slug), if we can solve the trailing slash question.
ryan Posted January 23, 2011
Adam: I'm going to make the trailing slash configurable on a per-template basis. It'll be "on", "off" or "either", with the default being "on".
Adam Kiss Posted January 23, 2011
Ryan: I don't think 'either' is a good idea! The system should have only on/off settings and, if anything, a 301 redirect from the inactive form to the active one. E.g. if you have trailing slashes off, /page/subpage works, but /page/subpage/ redirects to it. This is especially important for avoiding duplicate content!
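The on/off behaviour with a 301 to the active form could be sketched roughly as below. This is an illustration only, not how ProcessWire implements it: canonicalRedirect() and the boolean $useSlash (standing in for the per-template setting) are hypothetical names.

```php
<?php
// Hypothetical sketch: decide whether a request path needs a 301 redirect
// so that only one form (with or without a trailing slash) serves content.
// $useSlash stands in for the on/off trailing-slash setting.
function canonicalRedirect(string $path, bool $useSlash): ?string
{
    if ($path === '' || $path === '/') {
        return null; // the site root is always "/"
    }
    $hasSlash = substr($path, -1) === '/';
    if ($useSlash && !$hasSlash) {
        return $path . '/';
    }
    if (!$useSlash && $hasSlash) {
        $trimmed = rtrim($path, '/');
        return $trimmed === '' ? '/' : $trimmed;
    }
    return null; // already in the canonical form
}

$target = canonicalRedirect('/page/subpage/', false);
if ($target !== null) {
    // In a real request handler this would be something like:
    // header('Location: ' . $target, true, 301); exit;
    echo "301 -> $target\n"; // prints "301 -> /page/subpage"
}
```

Returning null for the already-canonical form makes it easy to ensure only one URL ever serves the content, which is the duplicate-content concern raised above.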
ryan Posted January 23, 2011
You are right about that; there should not be an "either" option. Not sure what I was thinking.
martinluff Posted January 24, 2011
A while back I saw some independent tests that showed keywords in URLs definitely counted in some search engine results (Google and Yahoo included). Plus, some of the comments from Google themselves seem to suggest there's some benefit. From Sitepoint: "What is the URL structure preferred by Google? Google's Matt Cutts replied: I would recommend long-haired-dogs.html, long_haired_dogs.html, longhaireddogs.html, in that order. If your site is already live on the web, it's probably not worth going back to change from one method to another, but if you're just starting a new site, I'd probably choose the URLs in that order of preference. I can only speak for Google; you'll need to run your own tests to see what works best with Microsoft, Yahoo, and Ask." However, I think Ryan's comments are spot-on regarding the file extension part, i.e. it has no effect other than for pages from an old site that Google has already ranked under a specific file extension. Rgds M
deandre212 Posted February 1, 2011
Is there a way to remove the trailing slash with a rewrite in .htaccess?
Adam Kiss Posted February 1, 2011
As far as I know, there isn't [for now at least], but I think it will be there soon, since you're not the first to ask [and frankly, I like it better without trailing slashes too].
ryan Posted February 2, 2011
I made the slash configurable by template. If you download the latest commit, you'll see it as a new advanced setting for each template. Adam, I also made the page number prefix configurable now with $config->pageNumUrlPrefix = 'your_prefix'; If not specified, then it defaults to 'page', as before, i.e. 'page1', 'page2', 'page3', ...
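To illustrate what a configurable page number prefix means for URL handling, here is a small sketch that reads a page number out of a URL segment such as 'page2'. It is not ProcessWire's actual implementation; pageNumFromSegment() is a hypothetical name, and the prefix is simply passed in where a site would use its $config->pageNumUrlPrefix value.

```php
<?php
// Hypothetical sketch, not ProcessWire's actual implementation.
// Extracts the page number from a URL segment such as "page2" for a given prefix.
function pageNumFromSegment(string $segment, string $prefix = 'page'): ?int
{
    // preg_quote() keeps a prefix containing regex characters from breaking the pattern;
    // \z anchors at the very end of the string.
    $pattern = '/^' . preg_quote($prefix, '/') . '(\d+)\z/';
    if (preg_match($pattern, $segment, $m) === 1) {
        return (int) $m[1];
    }
    return null; // not a page-number segment
}

var_dump(pageNumFromSegment('page3'));   // int(3)
var_dump(pageNumFromSegment('p7', 'p')); // int(7)
var_dump(pageNumFromSegment('about'));   // NULL
```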
Adam Kiss Posted February 2, 2011
Great stuff! But if I may add something: a template-only setting is a bit redundant here, isn't it? Is there a site-wide setting too, so you set it once and set something different per template only when you want to override the site-wide setting?
Frank Vèssia (Author) Posted February 2, 2011
ryan wrote: "I made the slash configurable by template. If you download the latest commit, you'll see it as a new advanced setting for each template. Adam, I also made the page number prefix configurable now with $config->pageNumUrlPrefix = 'your_prefix'; If not specified, then it defaults to 'page', as before, i.e. 'page1', 'page2', 'page3', ..."
This is good, but it would be better if, when I choose "No" for this setting, I could add a custom ending like .html or whatever else instead of a / ;D
Adam Kiss Posted February 2, 2011
Sevarf2: This is just a start; I believe different endings will come too.
ryan Posted February 2, 2011
I will look at adding that option. Though this definitely falls into the category of things I wouldn't ever use on my own sites, and I would question the value of doing it. If it's for maintaining legacy URLs, you are better off using Apache to 301 redirect them away from the legacy URLs. Also, you can always make pages end with .html by making that the page name, i.e. "mypage.html" rather than "mypage", and turning off trailing slashes.
ryan Posted February 2, 2011
Adam, the page number prefix is site-wide, not template. The slash setting is by template. The default state is for it to enforce slashes, as before. Nothing has changed unless you go into a template and specifically set it to not enforce the slashes.
Adam Kiss Posted February 2, 2011
Ryan: I know, I saw the commit [already patched my fork]. I just think that slash/no-slash is a matter of personal preference. I actually feel like pages shouldn't have slashes [that's highly subjective]. However, people often have these preferences, and if it's a quick hack for you [e.g. one text field and something], why not do it that way, so even heavily biased people want to use PW? I remember that when I started doing websites, I preferred .htm over .html. Then I preferred .php over .phtml or .php3. Everyone has these little preferences; nobody is saying that the slashes vs. no-slashes question has some huge SEO impact.
Frank Vèssia (Author) Posted February 2, 2011
ryan wrote: "I will look at adding that option. Though this definitely falls into the court of being something I wouldn't ever use on my own sites, and I would question the value of doing it. If it's for maintaining legacy URLs, you are better off using Apache to 301 redirect them away from the legacy URLs. Also, you can always make pages end with .html by making that the page name, i.e. "mypage.html" rather than "mypage", and turning off trailing slashes."
Good solution, adding .html in the page name...
ryan Posted February 2, 2011
I do want to make sure people have the flexibility to do it any way they want, so I think that's what we've got now (they can set it according to their preference). I've been meaning to add this slashes setting, so I figured now was the time with this most recent request. My preference for the slashes is because a page can be both a container for data (fields) and a container for pages (children). As a matter of consistency, I want to treat all pages the same (at least on my own sites) so that my site's API code doesn't always have to be looking for the presence of slashes when working with selectors, relative paths, URL segments and such. I don't want to have to always consider these things when developing a site. As for adding extensions like ".html", that would kill the ability to use page URLs/paths in selectors, unless you actually named your page with the ".html" extension. So if we start adding automatic extensions, I think we start creating a lot more work for site developers and general confusion... at least I would find it confusing. Sure, there might be solutions around the issues, but if something is going to be used on less than 30% of sites then it doesn't belong in the core (which would make extensions a possibly good module idea).
Adam Kiss Posted February 2, 2011
Ryan, now you're saying out loud what I totally believe in: if it's not a major thing, don't add it to core! Anyway, I still think that the slash setting should be site-wide. I can't see any reason for it to be a template setting; I mean, if you prefer no-slash URLs, you prefer them on every page you have, not only on some.