Tyssen

Environment-specific robots.txt


Is there any way with PW to do environment-specific robots.txt, i.e. to block robots from staging sites without having to manually edit files in different environments?


Here's how you might dynamically create it with ProcessWire without tinkering with .htaccess files.

  1. Create a new template, call it robots, and set its URLs > Should page URLs end with a slash? setting to No, and Files > Content-Type to text/plain. Tick the options to disable the automatic prepend (_init.php) and append (_main.php) files as well.
    Optionally, set Family > May pages using this template have children? to No, and Family > Can this template be used for new pages? to One. You can also restrict Family > Allowed template(s) for parent pages to home only.
  2. Create a new page under the homepage, set its template to robots, and name it robots.txt.
  3. Create a new template file at /site/templates/robots.php, and type this inside:

<?php namespace ProcessWire;
// Render a different robots.txt depending on your own conditions.
if ($config->debug) {
	// Debug/staging: block all robots. (Heredoc syntax keeps the
	// multiline string readable; ROBOTS is just the heredoc label.)
	echo <<<ROBOTS
User-agent: *
Disallow: /
ROBOTS;
} else {
	// Production: allow everything.
	echo <<<ROBOTS
User-agent: *
Disallow:
ROBOTS;
}

And done. You should now be able to see your robots.txt at the URL /robots.txt. (If you'd rather script the setup than click through the admin, there's a rough sketch below.)
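For completeness, here's an untested sketch of steps 1 and 2 done via ProcessWire's API (for example from a module's install routine). The property names, like slashUrls and noPrependTemplateFile, are my reading of the PW 3.x Template class, so verify them against your version:

<?php namespace ProcessWire;
// Sketch only: create the robots fieldgroup, template and page via the API.
$fg = new Fieldgroup();
$fg->name = 'robots';
$fg->add($fields->get('title')); // every template needs at least a title field
$fg->save();

$t = new Template();
$t->name = 'robots';
$t->fieldgroup = $fg;
$t->slashUrls = 0;                 // page URL should not end with a slash
$t->contentType = 'text/plain';    // Files > Content-Type
$t->noPrependTemplateFile = true;  // disable automatic prepend file
$t->noAppendTemplateFile = true;   // disable automatic append file
$t->noChildren = 1;                // may not have children
$t->parentTemplates = array($templates->get('home')->id); // parent must be home
$t->save();

$p = new Page();
$p->template = 'robots';
$p->parent = $pages->get('/');
$p->name = 'robots.txt'; // becomes the URL /robots.txt
$p->title = 'robots.txt';
$p->save();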



Thanks guys! Sorry for the late reply, didn't get any notifications of replies. Going to give both methods a try.


Hi @Tyssen

I think that @abdus's method is more natural for PW; you can implement a sitemap in the same manner (see the sketch below).
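For example, here's a minimal sketch of a sitemap built the same way (my own rough take, not a tested recipe): create a sitemap template with Files > Content-Type set to text/xml, add a page named sitemap.xml under home, and put something like this in /site/templates/sitemap.php:

<?php namespace ProcessWire;
// List the pages of the site; adjust the selector
// ("template!=admin" here) to whatever suits your site.
echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
echo "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n";
foreach ($pages->find('template!=admin') as $p) {
	echo "  <url><loc>{$p->httpUrl}</loc></url>\n";
}
echo "</urlset>\n";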


I am trying to implement the above method from abdus.

All works fine when I give the page a title of robots.

But as soon as I name the page robots.txt I get the following error:

"The requested file robots.txt was not found."

So I tried robots.doc and that works perfectly well. There must be something preventing me from using the .txt extension. Anyone have any ideas?

Quote

But as soon as I name the page robots.txt I get the following error:

"The requested file robots.txt was not found."

Same for me???

Thought maybe it was a $config setting but couldn't find anything.

Suggestions?


Solved!!! The answer was in the .htaccess file.

ProcessWire's default .htaccess treats robots.txt as a physical file and excludes it from the rewrite to the CMS. Remove that reference so the request gets routed to ProcessWire instead:

#RewriteCond %{REQUEST_FILENAME} !(favicon\.ico|robots\.txt)
RewriteCond %{REQUEST_FILENAME} !(favicon\.ico)

 



Awesome solution, just what I was looking for!

For a little more flexibility, instead of hardcoding robots to be allowed whenever debug is false, I added a checkbox field on the robots template so I can turn SEO blocking on or off from the backend (see the sketch below).
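Something along these lines in robots.php, assuming a checkbox field named block_robots (my placeholder name, use whatever you created) was added to the robots template:

<?php namespace ProcessWire;
// block_robots is a hypothetical checkbox field on the robots template;
// ticking it in the backend blocks all crawlers.
if ($page->block_robots) {
	echo "User-agent: *\nDisallow: /\n"; // block everything
} else {
	echo "User-agent: *\nDisallow:\n";   // allow everything
}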



  • Similar Content

    • By daniel_puehringer
      Hey,

      so we all know about SEO and the importance of performance. Basically we do it because it's frustrating if no one finds the website we just built. We all try to write clean markup, CSS and JS, and most of us have a webpack/gulp/whatever pipeline to minify the CSS and JS.
      But when you think about it, optimizing your pipeline might save you a few hundred KB; compared to loading a 1 MB image, that's literally nothing and frankly just ridiculous.

      Don't get me wrong, frontend pipelines are great and should be used, but let's shift the "I will optimize the shit out of those 3 CSS lines" focus to something different: try to serve images as fast as possible; this is where the real performance boost happens.

      I'm no ProcessWire pro so far, but I built a very easy-to-use picture element that some of you might find interesting:

      1. the picture comes in 3 different sizes: one for mobile (keep in mind double-DPI screens, hence a width of 828px), one for tablet and one for desktop
      2. the picture is generated as a WebP version, with the original file extension as a fallback
      3. the filesize of each element is rendered within its "data" attribute
      4. lazy loading (sooo important!!!) is done via the native loading="lazy" attribute


      Please try it out and see the difference 🙂

      I posted this so others can easily optimize their images, but I would also like to hear your suggestions for making it better. Maybe you can decrease the rendering time, or maybe you have some easy improvements.

      Please let me know.

      Greetings from Austria!


       
      <picture>
          <source data="<?php echo($oElement->repeater_image->width(828)->webp->filesize); ?>" media="(max-width: 414px)" srcset="<?php echo($oElement->repeater_image->width(828)->webp->url); ?> 2x" type="image/webp">
          <source data="<?php echo($oElement->repeater_image->width(828)->filesize); ?>" media="(max-width: 414px)" srcset="<?php echo($oElement->repeater_image->width(828)->url); ?> 2x" type="image/<?php echo($oElement->repeater_image->ext); ?>">
          <source data="<?php echo($oElement->repeater_image->width(767)->webp->filesize); ?>" media="(min-width: 415px) and (max-width: 767px)" srcset="<?php echo($oElement->repeater_image->width(767)->webp->url); ?> 2x" type="image/webp">
          <source data="<?php echo($oElement->repeater_image->width(767)->filesize); ?>" media="(min-width: 415px) and (max-width: 767px)" srcset="<?php echo($oElement->repeater_image->width(767)->url); ?> 2x" type="image/<?php echo($oElement->repeater_image->ext); ?>">
          <source data="<?php echo($oElement->repeater_image->webp->filesize); ?>" media="(min-width: 768px)" srcset="<?php echo($oElement->repeater_image->webp->url); ?>" type="image/webp">
          <source data="<?php echo($oElement->repeater_image->filesize); ?>" media="(min-width: 768px)" srcset="<?php echo($oElement->repeater_image->url); ?>" type="image/<?php echo($oElement->repeater_image->ext); ?>">
          <img data="<?php echo($oElement->repeater_image->filesize); ?>" class="img-fluid" loading="lazy" src="<?php echo($oElement->repeater_image->url); ?>" alt="<?php echo($oElement->repeater_image->description); ?>">
      </picture>
    • By neophron
      Hi guys,
      after getting a complaint message from Google about a robots.txt (where everything was actually OK), I searched for an online tool where I could test my robots.txt files. I found this website: https://technicalseo.com/tools/
      This page offers a bunch of nice tools, just wanted to share it with you.

    • By Mike Rockett
      Docs & Download: rockettpw/seo/markup-sitemap
      Modules Directory: MarkupSitemap
      Composer: rockett/sitemap
      MarkupSitemap is essentially an upgrade to MarkupSitemapXML by Pete. It adds multi-language support using the built-in LanguageSupportPageNames. Where multi-language pages are available, they are added to the sitemap by means of an alternate link in that page's <url>. Support for listing images in the sitemap on a page-by-page basis and using a sitemap stylesheet are also added.
      Example when using the built-in multi-language profile:
      <url>
        <loc>http://domain.local/about/</loc>
        <lastmod>2017-08-27T16:16:32+02:00</lastmod>
        <xhtml:link rel="alternate" hreflang="en" href="http://domain.local/en/about/"/>
        <xhtml:link rel="alternate" hreflang="de" href="http://domain.local/de/uber/"/>
        <xhtml:link rel="alternate" hreflang="fi" href="http://domain.local/fi/tietoja/"/>
      </url>
      It also uses a locally maintained fork of a sitemap package by Matthew Davies that assists in automating the process.
      It doesn't use the same sitemap_ignore field available in MarkupSitemapXML. Rather, it renders sitemap options fields in a Page's Settings tab. One of the fields is for excluding a Page from the sitemap, and another is for excluding its children. You can assign which templates get these config fields in the module's configuration (much like you would with MarkupSEO).
      Note that the two exclusion options are mutually exclusive at this point as there may be cases where you don't want to show a parent page, but only its children. Whilst unorthodox, I'm leaving the flexibility there. (The home page cannot be excluded from the sitemap, so the applicable exclusion fields won't be available there.)
      As of December 2017, you can also exclude templates from sitemap access altogether, whilst retaining their settings if previously configured.
      Sitemap also allows you to include images for each page at the template level, and you can disable image output at the page level.
      The module allows you to set the priority on a per-page basis (it's optional and will not be included if not set).
      Lastly, a stylesheet option has also been added. You can use the default one (enabled by default), or set your own.
      Note that if the module is uninstalled, any saved data on a per-page basis is removed. The same thing happens for a specific page when it is deleted after having been trashed.
          
    • By NehaPillai
      Hello Everyone, I was trying to update the SEO meta title, description and meta keywords for my website in the ProcessWire CMS; it saves in the backend, but the changes are not reflected on my website. Please help me with this error. Please find the attached screenshot below for your reference. TIA.


    • By franciccio-ITALIANO
      Hi, we can choose the "headline", "title" and "summary" in the page panel of ProcessWire, but we can't write the "meta descriptions" and "tags".
      I can write meta descriptions and tags in templates, but I have the same template for many articles... so how can I change the meta description and tags per article?

      Thanks...