
Posted

Is there any way with PW to do environment-specific robots.txt, i.e. to block robots from staging sites without having to manually edit files in different environments?

Posted

Here's how you might dynamically create it with ProcessWire without tinkering with .htaccess files.

  1. Create a new template, call it robots, and set its URLs > Should page URLs end with a slash setting to No, and Files > Content-Type to text/plain. Tick the options to disable Append file and Prepend file as well.
    Optionally, set Family > May this page have children to No, Family > Can this template be used for new pages to One, and Family > Set allowed templates for parents to home only.
  2. Create a new page under the homepage, set its template to robots, and name it robots.txt.
  3. Create a new template file at /site/templates/robots.php containing:
<?php namespace ProcessWire;
// Render a different robots.txt depending on your own conditions.
if ($config->debug) {
	// Heredoc syntax makes multiline strings easy
	echo <<<ROBOTS
User-agent: *
Disallow: /
ROBOTS;
} else {
	echo <<<ROBOTS
User-agent: *
Disallow:
ROBOTS;
}

And done. You should be able to see your robots.txt at the URL /robots.txt.

Posted

Thanks guys! Sorry for the late reply; I didn't get any notifications of replies. Going to give both methods a try.

Posted

I am trying to implement the above method from abdus.

All works fine when I give the page a title of robots.

But as soon as I name the page robots.txt I get the following error:

"The requested file robots.txt was not found."

So I tried robots.doc and that worked perfectly well. There must be something preventing me from using the .txt extension. Does anyone have any ideas?

Posted
Quote

But as soon as I name the page robots.txt I get the following error:

"The requested file robots.txt was not found."

Same for me.

I thought maybe it was a $config setting, but couldn't find anything.

Suggestions?

Posted

Solved!!! The answer was in the .htaccess file.

Remove the reference to robots.txt being a physical file on the system:

#RewriteCond %{REQUEST_FILENAME} !(favicon\.ico|robots\.txt)
  RewriteCond %{REQUEST_FILENAME} !(favicon\.ico)


Posted

Awesome solution, just what I was looking for!

For a little more flexibility, instead of hardcoding robots to be allowed whenever debug is false, I added a checkbox field to the robots template so I can turn SEO blocking on or off from the backend.
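A sketch of how that checkbox approach might look in robots.php, assuming a checkbox field named block_robots has been added to the robots template (the field name is hypothetical — use whatever you called yours):

```php
<?php namespace ProcessWire;
// Hypothetical checkbox field "block_robots" on the robots template:
// a ProcessWire checkbox stores 1 when checked, 0 when unchecked.
if ($page->block_robots) {
	// Checked in the backend: block all crawlers
	echo <<<ROBOTS
User-agent: *
Disallow: /
ROBOTS;
} else {
	// Unchecked: allow everything
	echo <<<ROBOTS
User-agent: *
Disallow:
ROBOTS;
}
```

This keeps the on/off decision editable per environment without touching code or relying on $config->debug.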
