Jump to content
FrancisChung

Google not indexing Category Pages

Recommended Posts

Hi, I have an ongoing issue with Google SEO that I can't seem to fix. Wondering if anyone has come across a similar situation?

We deployed a new version of the website using a new deployment methodology and unfortunately, the wrong robots.txt file was deployed basically telling Googlebot not to scrape the site.

The end result is that if our target keywords are used for a (Google) search, our website is displayed on the search page with "No information is available for this page." 

Google provides a link to fix this situation on the search listing, but so far everything I have tried in it hasn't fixed the situation.
I was wondering if anyone has gone through this scenario and what was the steps to remedy it?
Or perhaps it has worked and I have misunderstood how it works?

The steps I have tried in the Google Webmaster Tool :

  1. Gone through all crawl errors
  2. Restored the Robots.txt file and Verified with Robots.txt tester
  3. Fetch/Fetch and Render as Google as both Desktop/Mobile, using root URL and other URLs, using Indexing Requested / Indexing Requested for URL and Linked Pages.
  4. Uploaded a new Sitemap.xml 

Particularly on the Sitemap page, it says 584 submitted, 94 indexed.

 

Would the Search Engine return "No Information available" because the page is not indexed? The pages I'm searching for are our 2 most popular keywords and entry points into site. It's also one of 2 most popular category pages.  So I'm thinking it probably isn't the case but ...

How can I prove / disprove the category pages are being indexed?

The site in questions is Sprachspielspass.de. The keywords to search are fingerspiele and kindergedichte.

 

Share this post


Link to post
Share on other sites

Just wondering, but how long has this been happening? In somewhat similar situations I've had delays of hours .. days.

Another thing I noticed is that your (current) robots.txt has a max-age of 30 days. According to Google's docs, they may "increase the time" a robots.txt file is cached based on this header, so in theory they could still be holding on to the old version. In that case I don't really know which steps to take, but first I'd wait for a few days to make sure that it isn't just Google's regular delay :)

Share this post


Link to post
Share on other sites

Hi @teppo, it's been going on for a few weeks now. Very frustrating.
It's definitely beyond a wait and see approach stage now.

Is the robots.txt max-age defined in .htaccess? I don't recall creating such setting for robots.txt specifically, but I do recall such setting for the website in general for cache-headers.

 

Share this post


Link to post
Share on other sites

I run the robots.txt file through this
http://www.webconfs.com/http-header-check.php

It says :
Cache-Control => max-age=2592000
Expires => Wed, 21 Feb 2018 18:21:44 GMT

Which is a bit baffling.

The Homepage and Main Category pages are showing Cache-Control : No Cache at the moment.
I do have Procache installed but currently, it's off at the moment. 

This is our current Cache-Control settings.

 

<ifModule mod_headers.c>
  # Ignore comments, everything is set to 8 days atm.
  # 1 week for fonts
  <filesMatch "\.(ico|jpe?g|jpg|png|gif|swf)$">
    Header set Cache-Control "max-age=691200, public"
  </filesMatch>
  <filesMatch "\.(eot|svg|ttf|woff|woff2)$">
    Header set Cache-Control "max-age=691200, public"
  </filesMatch>
  # 1 day for css / js files
  <filesMatch "\.(css)$">
    Header set Cache-Control "max-age=691200, public"
  </filesMatch>
  <filesMatch "\.(js)$">
    Header set Cache-Control "max-age=691200, private"
  </filesMatch>
  <filesMatch "\.(x?html?|php)$">
    Header set Cache-Control "private, must-revalidate"
  </filesMatch>
</ifModule>

I will add a separate filesMatch module for Robots.txt and see what happens ...
 


  <filesMatch "^robots.(txt|php)$">
    Header Set Cache-Control "max-age=0, public"
  </filesMatch>

 

Share this post


Link to post
Share on other sites

I guess you're committing major SEO sins here... You deliver the exact same content under several URLs / domains. Google is not amused by such practises.

e.g.

http://sprachspielspass.de/kinderlieder/alle-kinderlieder/ri-ra-rutsch/

http://kinder-reime.com/kinderlieder/alle-kinderlieder/ri-ra-rutsch/

http://finger-spiele.com/kinderlieder/alle-kinderlieder/ri-ra-rutsch/

This is called "black hat SEO", and frowned upon. I only discovered these other domains by checking your HTTP headers, where it says

access-control-allow-origin: kinder-reime.com, finger-spiele.com, sprachspielspass.de

 

  • Thanks 1

Share this post


Link to post
Share on other sites
21 hours ago, dragan said:

I guess you're committing major SEO sins here... You deliver the exact same content under several URLs / domains. Google is not amused by such practises.

e.g.

http://sprachspielspass.de/kinderlieder/alle-kinderlieder/ri-ra-rutsch/

http://kinder-reime.com/kinderlieder/alle-kinderlieder/ri-ra-rutsch/

http://finger-spiele.com/kinderlieder/alle-kinderlieder/ri-ra-rutsch/

This is called "black hat SEO", and frowned upon. I only discovered these other domains by checking your HTTP headers, where it says


access-control-allow-origin: kinder-reime.com, finger-spiele.com, sprachspielspass.de

 

@Dragan, the other two domains are our test server / test domains. In theory, Google or any other crawlers shouldn't be indexing them because robots.txt for those other sites instructs it not to. Surely this isn't a frowned upon practice?

In fact, the reason why I'm in such a pickle is because I migrated the code + contents from our UAT site to our live site including the robots.txt file from the UAT. 

I'm aware of what Google does to punish sites that circumvent around Seach Engine rules and I thought it's pretty hard to circumvent them these days. And I assure you that's not our intention here to cannibalise our own SEO rankings.

Perhaps I need to fix the access-control-allow-origin to have only one domain? I wasn't sure listing all our domains would have a negative impact? 

Share this post


Link to post
Share on other sites

Hi,

Also, you don't have the non-www version redirected to the www version, or vice versa.

And you should normally only have one meta name="canonical" version in your source code. 
Currently, it changes depending on the non-www or www version.

  • Like 1

Share this post


Link to post
Share on other sites

I've managed to get Google display our meta description correctly again today in their search results.

Not sure which of the following corrected it, but the steps are took were :

1) Not to cache Robots.txt (.htaccess) 

  <filesMatch "^robots.(txt|php)$">
    Header Set Cache-Control "max-age=0, public"
  </filesMatch>

2) Removed other test sites from access-control-allow-origin
3) Created a sitemap-category.xml file where it listed the homepage plus important category pages and uploaded it to Google.
4) Uploaded sitemap.xml to Google again after fixing 1) & 2)

I initially tried 3) but it didn't fix the problem initially (Google initially indexed 0 pages ... then 1 page today ).

I tried 2) 3) 4) today and it seems to have fixed it. 
Process 1) I tried when I first posted this thread.

It's also quite possible it "organically" fixed itself today by coincidence and none of my steps was actually effective.

As for the canonical meta-name issues, I'm using MarkupSEO module so it possibly is an issue with that.
I'll post any updates here.

Thanks again to @dragan@Christophe & @teppo for responding. 

  • Like 1

Share this post


Link to post
Share on other sites

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By Sanyaissues
      I hadn't developed a website for a while, but here we are. It's a very simple minimalist website to showcase the latest work of Dominican Artist Patricia Abreu Mota.
      Site: https://patriabreu.com
      Modules:
      Procache ❤️ Seo Maestro Profiler Pro Lister Pro ProcessRedirects
    • By Mike Rockett
      Docs & Download: rockettpw/seo/markup-sitemap
      Modules Directory: MarkupSitemap
      Composer: rockett/sitemap
      ⚠️ NEW MAINTAINER NEEDED: Sitemap is in need of developer to take over the project. There are a few minor issues with it, but for the most part, most scenarios, it works, and it works well. However, I'm unable to commit to further development, and would appreciate it if someone could take it over. If you're interested, please send me a private message and we can take it from there.
      MarkupSitemap is essentially an upgrade to MarkupSitemapXML by Pete. It adds multi-language support using the built-in LanguageSupportPageNames. Where multi-language pages are available, they are added to the sitemap by means of an alternate link in that page's <url>. Support for listing images in the sitemap on a page-by-page basis and using a sitemap stylesheet are also added.
      Example when using the built-in multi-language profile:
      <url> <loc>http://domain.local/about/</loc> <lastmod>2017-08-27T16:16:32+02:00</lastmod> <xhtml:link rel="alternate" hreflang="en" href="http://domain.local/en/about/"/> <xhtml:link rel="alternate" hreflang="de" href="http://domain.local/de/uber/"/> <xhtml:link rel="alternate" hreflang="fi" href="http://domain.local/fi/tietoja/"/> </url> It also uses a locally maintained fork of a sitemap package by Matthew Davies that assists in automating the process.
      The doesn't use the same sitemap_ignore field available in MarkupSitemapXML. Rather, it renders sitemap options fields in a Page's Settings tab. One of the fields is for excluding a Page from the sitemap, and another is for excluding its children. You can assign which templates get these config fields in the module's configuration (much like you would with MarkupSEO).
      Note that the two exclusion options are mutually exclusive at this point as there may be cases where you don't want to show a parent page, but only its children. Whilst unorthodox, I'm leaving the flexibility there. (The home page cannot be excluded from the sitemap, so the applicable exclusion fields won't be available there.)
      As of December 2017, you can also exclude templates from sitemap access altogether, whilst retaining their settings if previously configured.
      Sitemap also allows you to include images for each page at the template level, and you can disable image output at the page level.
      The module allows you to set the priority on a per-page basis (it's optional and will not be included if not set).
      Lastly, a stylesheet option has also been added. You can use the default one (enabled by default), or set your own.
      Note that if the module is uninstalled, any saved data on a per-page basis is removed. The same thing happens for a specific page when it is deleted after having been trashed.
          
    • By jploch
      Hey folks,
      currently Iam working on a website for one of my clients and I need some advice on how to approach this in PW.
      The website is for a company, that offers holiday houses in two locations. 

      The client wants the homepage to show the first location. Normally I just have a home template for the first page, but here the URL should reflect that you are in Location 1. So when you visit the URL casamani.com it should redirect to casamani.com/location-1. Not sure if this makes sense at all.

      Whould it be bad for SEO and performance reasons to redirect home to the Location-1 page?
      Another approach would be to render the Location-1 template on the home template or do an include like discussed here.
      Here is how the tree looks:
      – Home
      – Location 1 (Homepage)
           – Creation
           – Adventure
           – Sustainability
      – Location 2 
           – Creation
           – Adventure
           – Sustainability

      Thanks for looking into this!
    • By daniel_puehringer
      Hey,

      so we all know about SEO and the importance of performance. Basically we do it, because if no one finds the website we just built, it´s frustrating. We all try to write clean markup, css and js code and most might have a webpack/gulp/whatever pipeline to minimize css&js.
      But when thinking about it, optimizing your pipeline might save you a few (hundreds) of kb, compared to loading an image with 1 mb that´s literally nothing and frankly just ridiculous.

      Don´t get me wrong, frontend pipelines are great and should be used, but let´s shift your "I will optimize the shit out of that 3 css lines" focus to something different: try to serve images as fast as possible, this is where the performance boost really happens.

      I´m no pro on processwire so far, but I built a very easy to use picture element, which some of you could find interesting:

      1. the picture comes with 3 different sizes: one for mobile (keep in mind the double dpi, therefore width of 828px), one for tablet and one for desktop
      2. the picture generates a webp version and the original file extension as a fallback
      3. the filesize of each element is rendered within the "data" attribute
      4. lazy loading(sooo important!!!) is done via the native 'loading="lazy"' attribute.


      Please try it out and see the difference 🙂

      I posted this so others can easily optimize their images, but I would also like to hear your suggestions in making it better. Maybe you could decrease the rendering time or maybe you have some easy improvements.

      Please let me know.

      Greetings from Austria!


       
      <picture> <source data="<?php echo($oElement->repeater_image->width(828)->webp->filesize);?>" media="(max-width: 414px)" srcset="<?php echo($oElement->repeater_image->width(828)->webp->url) ?> 2x" type="image/webp"> <source data="<?php echo($oElement->repeater_image->width(828)->filesize) ?>" media="(max-width: 414px)" srcset="<?php echo($oElement->repeater_image->width(828)->url) ?> 2x" type="image/<?php echo($oElement->repeater_image->ext)?>"> <source data="<?php echo($oElement->repeater_image->width(767)->webp->filesize) ?>" media="(min-width: 415) and (max-width: 767px)" srcset="<?php echo($oElement->repeater_image->width(767)->webp->url) ?> 2x" type="image/webp"> <source data="<?php echo($oElement->repeater_image->width(767)->filesize) ?>" media="(min-width: 415) and (max-width: 767px)" srcset="<?php echo($oElement->repeater_image->width(767)->url) ?> 2x" type="image/<?php echo($oElement->repeater_image->ext)?>"> <source data="<?php echo($oElement->repeater_image->webp->filesize) ?>" media="(min-width: 768px)" srcset="<?php echo($oElement->repeater_image->webp->url) ?>" type="image/webp"> <source data="<?php echo($oElement->repeater_image->filesize) ?>" media="(min-width: 768px)" srcset="<?php echo($oElement->repeater_image->url) ?>" type="image/<?php echo($oElement->repeater_image->ext)?>"> <img data="<?php echo($oElement->repeater_image->filesize) ?>" class="img-fluid" loading="lazy" src="<?php echo($oElement->repeater_image->url) ?>" alt="<?php echo($oElement->repeater_image->description) ?>" type="image/<?php echo($oElement->repeater_image->ext)?>"> </picture>
    • By Noel Boss
      » A more exhaustive version of this article is also available on Medium in English and German «
      First, we'd like to thank the very helpful community here for the excellent support.
      In many cases we found guidance or even finished solutions for our problems.
      So a big THANK YOU!!!
       
      We are pleased to introduce you to the new Ladies Lounge 18 website. The next ICF Women’s Conference will take place in Zurich and several satellite locations across Europe. We embarked on bold new directions for the development of the website — in line with the BRAVE theme.

      Ladies Lounge 18 — ICF Woman’s Conference website on Processwire ICF Church is a European Church Movement that started 20 years ago in Zurich and since experienced tremendous growth. There are already well over 60 ICF churches across Europe and Asia. ICF is a non-denominational church with a biblical foundation that was born out of the vision to build a dynamic, tangible church that is right at the heartbeat of time.
      With the growth of the Ladies Lounge from a single-site event to a multi-site event, the demands and challenges to the website have also increased. A simple HTML website no longer cuts it.
      Simplified frontend Our goal with the development of the new site was it to present the different locations — with different languages and partly different content — under once uniform umbrella — while at the same time minimising the administrative effort. In addition to the new bold look and feel, this year’s website is now simpler and easier and the information is accessible with fewer clicks. 
      Some highlights of the new website
      Thanks to processwire, all contents are maintained in one place only, even if they are displayed several times on the website 100% customised data model for content creators Content can be edited directly inline with a double-click:  

      Multi-language in the frontend and backend Dynamic Rights: Editors can edit their locations in all available languages and the other content only in their own language Easy login with Google account via OAuth2 Plugin Uikit Frontend with SCSS built using PW internal features (find of files…) Custom Frontend Setup with Layout, Components, Partials and Snippets Only about 3 weeks development time from 0 to 100 (never having published a PW before) Despite multi-location multi-language requirement, the site is easy to use for both visitors and editors:  
       
        The search for a good CMS is over It’s hard to find a system that combines flexibility and scope with simplicity, both in maintainance and development. The search for such a system is difficult. By and large, the open source world offers you the following options:
      In most cases, the more powerful a CMS, the more complex the maintenance and development
      It is usually like that; The functionality of a system also increases the training and operating effort — or the system is easy to use, but is powerless, and must be reporposed for high demands beyond its limits.
      Quite different Processwire : You do not have to learn a new native language, you don’t have to fight plugin hell and mess around with the loop, you don’t have to torment yourself with system-generated front-end code or even learn an entierly new, old PHP framework .
      All our basic requirements are met:
      Custom Content Types Flexible and extensible rights-management Multilanguage Intuitive backend Well curated Plugin Directory Actually working front-end editing Simple templating with 100% frontend freedom In addition, Processwire has an exceptionally friendly and helpful community. As a rule of thumb, questions are answered constructively in a few hours . The development progresses in brisk steps , the code is extremely easy to understand and simple. Processwire has a supremely powerful yet simple API , and for important things there are (not 1000) but certainly one module which at least serves as a good starting point for further development. Last but not least, the documentation is easy to understand, extensive and clever .
      Our experience shows, that you can find a quick and simple solution with Processwire, even for things like extending the rights-management — most of the time a highly complex task with other systems.
      This is also reflected positively in the user interface. The otherwise so “simple” Wordpress crumbles when coping with more complex tasks. It sumbles over its apparent simplicity and suddenly becomes complex:
       
      Old vs. New — Simple and yet complicated vs. easy and hmmm … easy
          Our experience with Processwire as first-timers
      Before we found out about Processwire, we found CraftCMS on our hunt for a better CMS. We were frustrated by the likes of Typo3, WP or Drupal like many here. CraftCMS looked very promising but as we were digging deeper into it, we found it did not met our requirements for some big projects in our pipeline that require many different locations, languages and features. Initially we were sceptical about Processwire because;
      A. CraftCMS Website (and before UiKit also the admin interface) simply locked much nicer and
      B. because it is built on top of a Framework
      It was only later, that we found out, that NOT depending on a Framework is actually a very good thing in Processwire's case. Things tend to get big and cumbersome rather then lean and clean. But now we are convinced, that Processwire is far superior to any of the other CMS right now available in most cases.
      The good
      Processwire is the first CMS since time immemorial that is really fun to use (almost) from start to finish— whether API, documentation, community, modules or backend interface. Every few hours you will be pleasantly surprised and a sense of achievement is never far away. The learning curve is very flat and you’ll find your way quickly arround the system. Even modules can be created quickly without much experience.
      Processwire is not over-engineered and uses no-frills PHP code — and that’s where the power steams from: simplicity = easy to understand = less code = save = easy maintanance = faster development …
      Even complex modules in Processwire usually only consist of a few hundred lines of code — often much less. And if “hooks” cause wordpress-damaged developers a cold shiver, Hooks in Processwire are a powerful tool to expand the core. The main developer Ryan is a child prodigy — active, eloquent and helpful.
      Processwire modules are stable even across major releases as the code base is so clean, simple and small.
      There is a GraphQL Plugin — anyone said Headless-CMS?!
      Image and file handling is a pleasure:
      echo "<img src='{$speaker->image->size(400, 600)->url}' alt='{$speaker->fullname}' />"; I could go on all day …
      The not soooo good
      Separation of Stucture and Data
      The definition of the fields and templates is stored in the database, so the separation between content and system is not guaranteed. This complicates clean development with separate live- and development-environments. However, there is a migration module that looks promising — another module, which is expected to write these configurations into the file system, unfortunately nuked our system. I'm sure there will be (and maybe we will invest) some clever solutions for this in the future. Some inspiration could also be drawn here, one of the greatest Plugins for WP: https://deliciousbrains.com/wp-migrate-db-pro/
      Access rights
      The Access-Rights where missing some critical features: Editors needed to be able to edit pages in all languages on their own location, and content on the rest of the page only in their respective language. We solved it by a custom field adding a relation between a page the user and a role that we dynamically add to the user to escalate access rights;
      /** * Initialize the module. * * ProcessWire calls this when the module is loaded. For 'autoload' modules, this will be called * when ProcessWire's API is ready. As a result, this is a good place to attach hooks. */ public function init() { $this->addHookBefore('ProcessPageEdit::execute', $this, 'addDynPermission'); $this->addHookBefore('ProcessPageAdd::execute', $this, 'addDynPermission'); } public function addDynPermission(HookEvent $event) { $message = false; $page = $event->object->getPage(); $root = $page->rootParent; $user = $this->user; if ($user->template->hasField('dynroles')) { if ($message) { $this->message('User has Dynroles: '.$user->dynroles->each('{name} ')); } // for page add hook… if ($page instanceof NullPage) { // click new and it's get, save it's post… $rootid = wire('input')->get->int('parent_id') ? wire('input')->get->int('parent_id') : wire('input')->post->parent_id; if ($message) { $this->message('Searching Root '.$rootid); } $root = wire('pages')->get($rootid)->rootParent; } elseif ($page->template->hasField('dynroles')) { if ($message) { $this->message('Page "'.$page->name.'" has Dynroles: '.$page->dynroles->each('{name} ')); } foreach ($page->get('dynroles') as $role) { if ($role->id && $user->dynroles->has($role)) { if ($message) { $this->message('Add dynamic role "'.$role->name.'" because of page "'.$page->name.'"'); } $user->addRole($role); } } } if (!($root instanceof NullPage) && $root->template->hasField('dynroles')) { if ($message) { $this->message('Root "'.$root->name.'" has dynamic roles: '.$root->dynroles->each('{name} ')); } foreach ($root->get('dynroles') as $role) { if ($role->id && $user->dynroles->has($role)) { if ($message) { $this->message('Add dynamic role "'.$role->name.'" because of root page "'.$root->name.'"'); } $user->addRole($role); } } } } } With the Droles and Access Groups Modules we were not able to find a solution.
      I thought it was hard to get absolute URLs out of the system — Ha! What a fool I was. So much for the topic of positive surprise. (Maybe you noticed, the point actually belongs to the top.)
      But while we’re at it — that I thought it would not work, was due to a somewhat incomplete documentation in a few instances. Although it is far better than many others, it still lacks useful hints at one point or another. As in the example above, however, the friendly community quickly helps here.
      processwire.com looks a bit old-fashioned and could use some marketing love. You notice the high level to moan with Processwire.
      There is no free Tesla here.
      Conclusion
      Processwire is for anyone who is upset about any Typo3, Wordpress and Drupal lunacy — a fresh breeze of air, clear water, a pure joy.
      It’s great as a CMF and Headless CMS, and we keep asking ourselves — why is it not more widely known?
      If you value simple but clean code, flexibility, stability, speed, fast development times and maximum freedom, you should definitely take a look at it.
      You have to like — or at least not hate PHP — and come to terms with the fact that the system is not over-engineerd to excess. If that’s okay with you, everything is possible — with GraphQL you can even build a completely decoupled frontend.
      We are convinced of the simplicity of Processwire and will implement future sites from now on using it as a foundation.
      Links & resources we found helpful
      API documentation and selectors API cheatsheet pretty handy, not quite complete for version 3.0 Captain Hook Overview of Hooks Weekly.PW newsletter a week, exciting Wireshell command line interface for Processwire Nice article about Processwire Plugins & Techniques that we used
      Custom Frontend Setup with Uikit 3 and SCSS, and Markup Regions Uikit Backend Theme ( github ) Oauth2 login modules In-house development Login with E-Mail Pro Fields for repeater matrix fields (infos, price tables, daily routines) Wire upgrade to update plugins and the core Wire Mail Mandrill to send mails FunctionalFields for translatable front-end texts that are not part of a content type (headings, button labels, etc.) Runtime markup for dynamic backend fields (combination of first and last name) Tracy debugger for fast debugging Textformatter OEmbed to convert Vimeo and Youtube links into embed codes HideUneditablePages thanks to @adrian  
       
×
×
  • Create New...