totoff Posted January 23, 2013
hi forum, i'm sure this has been asked and answered before, but i can't find it in the forum jungle, not even via google ... therefore in short: i display a list of questions and answers on a faq-list page. the data for each faq comes from child pages of the faq-list:
/faq-list
  /faq-1
  /faq-2
and so on. as i don't need the child pages for anything more than holding the data and don't want them to be indexed by google, i would like to hide them. however, if i set them to hidden in settings i can't retrieve their data anymore for my faq-list page. does anybody know a topic here in the forum that addresses this, or a workaround? thanks, christoph
Joss Posted January 23, 2013
Hi Christoph, although you might hide the parent (so the children will not appear in the menu), the children and the parent are still published. If you look through the Cheat Sheet you will see that you can call a specific page and then its children. So, for instance, $pages->get("/path/to/page/")->children will retrieve all the children of a specific page. Now you can create a foreach loop to retrieve the fields you require from the children and display them however you wish. http://processwire.com/api/cheatsheet/
Soma Posted January 23, 2013
The magic selector include=hidden does the trick.
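A minimal sketch of how this selector might be used in the template file of the faq-list page. It assumes the FAQ entries are direct children of the page being rendered and that each one has a "body" field holding the answer; that field name and the file name are assumptions, not something stated in the thread.

```php
<?php
// Template file for the faq-list page (hypothetical file name: faq-list.php).
// $page is the page currently being rendered, provided by ProcessWire.

// "include=hidden" makes the selector return hidden children as well;
// unpublished children are still left out.
$faqs = $page->children("include=hidden");

foreach ($faqs as $faq) {
    // "title" is a built-in field; "body" is an assumed field name.
    echo "<h2>{$faq->title}</h2>";
    echo "<div class='answer'>{$faq->body}</div>";
}
```

The same selector string should work from any other template via $pages->get("/faq-list/")->children("include=hidden"), with the path adjusted to the actual page tree.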
Soma Posted January 23, 2013
Just for convenience: http://processwire.com/api/cheatsheet/?filter=hidden&advanced
Joss Posted January 23, 2013
Ooh, just learned something! Thanks.
diogo Posted January 23, 2013
Just don't create a template file for these pages' template, and their URLs will throw a 404.
totoff Posted January 23, 2013 (Author)
hi folks, thanks again for addressing my thread. i think i'll go the "magic selector route", as i don't want the pages to be indexed by search engines and then throw a 404.
Jennifer S Posted January 30, 2013
Once the content from the hidden sub-pages has been included on a regular page, does it get indexed by the search engine?
Jennifer S Posted January 30, 2013
I think what I mean is: is it possible to index the site after content includes have been rendered by the individual pages, so that my include shows up as a result on its own and as part of the page it is included in?
apeisa Posted January 30, 2013
Jennifer S wrote: "I think I mean is it possible to index the site after content includes have been rendered by the individual pages. So that my include shows up as a result on its own, and as part of the page it is included in."
Of course, if your individual page has a URL that is reachable by the search engine.
SiNNuT Posted January 31, 2013
I think you already have your solution, but as a side note: if it's only about hiding stuff from search engines, you could just add <meta name="robots" content="noindex, nofollow"> in the head section of the faq entry HTML.
JeffS Posted February 4, 2013
Don't forget that visiting a page that is not indexed in Google with Chrome or the Google Toolbar will invite a crawl from Googlebot. For SEO control I usually add a field that allows the editor to set the correct meta robots value for the page in the admin. And in code I will usually cascade, that is, inherit the value of the parent unless an override is present.
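The cascading idea could be sketched roughly as below. This is only an illustration under stated assumptions: "meta_robots" is a hypothetical text field, robotsFor() is a made-up helper name, and the fallback value is an assumed site-wide default.

```php
<?php
// Hypothetical helper for a shared <head> include.
// "meta_robots" is an assumed text field added to the templates.
function robotsFor($page) {
    // Walk from the page up through its parents until a value is found.
    // The parent of the home page is a NullPage with id 0, which ends the loop.
    for ($p = $page; $p->id; $p = $p->parent) {
        $value = trim((string) $p->get("meta_robots"));
        if ($value !== "") return $value; // own value, or nearest ancestor's
    }
    return "index, follow"; // assumed default when nothing is set
}
?>
<meta name="robots" content="<?php echo htmlspecialchars(robotsFor($page)); ?>">
```

With something like this, setting "noindex, nofollow" once on the faq-list page would also cover its children, unless a child sets its own value.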
Soma Posted February 4, 2013
JeffS wrote: "Don't forget that visiting a page with chrome or the google toolbar, that is not indexed in G, will invite a crawl from googlebot."
If that were true, all my development sites would be in the Google index, but they aren't.
JeffS Posted February 4, 2013
@soma: I said "invite". I really should have said "could". I did some more research on this: Google says it won't, and Matt Cutts debunked this a while back. I had seen Googlebot on some dev sites and was sure they had not been shared. It could have been a tweet or some other crawled resource that shared them. I stand corrected. Note that they do collect URLs in Chrome if you have Google as the default search engine (for typeahead or missing URLs), and in the Google Toolbar the URLs are phoned home when "page rank" is enabled. Just not shared. Yet.
totoff Posted March 6, 2013 (Author)
i'm coming back to this thread as i still struggle to find the optimum solution for the problem i described in my first post. to sum up, as far as i understand there are four options to use child pages as "data containers" without making them viewable or having them indexed by google:
1. keep the children hidden/unpublished but retrieve their data with the selector include=hidden -> not the optimum as it confuses clients ("why is this page hidden?")
2. don't assign a template file to the page template -> throws a 404; possible, but not the most elegant solution imho
3. set a 301 to their parent -> unsure about the seo effects, and it may cause trouble if the page tree needs to be changed
4. robots.txt disallow /faq-1 etc. -> not applicable as it requires a static page tree
in my opinion, option 3 seems to be the best, but it still leaves me a bit dissatisfied ... i would be happy to hear your opinions or suggestions on how you would solve this. thanks, christoph
Soma Posted March 6, 2013
Option 1 or, better, option 2 is the most elegant if you don't need to be able to view those pages anyway. If you don't have links or URLs to those data pages, spiders will not find them anyway, so there is nothing to worry about.
SiNNuT Posted March 6, 2013
I think there's something to be said for the 'unpublished' option. If you look at this setting in the admin it says: 'Unpublished: Not visible on site'. It seems this is exactly what you want. Surely this can't be hard to explain to clients: "we keep faq items unpublished because we don't want them individually accessible on the website".
Another option is a variation on 4: instead of trying to do this in robots.txt, you can put <meta name="robots" content="noindex, nofollow"> in the head section of your faq template. This way you also keep spiders out, and it automatically applies to all pages using the faq template. Of course, they would still be accessible by URL, but you wouldn't link to them on the site, nor would Google index them, so no real problem. Just in case someone does visit a faq URL, you could make clear what's happening via the template output: "notice: this page is part of .... visit this page instead (link)"
totoff Posted March 6, 2013 (Author)
SiNNuT wrote: "Surely this can't be hard to explain to clients"
unfortunately it was. the client quote in my post above is an original. it was not that they didn't get it; they simply forgot about it and published the pages anyway.
Soma Posted March 6, 2013
Using the unpublished option for this is actually the worst and least elegant way. You then can't use published anymore... Option 2 is the simplest and most elegant way to go. If you have no template file, you can't view the pages directly. It's there for this reason.
Jennifer S Posted March 6, 2013
Is there a way to manage this with user roles & permissions?
Joss Posted March 6, 2013
I tend to make sure that the parent is Hidden; then there is no automatic link to the kids (whatever state they are in) unless you create one. That way they don't accidentally become visible because someone chose the wrong setting. If you want a catch-all, you could always give them a template file but not render any of the fields. You can then redirect it anywhere you want. I would think just keeping the parent Hidden will be easier though, and it means that the children (if published) are still accessible by page fields, if you need that later.
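The catch-all described here might look something like the following sketch: a template file for the individual FAQ entries that renders nothing and sends visitors to the parent list page. The file name is hypothetical.

```php
<?php
// Hypothetical template file for the individual FAQ entries (e.g. faq.php).
// It renders no fields; it only sends visitors to the parent list page.
// $session->redirect() issues a 301 (permanent) redirect by default.
$session->redirect($page->parent->url);
```

Because the target is derived from $page->parent, the redirect should keep working if the faq-list page is later moved within the page tree.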
totoff Posted March 7, 2013 (Author)
hi there, thanks for all your comments. interesting to see that there is such a wealth of opinions and different strategies. i tend to agree with soma, whose approach "if you don't want to make it public, just don't assign a template file to it" convinces me most. also with regard to search engines: no url, no links, 404 just in case = no problem.
Lance O. Posted March 3, 2014 (edited)
Removed my reply and moved it into a new post.
Can Posted April 18, 2014 (edited)
Good morning guys. I don't really understand PW on this point. Am I understanding correctly that you are suggesting to just set the status of the page or the parent to "Hidden: Excluded from lists and searches"? When I do this on my test site the page is still accessible via its URL, even when logged out. Only when I set the status to unpublished or delete the template file does it throw a 404.
As I understand it right now, "hidden" only hides the page from search results on the site itself and from listings like navigation, right? But if that is true: I just marked my contact page as hidden and pulled out some data via $pages->get('/contact') without including the "include=hidden" selector, and it was still working.
EDIT: okay, $pages->get will include hidden pages as well, as ryan said in this post.
And as I mentioned above, the page is accessible via its URL as well (which I would understand, because it's only hidden from search results).
And totoff is confusing me with post #7 as well, haha. Isn't anything throwing a 404? I mean, I could just add something random at the end of my URL (example.com/somethingrandom) and get a 404. So how can he (how can you) prevent it from showing a 404? Or are you doing a redirect?
I hope someone can understand my confused brain and bring a little clarification, which would be really appreciated.
EDIT: At the moment I think it's best for me to have a page without a template file and set its state to hidden, so that it looks different in the page tree list.
Cheers, Can
Wanze Posted April 18, 2014
Hi Can,
Can wrote: "Am I understanding right that you guys mention to just set the status of the page or the parent to "Hidden: Excluded from lists and searches"? When I'm doing this on my test site the page is still accessible via the url, even when logged out. Only when I set the status to unpublished or delete the template file it's throwing 404"
Yep, that's the correct behavior. Hidden will exclude a page from $pages->find() calls, unless you specify "include=hidden" in your selector. Think of a hidden page as a page that should not be visible in your navigation or lists, but is still accessible when one knows the direct URL. Unpublished really means that the page is not yet ready/published for the public; here a 404 is displayed. The same goes for pages with templates that do not have a physical template file associated. Be careful when logged in as superuser: if I remember correctly, you'll see unpublished pages. In order to simulate the website for a guest visitor, you could use another browser or the private/incognito mode.
Can wrote: "But for this to be true. I just marked my contact page as hidden and pulled out some data via $pages->get('/contact') without including the "include=hidden" selector but it was still working."
$pages->get() is an explicit call to retrieve a page. ProcessWire assumes that you want to get it, no matter if it's hidden or not.
Can wrote: "Isn't anything throwing a 404? I mean I could just add something random at the end of my url (example.com/somethingrandom) and getting a 404 So how can he (how can you) prevent it from showing a 404? Or are you doing a redirect?"
PW throws a 404 if you enter a path that does not exist, or if the page you want to visit is unpublished or does not have a template file (because PW does not know what markup to render). You can also throw a 404 yourself at any time, though that is already more advanced stuff. Could you maybe describe what you want to do? Why would you want to prevent showing a 404?
Cheers
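For reference, throwing a 404 yourself from a template file might look like the sketch below. The condition is purely illustrative and not something proposed in the thread.

```php
<?php
// At the top of a template file: stop rendering and show the site's
// 404 page when some condition holds. The condition is only an example.
// If the file declares "namespace ProcessWire;" the class resolves as is;
// in older, non-namespaced installs it lives in the global namespace.
if (!$user->isLoggedin()) {
    throw new Wire404Exception();
}

// Normal output for everyone else follows here.
echo "<h1>{$page->title}</h1>";
```

ProcessWire catches the exception and renders the configured 404 page instead of the template's own output.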