
Why not pre-cache on save, instead of on request? (possible?)


joe_g


Hi,

This is a general CMS question, but since this is my favourite CMS, I'll ask it here.

Instead of generating a page when a visitor asks for it, wouldn't it be better to generate it as soon as the editor presses 'save'?

Then all server-side performance issues would be gone completely. It wouldn't matter how long a page takes to generate, and you could focus on building the most maintainable structure instead of having to worry about performance. All requests would be served at ProCache-level speed.

I understand this would only work for content-driven sites, and not for more complicated things like services, but it would still cover a lot of use cases.

I also understand that you would need to keep track of all pages that depend on certain data, but that doesn't strike me as very complicated - even if it has to be done manually (compared to the big advantage of virtually perfect response time, all the time).
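To make the "keep track of all pages that depend on certain data" idea concrete, here is a minimal sketch in Python of a reverse dependency index. It is only an illustration: the class `DependencyIndex`, its methods, and the example URLs are hypothetical and are not part of ProcessWire or ProCache.

```python
from collections import defaultdict


class DependencyIndex:
    """Maps a source page to the set of pages whose output uses its content."""

    def __init__(self):
        self._dependents = defaultdict(set)

    def register(self, rendered_page, source_pages):
        # Record that rendered_page pulls content from each source page.
        for source in source_pages:
            self._dependents[source].add(rendered_page)

    def affected_by(self, saved_page):
        # The saved page itself plus every page that renders its content.
        return {saved_page} | self._dependents.get(saved_page, set())


index = DependencyIndex()
# The home page lists the latest articles (hypothetical URLs).
index.register("/", ["/articles/a1/", "/articles/a2/"])
# Article a2 shows a1 in its "related articles" list.
index.register("/articles/a2/", ["/articles/a1/"])

# Saving a1 makes three cached pages stale, not one.
stale = index.affected_by("/articles/a1/")
```

The bookkeeping itself is small; the hard part is making sure every template registers everything it reads, including data that does not come from a page at all.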

I'm writing this because I'm working on a site where the structure is at the limit of what a CMS (ProcessWire or any other) can handle without things becoming too slow, and it struck me that this approach would eliminate all my worries about performance. Would it be possible to do this with a hook on page save or similar?

j


Caching upon first request really isn't what makes PW's built-in (template) cache slow compared to ProCache. It's only the first visitor to request a certain page that will have to wait slightly longer. All other visitors, and the first visitor upon revisit, are given the cached page within your defined cache time. What makes this relatively slow is that the built-in cache has to hit PHP/MySQL to figure stuff out, whereas ProCache completely eliminates this need with some clever and efficient Apache rules.
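As a rough illustration of caching on first request with a defined cache time, here is a minimal Python sketch. The names `LazyPageCache` and `slow_render` are made up for this example, and the injected clock exists only so that the expiry can be shown without waiting; this is not how PW's template cache or ProCache is implemented.

```python
import time


class LazyPageCache:
    """Caches a rendered page on first request and reuses it until it expires."""

    def __init__(self, render, ttl_seconds, clock=time.monotonic):
        self._render = render
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # url -> (expires_at, html)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no rendering needed
        html = self._render(url)  # cache miss: this visitor waits for the render
        self._entries[url] = (now + self._ttl, html)
        return html


render_calls = []


def slow_render(url):
    # Stand-in for an expensive template render.
    render_calls.append(url)
    return f"<html>{url}</html>"


fake_now = [0.0]  # controllable clock for the demonstration
cache = LazyPageCache(slow_render, ttl_seconds=3600, clock=lambda: fake_now[0])

cache.get("/news/")   # first visitor: rendered and stored
cache.get("/news/")   # second visitor: served from the cache
fake_now[0] = 3601.0  # the cache time has passed
cache.get("/news/")   # rendered again
```

Only the first and third requests reach `slow_render`; every request in between gets the stored copy.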

I don't think there's a lot to be gained with caching on page save.


Whether cached at view or save, it's exactly the same amount of work either way, though caching at save potentially creates more work. Whether we're talking about template cache or ProCache, there are a few reasons why it's better to do it upon first view, rather than upon save.

  • Pages are only cached for guest users. If we performed an automatic cache after a page save, it would be within the context of the user editing the page. This is not a context we want to cache. 
  • You want the page to be cached within the context of a real page view, not from something automatic or behind the scenes. This ensures that any other modules that are part of the render process also get to participate.
  • You may save a page multiple times in a short period of time (don't we all?). If it gets cached on every save, you are potentially using a lot more resources than if it was just cached the first time it was viewed from a guest user.
  • If we regenerated on save rather than view, that would be a whole lot of extra saves that have to be performed in the system (a save uses a lot more resources than a view). Keep in mind cache files automatically expire after some period of time (defined by you). You can't rely purely on a save() for knowing when data is stale or not, as your site may be pulling data from multiple pages or other resources. 
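The point about repeated saves can be shown with a small Python sketch that counts renders under both strategies. `Site`, `save_eager` and `save_lazy` are hypothetical names used only for this comparison; they do not correspond to any ProcessWire method.

```python
class Site:
    """Toy model that counts how often a page has to be rendered."""

    def __init__(self):
        self.renders = 0
        self.cache = {}

    def _render(self, url):
        self.renders += 1
        return f"rendered {url} (#{self.renders})"

    def save_eager(self, url):
        # Strategy A: regenerate the cached copy on every save.
        self.cache[url] = self._render(url)

    def save_lazy(self, url):
        # Strategy B: only drop the stale copy; render on the next view.
        self.cache.pop(url, None)

    def view(self, url):
        if url not in self.cache:
            self.cache[url] = self._render(url)
        return self.cache[url]


eager, lazy = Site(), Site()
for _ in range(5):  # an editor saving five times in a row
    eager.save_eager("/article/")
    lazy.save_lazy("/article/")

eager.view("/article/")  # already cached, but five renders were spent
lazy.view("/article/")   # rendered once, by the first view
```

Here `eager.renders` ends at 5 and `lazy.renders` at 1, and in both cases the visitor sees the latest version.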

  • 2 weeks later...

Thanks for the insights, Ryan. I figured there were good reasons, since virtually every CMS works this way.

Ultimately what I'm after is to decouple editing and visiting, for example by serving the site as static files from Amazon S3, but I understand this would be a whole different approach to website-making (with a number of issues).


  • 1 year later...

Why are you talking so much about context?

The only purpose for me to cache before first load is to provide faster page loading for guests (who make up most of a site's visitors).
That's very important if you need to update site content many times a day (for example on news sites).

I thought ProCache had such a feature, but it doesn't. So I have to work hard on crontab tasks that crawl the entire site every minute to make page loads fast for EVERY user.
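A crawl like the one described here could be sketched in Python roughly as follows. The function `warm_cache`, the URL list and the stub fetcher are assumptions for illustration; the sketch also assumes the cache is filled by ordinary guest requests, so the crawler must not send any login cookies.

```python
from urllib.request import urlopen


def warm_cache(urls, fetch=None):
    """Request each URL once so a cached copy exists before real visitors arrive."""
    if fetch is None:
        def fetch(url):
            # Plain anonymous GET, i.e. a guest request.
            with urlopen(url, timeout=10) as response:
                return response.status
    results = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except OSError as error:  # URLError and timeouts derive from OSError
            results[url] = error
    return results


# Demonstration with a stub fetcher so that nothing is actually requested.
statuses = warm_cache(
    ["https://example.com/", "https://example.com/news/"],
    fetch=lambda url: 200,
)
```

Run from cron, something like this re-requests every listed page on each pass, whether or not anything changed.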

Why are you talking about resources at a time when hardware gets cheaper every day?

User experience is more valuable than resources.

One last thing: why should I care about cache expiration? The cache should be updated ("expired") only if its content has changed (and then cached again).


Why are you talking so much about context?

Because it's important. Things might not work exactly the same for logged-in users, let alone superusers, as they do for guest users. Permissions, for example, are one important consideration, and many sites also have separate login details, admin tools, etc. visible only when you're logged in. In other words: if you render a page for caching while logged in, that's a potential problem right there.

How much this really matters depends on the case, of course, but it's better to be safe than sorry, especially when we're dealing with assumptions that affect a) security, and b) a huge number of ProcessWire users out there.
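Why the render context matters can be illustrated with a short Python sketch in which the output differs between guests and logged-in users. The `render` function, the `GuestOnlyCache` class and the user names are invented for this example and do not reflect ProcessWire internals.

```python
def render(url, user):
    # Logged-in users get extra markup that guests must never see.
    html = f"<h1>{url}</h1>"
    if user != "guest":
        html += f"<a href='/admin/edit/'>Edit (logged in as {user})</a>"
    return html


class GuestOnlyCache:
    """Stores and serves cached pages for guest requests only."""

    def __init__(self):
        self._pages = {}

    def get(self, url, user):
        if user != "guest":
            # Never read from or write to the cache for logged-in users.
            return render(url, user)
        if url not in self._pages:
            self._pages[url] = render(url, user)
        return self._pages[url]


cache = GuestOnlyCache()
editor_view = cache.get("/news/", "editor")  # rendered fresh, not stored
guest_view = cache.get("/news/", "guest")    # rendered as guest and stored
```

If the copy rendered for the editor had been stored instead, every guest would have been served the edit link.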

The only purpose for me to cache before first load is to provide faster page loading for guests (who make up most of a site's visitors).

That's very important if you need to update site content many times a day (for example on news sites).

As long as a given site is even moderately popular, meaning that it gets anywhere between tens and hundreds of page views each day, one visitor waiting a couple of seconds longer for a given page to render (and that would already make it an *extremely* heavy page) makes little difference.

You need to consider the big picture here. While the "every visitor matters" approach is admirable, and very much true, when you put things in perspective, a single page load taking slightly longer isn't going to bring your site instantly crashing down or drive your users away for good.

*Don't succumb to the trap of over-optimisation.* Most likely there are a lot of other things you could do with that time that would benefit your site much more.

I thought ProCache had such a feature, but it doesn't. So I have to work hard on crontab tasks that crawl the entire site every minute to make page loads fast for EVERY user.

You need to put things in perspective here. In my humble opinion you're sacrificing a lot of time to do something that won't make a noticeable difference considering the big picture. If that makes you happy, then by all means do it, of course :)

Why are you talking about resources at a time when hardware gets cheaper every day?

Hardware may be cheaper now than ever, but it's still far from free, and while you can run a small site for relatively cheap, you need to consider scalability too. Large and/or more popular sites require more resources, which in turn means more costs.

So far I've never heard a client say "I don't care about costs at all". Quite the opposite, actually. In the end, one part of your job as a developer is keeping the costs low for the clients you work for.

User experience is more valuable than resources.

Again, you need to put things in perspective. User experience is extremely important, but resources matter too. Every choice you make has pros and cons, and you need to keep those in balance. Admittedly user experience is much harder to measure than resources and their costs, so the decisions are not always obvious.

One last thing: why should I care about cache expiration? The cache should be updated ("expired") only if its content has changed (and then cached again).

What you're saying here is roughly how it works, though especially in larger use cases caching everything on each page save could literally render the site unusable. At the very least it could mean that each save action is extremely slow, which is not a good thing either.

There's a lot more to cache expiration than this. ProcessWire makes it easy to use content from other pages, output lists of pages, etc.; these need to be considered when selecting which pages the cache is invalidated for. In many cases it's not nearly enough to just invalidate the cache for the page you've saved. One might also be pulling content from other sources, or outputting content that is time-sensitive (such as dates, events, etc.).

Hope this helped clarify things a bit!


I was going to answer, but Teppo did before. Good :)

I still would like to say that all your questions had quite obvious answers and that you asked them in quite a rough way, considering that you're the one using an excellent free tool, and asking these questions of the one who provides this tool to you for free (repetition intended) and takes the time to give such complete explanations to every question that comes up.

Anyway, if you're worried about guests, just make sure your editors are the first guests after saving a page, by telling them to do it.


...just make sure your editors are the first guests after saving a page, by telling them to do it.

This is exactly what I do on one site - I keep a different browser open and load the page as a guest. It helps to do this anyway, I think, as there can be times when things display fine for logged-in users but not for guests (usually when I put some silly code in and it breaks something ;)).


Thank you guys for your answers. I didn't want to be rough.

Let's imagine that you have:

  • latest articles on main page
  • list of related articles on several other pages
  • one full article page

If you update just one page (an article), it affects all the other connected pages. That's rather more than an issue for just one user.

Captain Earth: Saving a page is not a problem, because it doesn't happen that often and doesn't affect guest users (who make up most of the audience).

Hero Member: It's simpler for me to tell an editor not to save a page too often than to ask him to crawl all the pages affected by one edited article. I don't understand why you are talking about a free tool, because I needed to buy ProCache to use it.

Sorry, I didn't notice that this is not a ProCache thread.

