
Module: Most Viewed Pages Tracker (UpdMostViewed)



Hi everyone,

We have a new module for you: the UpdMostViewed module. It's an addition to your ProcessWire site that enables you to track page views and deliver a list of your most visited pages within a given time range. This data is particularly useful for creating frontend features like a "Most Read Articles of the Week" widget.

 

Installation
You can choose to:

  1. Head into your ProcessWire backend, go to Modules > New and search for the Module UpdMostViewed.
  2. Get the module directly from the latest releases on our GitHub repository.
  3. If you're using Composer for your project, you can add the module by running the command composer require update-switzerland/updmostviewed in your project root directory.

Once downloaded, you can install the module via the ProcessWire admin.

 

Configuration
The UpdMostViewed module provides a variety of configuration options. You can exclude certain branches, specific pages, or certain IPs, restrict counting to specific templates, define which user roles to count, ignore views from search engine crawlers, and more.

Moreover, you can also customize the time ranges for the "most viewed" data.

 

Links
For more detailed information on usage, updates, or to download the module, please visit our GitHub repository and check out the module in the Module Directory.

 

UpdMostViewedConfig.thumb.png.bb9340ac7f08d25430c5652d2631f90d.png


This is really great! I want to be able to shift some priority in featured content around websites based on visibility/exposure to visitors, like "featured pages" or blog posts and this info will help provide the information needed to build that more robustly. I've tinkered in the past with $page->meta() but this looks very robust. Thank you for putting this together!


So... I just installed it on a quite crowded site. Is there a way to see in the backend when there is enough data collected to show somewhere? Or should I just wait for a week or two?

Maybe it's me but I'm not sure what this means. Sure 1 day, 2 days, 3 days... but at the end?

image.thumb.png.08b338f3009d2fff6a686212f4f4e45c.png

Can't wait to see what's happening on my site and to show it.

Oh... what I already love is that I can put in different sections with different options like /blog/, /tutorials/, and /such/.

 

You guys worked on this for quite some time, right?


Hi @wbmnfktr thanks for your feedback!

7 hours ago, wbmnfktr said:

So... I just installed it on a quite crowded site. Is there a way to see in the backend when there is enough data collected to show somewhere? Or should I just wait for a week or two?

Maybe it's me but I'm not sure what this means. Sure 1 day, 2 days, 3 days... but at the end?

Here's briefly how it works with the default values of 1 day, 2 days and 3 days:

Think of the 1st timerange as plan A - it fetches the 10 most viewed pages in the last 24 hours. If for whatever reason there haven't been 10 pages viewed in the past 24 hours, we then move to plan B, which is the 2nd timerange. Here, we look into the past 48 hours for page views.

The 3rd timerange is plan C, in case we still need more articles to complete our list of "10 most viewed pages". So really, the 2nd and 3rd timeranges are like our safety nets to always make sure that we have a full list of 10 pages, regardless of viewer traffic.

But given your site is quite crowded, it's likely that the list will be filled up within the first 24 hours, so you may not ever even need the 2nd and 3rd timeranges.
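
The plan A / B / C idea can be sketched in plain PHP as follows. This is only an illustration of the fallback described above, not the module's actual code: the function name `mostViewedWithFallback()`, the shape of the `$views` records, and the rule of keeping pages from the narrower range ahead of pages added from a wider range are all assumptions.

```php
<?php
/**
 * Hypothetical helper: builds a "most viewed" list from raw view records,
 * widening the time range until the list is full.
 *
 * @param array $views           records like ['pageId' => 12, 'viewedAt' => 1718400000]
 * @param int[] $rangesInMinutes e.g. [1440, 2880, 4320] for 1, 2 and 3 days
 * @param int   $limit           wanted list length (10 in the explanation above)
 * @param int   $now             current Unix timestamp
 * @return int[] page IDs, most viewed first
 */
function mostViewedWithFallback(array $views, array $rangesInMinutes, int $limit, int $now): array
{
    $result = [];
    foreach ($rangesInMinutes as $minutes) {
        $since = $now - $minutes * 60;

        // Count the views per page inside the current time range.
        $counts = [];
        foreach ($views as $view) {
            if ($view['viewedAt'] >= $since && $view['viewedAt'] <= $now) {
                $counts[$view['pageId']] = ($counts[$view['pageId']] ?? 0) + 1;
            }
        }
        arsort($counts); // highest view count first

        // Append pages that are not in the list yet, stop once it is full.
        foreach (array_keys($counts) as $pageId) {
            if (!in_array($pageId, $result, true)) {
                $result[] = $pageId;
            }
            if (count($result) >= $limit) {
                return $result;
            }
        }
    }
    return $result; // may be shorter than $limit if even the widest range is too quiet
}

// Usage with made-up data: two pages viewed today, one page viewed 36 hours ago.
$now = time();
$views = [
    ['pageId' => 1, 'viewedAt' => $now - 3600],
    ['pageId' => 1, 'viewedAt' => $now - 7200],
    ['pageId' => 2, 'viewedAt' => $now - 600],
    ['pageId' => 3, 'viewedAt' => $now - 36 * 3600],
];
print_r(mostViewedWithFallback($views, [1440, 2880, 4320], 3, $now));
```

On a busy site the first loop iteration already fills the list, so the wider ranges are never consulted.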

Regarding the waiting time post-installation: it all depends on your timeranges. If you've set them as 1 day, 2 days, and 3 days, then ideally you'd wait for 3 days. But in practice, if your site has a good traffic volume, you may start seeing your top articles much sooner!
If you adjust your first timerange to be 7 days and 8/9 days as a fallback (10080, 11520 and 12960 minutes) you'd ideally wait for 7-9 days.
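
Since the time ranges are entered in minutes, the values above are just days multiplied by 1440. A tiny sketch of that arithmetic (the helper name `daysToMinutes()` is made up for this example):

```php
<?php
const MINUTES_PER_DAY = 24 * 60; // 1440

// Hypothetical helper: converts whole days into the minute value for a time range.
function daysToMinutes(int $days): int
{
    return $days * MINUTES_PER_DAY;
}

foreach ([7, 8, 9] as $days) {
    echo $days, ' days = ', daysToMinutes($days), " minutes\n";
}
// 7 days = 10080 minutes
// 8 days = 11520 minutes
// 9 days = 12960 minutes
```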

Does that make sense?
 

7 hours ago, wbmnfktr said:

Ok, not sure who to ask here but PageHitCounter access to its counter is logged here as 404 - which could be and maybe is correct. Yet I don't know how or if this will mess up MostViewed counts.

I've tidied up the way we log events to prevent any confusion. Just to clarify:
- 404 page-views don't contribute to the count and so, to avoid clutter, they've been removed from the logs, even while debugging.
- another reason for removing 404 from the logs: every missing image, file, etc triggers a 404 - which can and surely will eventually be a lot.
- the tracking of our module isn't affected by what PageHitCounter does, because it's triggering a 404 event that we ignore either way.

 

I've also updated the logs to include the reason each time a page-view gets passed over.
This means not only will you know which views have been tracked, but also which ones weren't tracked and exactly why.
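
Roughly, the logging behaviour described above could look like the sketch below. It is a stand-alone illustration, not the module's code: `decideView()`, the `$request` and `$config` keys, and the wording of the log lines are all assumptions.

```php
<?php
/**
 * Hypothetical decision helper. Returns [bool $track, ?string $logLine].
 * A null log line means "write nothing to the log".
 */
function decideView(array $request, array $config): array
{
    if ($request['status'] === 404) {
        return [false, null]; // 404s are ignored silently to keep the log clean
    }
    if (in_array($request['ip'], $config['excludedIps'], true)) {
        return [false, "skipped {$request['url']}: IP is excluded"];
    }
    if ($config['ignoreCrawlers'] && $request['isCrawler']) {
        return [false, "skipped {$request['url']}: crawler detected"];
    }
    if ($config['templates'] !== [] && !in_array($request['template'], $config['templates'], true)) {
        return [false, "skipped {$request['url']}: template not tracked"];
    }
    return [true, "tracked {$request['url']}"];
}

// Usage with made-up values.
$config = ['excludedIps' => ['10.0.0.1'], 'ignoreCrawlers' => true, 'templates' => ['blog-post']];
$request = ['status' => 200, 'ip' => '203.0.113.5', 'url' => '/blog/a/', 'template' => 'blog-post', 'isCrawler' => true];
[$track, $logLine] = decideView($request, $config);
if ($logLine !== null) {
    echo $logLine, "\n";
}
```

Returning the reason together with the decision keeps "why was this skipped" in one place, which is what makes the log readable.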

logging.png.8e2580fea2fc90d83ac313e1c0a47f9a.png

 

7 hours ago, wbmnfktr said:

You might want to add ScreamingFrog as a bot/crawler if possible.

Thanks for that - I've also updated the crawler check. I'm using the same method as PageHitCounter - this seems like a reliable approach.
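
For readers who have not seen this kind of check: crawler detection is commonly done by matching the User-Agent header against known keywords. The sketch below shows the general idea only. The function name and the short keyword list are assumptions and are not the list used by PageHitCounter or by this module.

```php
<?php
/**
 * Hypothetical crawler check based on User-Agent keywords.
 * The keyword list is a small illustrative sample, not a complete one.
 */
function looksLikeCrawler(string $userAgent): bool
{
    return preg_match('/bot|crawl|spider|slurp|screaming frog/i', $userAgent) === 1;
}

// Usage with a made-up User-Agent string.
var_dump(looksLikeCrawler('Mozilla/5.0 (compatible; ExampleBot/1.0)'));
```

Keyword matching is cheap, but it only catches crawlers that identify themselves, so some gap between such counts and other statistics is to be expected.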


I have updated the GitHub Repository with the most recent changes. You can check the releases section there to see them.
Alternatively, you should be able to update the module directly from the module configuration in your ProcessWire installation by clicking the link 'check for updates' right next to the module-version.

 

Hope that helped!
Cheers


10 hours ago, wbmnfktr said:

So... I just installed it on a quite crowded site. Is there a way to see in the backend when there is enough data collected to show somewhere? Or should I just wait for a week or two?

Maybe it's me but I'm not sure what this means. Sure 1 day, 2 days, 3 days... but at the end?

image.thumb.png.08b338f3009d2fff6a686212f4f4e45c.png

Can't wait to see what's happening on my site and to show it.

Oh... what I already love is that I can put in different sections with different options like /blog/, /tutorials/, and /such/.

 

You guys worked on this for quite some time, right?

It should work right away. You should see the tracked pages here: Menu "Setup" > "Most Viewed".


On 6/15/2024 at 8:24 AM, update AG said:

Here's briefly how it works with the default values of 1 day, 2 days and 3 days:
[...]

Does that make sense?

Absolutely, yes. It makes a lot more sense to me now.

The lists fill up nicely now. Forgot to delete my cache (ProCache) in-between but now everything looks good.

lcrlog.png.275f19656494bfbe476273152e09b511.png

The log is much cleaner now and ScreamingFrog gets detected. Nice!

And finally found the overview.

Awesome!


It's been a while since I installed your module and just kept it in place. Looking into the Most viewed tab today I got confused again.

The 24-hour tab is way too empty for my taste and is off compared to my analytics, which already only receives a small part of user views/impressions.

2024-06-23_22-03.thumb.png.bca96b223148477ce760e8dbf6225393.png

The 48-hour tab looks better but I still don't trust those numbers.

2024-06-23_22-04.thumb.png.c721c48bb4f112ffa3077b50ae136b83.png

 

What part am I missing here? Is it still my understanding of those numbers, or could there be something wrong in my setup for your module? Your module runs in two projects which are nearly identical in terms of setup and modules. Perfect for testing and playing around. So it's not critical at all, more a kind of feedback.

  • ProcessWire 3.0.240
  • PHP 8.2
  • ProCache (latest)
  • NO URL Hooks in place
  • On one instance: TemplateEngineTwig by @Wanze for templating and rendering pages

 


I'm a bit puzzled about what might have gone wrong here since your project settings appear correct.
The module is also compatible with ProCache enabled, so that shouldn't be an issue. Could you please share your Module-Config so I can analyze the problem further?


  • 2 weeks later...

Hey @update AG Markus,

thank you for this module. It fits my needs perfectly. Unfortunately, I have a problem with the assignment of permissions in the backend: the page doesn't show up when viewing as an "editor", see screenshots:

1478445584_2024-07-1109_02_51-Roles_editorschmidta2.vieregg.designMozillaFirefox.thumb.png.c46b429ff3109060d04ba3e1009eef19.png

I'm not able to view this page (/cms/setup/most-viewed/) with editor role:

22076513_2024-07-1109_21_38-SeitenProcessWireschmidta2.vieregg.designMozillaFirefoxPrivaterModus.thumb.png.50bc7849cce85ee9af319c8f41a71bc1.png

Any ideas?

Greets Jens alias DV-JF


On 6/27/2024 at 4:42 PM, wbmnfktr said:

Hi @wbmnfktr sorry for the late response - I thought I clicked on "Submit Reply"...

I've done some closer monitoring of the traffic on our website, where we're actively using this module. We do indeed count somewhat fewer views than our analytics shows.
But reading through our module's code, I can't see anything that would cause actual page views to go uncounted.

At this point I genuinely assume that the module simply filters out more crawlers than our analytics.

Did you figure something out on your end in the meantime?


Hi @DV-JF

Seems like we missed a line in the module config in ProcessUpdMostViewed.module. Fixed it in the most recent Commit 51ecafc. You can either download the ProcessUpdMostViewed.module file again, or add this line of code manually in the getModuleConfig function just below permissions yourself.

'permission' => self::PAGE_NAME,

It'll be fixed on the next version bump.
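
For anyone patching the file by hand, here is a stand-alone sketch of where the line sits. Only the 'permission' line and its position just below 'permissions' come from the fix above; the class name, the value of `PAGE_NAME`, the title, and the permission description are placeholders, so compare with the real ProcessUpdMostViewed.module before copying anything.

```php
<?php
// Stand-alone sketch. In the real module this array belongs to a ProcessWire Process class.
final class ProcessUpdMostViewedSketch
{
    // Placeholder value. The real constant is defined in ProcessUpdMostViewed.module.
    public const PAGE_NAME = 'most-viewed';

    public static function getModuleConfig(): array
    {
        return [
            'title' => 'Most Viewed (sketch)',
            // Permission that gets created so it can be assigned to roles.
            'permissions' => [
                self::PAGE_NAME => 'View the Most Viewed overview',
            ],
            // The previously missing line: the permission required to open the page.
            'permission' => self::PAGE_NAME,
        ];
    }
}

print_r(ProcessUpdMostViewedSketch::getModuleConfig());
```

In ProcessWire's module info, 'permissions' lists permissions to install and 'permission' names the one a user needs, which is why a role that holds the permission still could not open the page while the second key was missing.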

You might have to reinstall the module to get the permissions to work.
I haven't figured out yet if there is somehow an option to keep the database table when uninstalling the module to not lose any data on reinstalling the module.
Therefore you might want to export the table manually if you do not wish to lose page-view data.


Hope that helps!
Cheers


5 hours ago, update AG said:

Did you figure something out on your end in the meantime?

I looked into the stats once in a while and tried to figure out what's happening, but I still have no clue.

Just for yesterday (2024-07-10) the overall numbers are this:

  • MostViewed logs: 19-28 views tracked (only 3 detected bots and therefore no tracking)
  • MostViewed overview: 20 views listed
  • Google Search Console: 79 clicks
  • Analytics: 132 page views (107 unique visitors)

MostViewed's numbers are fine within itself. The 24 hour range is a bit blurry in the logs so I would say this is ok.

Looking into the 79 registered clicks in Google Search Console and comparing those with 107 unique visitors in my analytics does make absolute sense when looking into the sources/details which show 83 hits from Google overall and some other sources.

So it's not that the module catches too many bots or that the numbers are just slightly off. The views don't even reach the module. I thought about ProCache as an issue but you already cleared that. Cloudflare CDN shouldn't be a problem as well - that would filter out absolutely everything.

Here comes the fun part: MostViewed tracks my RSS feed, which isn't tracked/listed/shown in Google Search Console or web analytics and that number is 100% on point.

That RSS feed is not cached by ProCache, just standard cache for 1 hour, it's not linked anywhere in the frontend, it's just plain PHP and some RSS/XML output and it doesn't use any templating engine.

There are two last things that I can think of that might cause this:

  1. HTMX Preload Plugin - some weird AJAX content switching voodoo
  2. ProCache - still, as it bypasses almost everything and goes straight to static files

mv72.png

mv48.png

mv24.png

gsc.png

pirsch.png


On 7/11/2024 at 4:36 PM, wbmnfktr said:

There are two last things that I can think of that might cause this:

  1. HTMX Preload Plugin - some weird AJAX content switching voodoo
  2. ProCache - still, as it bypasses almost everything and goes straight to static files

So... yet another short update here.

2 sites - almost identical in terms of ProcessWire and overall setup - both have issues, only one uses HTMX for preload/prefetch, both use aggressive ProCache settings (cache lifetime 1 week+).

I'll rule out HTMX as the issue here.

I deleted the MostViewed-log file on both sites, after putting all pages in cache, only un-cached pages were registered anywhere.

Can you please double-check and verify on your own site that it still works on pages that are cached in ProCache?

A few hours later... [Sponge Bob - sequence voice]

1985373323_Screenshotfrom2024-07-1306-01-20.thumb.png.c6b7f32ce9277a56d2c872028e37b1c5.png

 


Interesting... I would've thought that HTMX could be an issue.

We have ProCache enabled for every page, but we still get page views.
When opening a private tab and clicking through our site I can see every click I made in the Dashboard. So I am genuinely confused on how it would not work on your end.

When automated page view tracking is enabled, we're writing the page view inside the module's ready function.
This should fire on every page load, as the module is being autoloaded.
Do you see an issue here?


7 hours ago, update AG said:

We have ProCache enabled for every page, but we still get page views.

What's your cache's lifespan? I use 604800=1 week for almost everything. The homepage clears after an hour, RSS isn't cached at all. Those two are the only ones that receive any counts at all.

7 hours ago, update AG said:

we're writing the page view inside the module's ready function

That only triggers when ProcessWire/the module actually boots up, so on the first uncached page load. After that, neither ProcessWire nor the module knows anything about page loads, due to ProCache magic accessing the cached versions.

In conclusion you might receive more counts but probably have a way lower lifespan set in ProCache or parts of your page's cache are more often invalidated.
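
To make that argument concrete, here is a toy model in plain PHP. It assumes that a cached page is served as a static file without booting PHP, which is the claim made above and not a confirmed description of ProCache internals. The function name and parameters are made up for the illustration.

```php
<?php
/**
 * Toy model: counts how often the application (and with it a ready() hook)
 * actually runs when a static cache sits in front of it.
 *
 * @param int[] $requestTimes  request times in seconds, ascending
 * @param int   $cacheLifetime cache lifetime in seconds
 */
function countedViews(array $requestTimes, int $cacheLifetime): int
{
    $counted = 0;
    $cachedUntil = null; // no cached copy yet

    foreach ($requestTimes as $time) {
        if ($cachedUntil !== null && $time < $cachedUntil) {
            continue; // served from the static file, PHP never runs
        }
        $counted++;                            // cache miss: the app boots and the view is tracked
        $cachedUntil = $time + $cacheLifetime; // the page is rendered and cached again
    }
    return $counted;
}

// One request every 10 minutes for a full day = 144 requests.
$requests = range(0, 86399, 600);
echo countedViews($requests, 604800), "\n"; // 1  (1 week lifetime)
echo countedViews($requests, 3600), "\n";   // 24 (1 hour lifetime)
echo countedViews($requests, 0), "\n";      // 144 (no caching)
```

Under this assumption the tracked count depends almost entirely on the cache lifetime and on how often the cache is cleared, not on the real traffic.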

I most often publish in chunks once a week so there isn't much happening in terms of deleting the cache.

 

7 hours ago, update AG said:

Interesting... I would've thought that HTMX could be an issue.

Was my first thought as well!


We're normally using the default lifespan of 1 hour (3600s). But even after bumping this up to 604800=1 week, we don't experience any changes.
I can reload the page as many times as I want - on my computer, phone, whatever - it still counts every single page view.

45 minutes ago, wbmnfktr said:

That only triggers when ProcessWire/the module actually boots up, so on the first uncached page load. After that, neither ProcessWire nor the module knows anything about page loads, due to ProCache magic accessing the cached versions.

Are you sure about this? Of course ProCache stores HTML versions of the cached page and serves these instead of re-rendering the page from PHP.
But wouldn't ProCache need to access the ProcessWire API to check if ProCache is even enabled or not? If that's the case, then the MostViewed module should be loaded as well, as we're accessing the ProcessWire API at the same stage as ProCache. At the ready function of an autoload module - which is executed immediately after the ProcessWire API is ready.

Or is ProCache exiting this process as soon as it realizes it has a cached page for the current request? If so, could I prioritize the order of autoloaded modules?
That would currently be my only other guess...

Though it's weird that it's working just fine on our end.


2 hours ago, update AG said:

I can reload the page as many times as I want - on my computer, phone, whatever - it still counts every single page view.

I bet it does but not on the homepage. I skipped through your website and the only page that is cached by ProCache, is the homepage. At least that's the only one I found.

uag-home.png.e58ae7c6c6f3291f29d09905783ef00c.png

uag-agentur.png.41d8678fb74a0b72d3e7b9e5765a248c.png

 

2 hours ago, update AG said:

Are you sure about this?

Pretty sure as ProCache bypasses all PHP. Just checked it with this.

public function init()
{
    if ($this->user->isLoggedIn()) return; // necessary as the backend would trigger this as well
    $this->wire->log->save('debug', "from init");
}

public function ready()
{
    if ($this->user->isLoggedIn()) return; // necessary as the backend would trigger this as well
    $this->wire->log->save('debug', "from ready");
}

 

2 hours ago, update AG said:

But wouldn't ProCache need to access the ProcessWire API to check if ProCache is even enabled or not? If that's the case, then the MostViewed module should be loaded as well, as we're accessing the ProcessWire API at the same stage as ProCache. At the ready function of an autoload module - which is executed immediately after the ProcessWire API is ready.

I don't know the details about cache maintenance in ProCache but it's more than the init/ready methods.

 

2 hours ago, update AG said:

Though it's weird that it's working just fine on our end.

It absolutely is.

 

And yes... you could prioritize hooks but that wouldn't help in this case - I guess:
https://processwire.com/docs/modules/hooks/#hook-priority

Edited by wbmnfktr
Added link to Hook docs.

That's even more confusing then. We have ProCache installed normally with no exceptions for pages.
But even if that's the case - shouldn't then at least the homepage be bugged for page view counting? Because we don't experience any difference compared to subpages. Every page load (or reload) on the homepage is being counted on our site.

At this point I'm seriously at a loss of ideas.
 

13 hours ago, wbmnfktr said:

And yes... you could prioritize hooks but that wouldn't help in this case - I guess:
https://processwire.com/docs/modules/hooks/#hook

I agree, I don't think that's the solution to our problem.

I'll continue to check if there's a workaround for your issue.

