Roych (Author), posted yesterday at 06:44 AM

7 hours ago, adrian said:
Sorry, I might start becoming a bit of an annoyance, but what do you think about removing analytics data that is added to every page's edit interface if the page isn't viewable (no template file, etc), because it will always show zeros.

There is already a setting in the module settings to disable this on every page. 😉

7 hours ago, adrian said:
I am seeing a LOT of bot traffic showing up in the stats - it's effectively making the data useless. ...

Already working on this, and will probably post an update later today. Working on aggressive bot detection: IP blocklist, 404 filter and some other updates. 😉

R
Roych (Author), posted yesterday at 07:55 AM

Just a small update about the next NativeAnalytics version, v1.0.24. A lot has already been done, but I'm still testing and polishing everything before I publish the full release.

One of the main things I've been working on is better bot and crawler filtering. NativeAnalytics should now do a much better job at filtering out AI crawlers, SEO bots, social preview bots, uptime monitors, common HTTP libraries and other automated requests, so the analytics data should be cleaner.

I also added support for the Matomo device-detector library, thanks to @adrian. The module can use a site-wide Composer installation if available, or fall back to the bundled version included with the module. There is now also a clearer status section in the module settings, so it is easier to see which detection method is currently being used.

404 handling has also been improved. The module now tries to avoid logging false 404s when the URL can actually be resolved by modules such as PagePathHistory, ProcessRedirects or Jumplinks. There are also two new cleanup buttons:

- Cleanup resolvable 404s
- Cleanup suspicious probes

Suspicious probes are now detected better as well. This includes common scanner URLs like WordPress/Joomla/Drupal/Magento/admin login probes, .env, .git, config files, shell upload attempts, path traversal attempts and similar noise.

I also added an optional URL/path filter, so if someone wants to exclude custom patterns from tracking, this can now be done directly from the module settings.

The module settings page has also been cleaned up and reorganized. It should now be much easier to understand, with clearer sections for tracking, filters, bot detection, privacy/consent, retention, reports and advanced settings.

There are also several smaller improvements around realtime visitors, IP blocking, cleanup tools, bundled library fallback handling and general admin/dashboard styling.

This is not the final release post yet, but the next upgrade is getting close and should be available soon.
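For readers wondering what "suspicious probe" detection can look like, here is a minimal sketch in JavaScript. It is not the module's actual rule set: the pattern list and the function name `isSuspiciousProbe` are assumptions made for illustration, covering only a few of the probe types named in the post above (WordPress logins, .env, .git, path traversal, shell uploads).

```javascript
// Hypothetical pattern list; NativeAnalytics' real rules are not shown in this thread.
const PROBE_PATTERNS = [
  /\/wp-(login|admin|includes)\b/i,        // WordPress login/admin probes
  /\/\.(env|git)(\/|$)/i,                  // exposed .env files and .git directories
  /\.\.(\/|%2f)/i,                         // path traversal, raw or percent-encoded
  /\/(administrator|phpmyadmin)(\/|$)/i,   // other common admin panels
  /\/(shell|cmd|c99)\.php$/i,              // typical uploaded web-shell names
];

// Returns true when the request path matches any known scanner pattern.
function isSuspiciousProbe(requestPath) {
  const pathOnly = String(requestPath).split('?')[0]; // ignore the query string
  return PROBE_PATTERNS.some((pattern) => pattern.test(pathOnly));
}

console.log(isSuspiciousProbe('/wp-login.php'));
console.log(isSuspiciousProbe('/contact/'));
```

A list like this will always produce some false positives (a blog post whose slug starts with "wp-admin", for example), which is presumably why a manual cleanup button and a custom path filter are useful alongside it.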
Roych (Author), posted yesterday at 12:39 PM

NativeAnalytics 1.0.24 is out

Major update focused on getting rid of bot noise that was inflating stats and skewing single-page rates.

What's new:

- Bundled matomo/device-detector: thousands of bot signatures, plus smart TV / mobile / console detection. No Composer required, but a site-wide Composer install is auto-detected and preferred.
- JavaScript-first tracking mode (new default for fresh installs): server-side recording is disabled by default and the JS tracker is the source of truth. Cuts bot traffic by 60–80% because most bots don't execute JS. Existing installs keep the legacy "Both" mode on upgrade; switch in module settings when ready.
- Stronger 404 / scanner filtering: IP-rate-limited 404 scanners, unidentifiable user-agents on 404s, and paths that resolve via PagePathHistory redirects are now filtered out of all three entry points (server-side, JS pageview, JS event).
- HTTP libraries treated as bots: curl, wget, python-requests, GuzzleHttp, RSS readers etc. are now correctly filtered.
- Expanded regex fallback with 2025/2026 AI scrapers (GPTBot, ClaudeBot, PerplexityBot, ByteSpider, Meta-ExternalAgent and ~25 more).
- GitHub-based update check for the bundled device-detector: the settings page shows when a newer release is available, with manual update instructions (24h cached, no auto-downloads for safety).
- Module settings page reorganized into 7 collapsible fieldsets with two-column layouts.

Upgrade is safe: your tracking mode stays at "Both" unless you change it. After upgrading I recommend switching to "JavaScript first" in module settings and watching the difference.

A note on the bundled matomo/device-detector: the version shipped with this module (6.4.2) is intentionally a bit behind the latest release. Bundled libraries are meant as a fallback to keep the module self-contained; pinning a known-tested version avoids surprises when matomo introduces new detection rules or minor API changes between releases. The module checks GitHub for newer matomo releases in the background and shows an "Update available" notice in the settings when one is out.

If you want the latest detection rules, the recommended path is to install matomo/device-detector site-wide via Composer (composer require matomo/device-detector). The module will automatically prefer the Composer copy over the bundled one. Alternatively, you can replace the contents of /site/modules/NativeAnalytics/lib/matomo-device-detector/ with a newer release from the matomo GitHub page (instructions are in the module settings under the version notice).

The release has been tested, but with this many changes some edge cases may still slip through. If you notice anything off, please let me know so we can fix it. 😉

Cheers
R
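To illustrate the "regex fallback" idea from the release notes, here is a minimal sketch of user-agent matching in JavaScript. The pattern contains only the names listed in the post, not the module's real list, and both `BOT_UA_PATTERN` and `looksLikeBot` are hypothetical names. Treating an empty user-agent as a bot is also an assumption, loosely modelled on the "unidentifiable user-agents" item above.

```javascript
// Hypothetical fallback pattern built from the names mentioned in the release notes.
const BOT_UA_PATTERN =
  /\b(?:GPTBot|ClaudeBot|PerplexityBot|Bytespider|meta-externalagent|curl|wget|python-requests|GuzzleHttp)\b/i;

// Returns true when the user-agent is missing or matches a known bot/library token.
function looksLikeBot(userAgent) {
  // Assumption: a missing or blank user-agent is treated as automated traffic.
  if (typeof userAgent !== 'string' || userAgent.trim() === '') return true;
  return BOT_UA_PATTERN.test(userAgent);
}

console.log(looksLikeBot('curl/8.4.0'));
console.log(looksLikeBot(
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
));
```

User-agent matching only catches clients that identify themselves honestly, so it would sit behind a signature library such as device-detector rather than replace it.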
adrian, posted 17 hours ago

Thanks so much @Roych - lots of great improvements there (I especially love the approach to loading matomo/device-detector). I went with installing via Composer to keep it up to date.

Unfortunately I was still getting a lot of bot traffic getting through. I've submitted a PR to deal with it: https://github.com/Roychgod/NativeAnalytics/pull/4 - it's made a huge difference to what I am seeing.

Thanks again.
adrian, posted 17 hours ago

Sorry, a little more follow-up:

1) I see the option for disabling the page edit stats widget, but I'd like to see it on pages that are viewable (and will actually have stats). I only want to disable it on pages that aren't viewable - PW often has a lot of these - think about the pages that drive a Page Reference field.

2) For anyone using Ryan's UserActivity module, you'll notice a conflict between them, which can result in the Analytics page not loading with these errors:

- ERR_CONNECTION_CLOSED on the document itself (DevTools shows 200 (OK) paradoxically)
- ERR_HTTP2_PROTOCOL_ERROR cascading across all sub-resources sharing the H2 connection
- Sometimes a 414 Request-URI Too Large flash

Here is my report to Ryan in case you want to fix it yourself now:

On any admin page that renders inline SVG with accessible <title> elements (in my case the NativeAnalytics dashboard, which has SVG line charts with one <title> per data point), reloading the page intermittently produces:

- ERR_CONNECTION_CLOSED on the document itself (DevTools shows 200 (OK) paradoxically)
- ERR_HTTP2_PROTOCOL_ERROR cascading across all sub-resources sharing the H2 connection
- Sometimes a 414 Request-URI Too Large flash

Disabling HTTP/2 in Chrome (--disable-http2) exposes the underlying cause: the UserActivity activity-tracking ping URL is hundreds of KB long and gets rejected by Apache/CloudFront. The 414 cascades as H2 protocol errors when running over H2.

Root cause

UserActivityAdmin.js line 145 (in version 9):

    '&title=' + encodeURIComponent(jQuery('title').text()) +

jQuery('title').text() matches every <title> element on the page, including those nested inside SVG graphics. jQuery concatenates their text contents into one string. On a page with many inline SVG <title> elements (e.g., chart tooltips for accessibility), the resulting title= parameter becomes hundreds of KB, blowing past Apache's LimitRequestLine (default 8190 bytes) and CloudFront's request-line cap.

Reproduction

Any admin template that renders inline SVG with <title> children will reproduce this. Minimal test:

    <svg width="100" height="100">
      <circle cx="50" cy="50" r="20">
        <title>Some hover label that will end up in the activity ping</title>
      </circle>
    </svg>

With enough such elements on a page, the activity ping URL exceeds request-line limits and starts failing intermittently (timing-dependent because UserActivity only fires the ping at intervals or on certain events).

Suggested fix

One-line change: replace the unscoped jQuery selector with document.title:

    - '&title=' + encodeURIComponent(jQuery('title').text()) +
    + '&title=' + encodeURIComponent(document.title) +

document.title is the standard DOM property for the page title. It only reflects the <head><title> content and ignores SVG <title> elements anywhere else in the document. Semantics are unchanged for the common case (head title), and the SVG title pollution disappears.

Alternative if you'd prefer to keep the jQuery style: $('head > title').text() is equally specific.

Why this matters more in 2026

Inline SVG with <title> for accessibility is now standard practice (recommended by WCAG and used heavily by chart libraries, ProcessWire dashboard modules like NativeAnalytics, etc.). Any PW site combining UserActivity with such a module will hit this.

Took me several hours of misdiagnosis (assumed it was a CloudFront/mod_http2 framing bug) before tracing back to the activity ping URL length.
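The size problem in the report can be checked with some quick arithmetic. The sketch below runs in plain Node without a DOM, so it only simulates what jQuery('title').text() returns (the text of every matched element concatenated). The ping path, the 500 sample labels and the helper names are assumptions for illustration, not UserActivity's actual code.

```javascript
const LIMIT_REQUEST_LINE = 8190; // Apache default cited in the report

// Simulates jQuery('title').text(): text of all matched <title> elements, concatenated.
function concatenatedTitleText(headTitle, svgTitles) {
  return headTitle + svgTitles.join('');
}

// Hypothetical ping URL builder; the path and parameter layout are assumptions.
function buildPingUrl(title) {
  return '/admin/activity-ping/?title=' + encodeURIComponent(title);
}

const headTitle = 'Analytics - ProcessWire';
// 500 chart data points, each with its own accessible <title> label.
const svgTitles = Array.from({ length: 500 }, (_, i) => `2026-01-01: ${i} views`);

const unscopedUrl = buildPingUrl(concatenatedTitleText(headTitle, svgTitles));
const scopedUrl = buildPingUrl(headTitle); // what document.title would yield

console.log('unscoped selector:', unscopedUrl.length, 'chars');
console.log('document.title:   ', scopedUrl.length, 'chars');
console.log('unscoped exceeds limit:', unscopedUrl.length > LIMIT_REQUEST_LINE);
```

Even short labels add up quickly because encodeURIComponent expands every space and colon to three characters, so a single chart with a few hundred points is enough to exceed the default request-line limit.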
Roych (Author), posted 7 hours ago

Thanks - all sorted in 1.0.25.

@adrian - merged your PR. The preg_match regex fix and the new behavioral bot filter are both in. Great approach catching the headless scrapers that slip past UA detection. Appreciate the clean diagnosis and patch. 😉

On the follow-ups:

- The page edit stats widget now only renders on viewable pages ($page->viewable()), so non-viewable pages like Page Reference sources no longer show it. The global disable option still works on top of that.
- UserActivity conflict fixed on my side: NativeAnalytics no longer emits a <title> per data point on the SVG charts, so it can't blow past the request-line limit anymore. Your diagnosis was spot on though; the real fix belongs in UserActivity (document.title instead of jQuery('title').text()), so it's worth getting that one-liner to Ryan.

Cheers
R
adrian, posted 35 minutes ago

Thanks @Roych - much appreciated. I have another couple of PRs for you:

- fix tab querystring so now you can reload or share the URL to a tab
- added another bot filter to better exclude bots probing for URLs that don't exist, with an option to still display them in the 404 panel if you want (for attack path visibility).
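For anyone curious how a shareable tab URL can work in general, here is a minimal sketch using the standard URL API. This is not the code from the PR: the parameter name `tab`, the default tab `overview` and both function names are assumptions.

```javascript
// Returns the given URL with the active tab stored in the query string.
// The parameter name 'tab' is an assumption for this sketch.
function urlForTab(currentUrl, tabId) {
  const url = new URL(currentUrl);
  url.searchParams.set('tab', tabId); // replaces an existing value, keeps other params
  return url.toString();
}

// Reads the active tab back from a URL, falling back to a default tab.
function tabFromUrl(currentUrl, fallback = 'overview') {
  return new URL(currentUrl).searchParams.get('tab') || fallback;
}

const shared = urlForTab('https://example.com/admin/analytics/?range=30d', 'pages');
console.log(shared);
console.log(tabFromUrl(shared));
```

In a browser, the result of urlForTab would typically be passed to history.replaceState when a tab is clicked, and tabFromUrl called on page load to restore the selection.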