
netcarver

PW-Moderators
  • Posts

    2,172
  • Joined

  • Last visited

  • Days Won

    44

Everything posted by netcarver

  1. @szabesz @AndZyk Drill down to per-user activity is now behind basic auth, and the main page table is shorter now.
  2. Hi @bernhard ReactPHP has been very solid for long running server processes in my experience. There's also Framework-X which is a nice layer on top, though I haven't used it much. And FrankenPHP - which I have never used, though it might allow long-running async stuff as well. Sorry to make you feel tracked, I guess things like this that simply snapshot publicly available ephemeral info and turn it into a timeline are somewhat borderline. If people are worried about this I'm happy to take this service down, or put it behind basic auth so only the forum mods have access. Basically, I want to get to the point where I can detect Join Events followed by new topic creation prior to it being published in order to get an early warning of possible spammers. This tool might be able to do that (time will tell.)
  3. And the moment I write it up - I get an error on the index page. Oh well, stress tests are useful. :)
  4. I've added a new sub-project to my PWGeeks site that turns the ephemeral activities shown in the PW forum's Online Users & All Activity pages into a live-updating, unified timeline of events: both what users are replying to and what they are viewing. Whilst it is not built with ProcessWire, it is related and was an interesting little project to build. I thought I'd post about it here in case it's of use to anyone else.

You can find it here: https://activity.pwgeeks.com/

Clicking on a username drills down into their activity, but more on that later.

Unlike the usual All Activity view on the forum, this integrates consumption activities (viewing stuff) with production activities (posting/replying). Unfortunately, due to a limitation on what the forum lets non-logged-in readers see, reactions to posts cannot be tracked at this time.

I initially thought this might be useful just to make the ephemeral viewing activity more obvious, but I'm now hoping that it can be turned into a tool to help forum moderators deal with spammers (new joiners who quickly start posting) more easily. But that's yet to be proven.

If you don't wish your read/write activities to be traceable, you can log in to the forum in anonymous mode.

Architecture

The architecture is split into a long-running ReactPHP Watcher Service, and the Index Generator code which creates the page you just viewed with the help of Caddy, PHP 8.2 and PHP-FPM. It all runs on a cheap Contabo VPS. Both Caddy and the watcher process are defined as systemd services that are automatically restarted if they fail, or if the box is rebooted.

Pusher is used to provide a Pub-Sub channel for immediate communication of changes found by the watcher service to anyone who is viewing the index pages, allowing the index to be updated as forum activity is detected by the watcher. Pusher takes care of any fan-out needed between the Watcher Service publishing the events and any browsers subscribed via Pusher-JS.
I used Pusher's free tier and created an "application" to get my channel and the needed credentials, which went in a .env file. I also turned on subscription counting on the channel within Pusher's dashboard to allow simple console logging of the number of clients connected to the channel at any one time.

All detected activity is also stored in a local SQLite DB which allows the Index Generator to build the initial table of activity shown in your browser. Once the page is loaded, JS events take over and continue populating the table in (almost) real time as they come in from Pusher.

The Watcher Service

ReactPHP is used to create a long-running server process, in PHP, that has run for more than 2 months at a time with rock-solid (no, that's not a bernhard module) memory use at 10MB once all the user data is loaded from the DB. I am sure this process would have run indefinitely, but I recently restarted the service to add detection of write-activities on the forum.

The major Composer packages used are:

  • pusher/pusher-php-server (to publish to my app's event channel)
  • vlucas/phpdotenv (to read the Pusher credentials from an .env file)
  • react/event-loop (to run the PHP app indefinitely)
  • fabpot/goutte (for scraping the 2 forum pages)
  • symfony/console (for CLI output formatting and logging)

I would probably choose a different scraping library if I were to do this again, but goutte works just fine for now.

The watcher only uses two public forum pages: the Online Users page, with the logged-in filter applied, and the All Activity page. It is worth noting that the service has no log-in details, so it sees these pages as a logged-out visitor to the forums would. That means significantly less information is available to it than to you if you view those pages when you are logged in to the forum.

All Activities Page Differences

When logged in, the All Activities page shows user reactions to posts (2 below).
These are not available to guests, so cannot be tracked by the Watcher Service. If you have purchased Pro modules and have access to some of the VIP forums, then new posts or replies to posts in those forums are also shown (1). Guests have no access to either of these, so VIP forum activities are not tracked.

The Watcher Service does not have any login credentials, so this is the view it sees...

All activities listed on this page have a UTC timestamp in their HTML attributes that can be used to record the actual time of joins, new topics and replies being posted. I don't bother recording any "user started following..." activities.

Online Users (logged in)

You might not be aware of this page on the forum, but it's how the Watcher Service can tell who's viewing forums & posts, creating new topics, or using the personal messaging service. If you have never visited this page on the forum, you want (1) Browse, then (2) Online Users, and then use the Filter-By dropdown to view logged-in users.

The activities listed (3) do not have a timestamp in the HTML, so the watcher limits itself to anything that happened "Just Now" that has not yet been recorded for that user, and uses the server's time() to record the event as having occurred.

When users view a VIP forum or post, or visit the All Activity page, the user list page does not show what the user is viewing. It just shows their activity as a blank string (see netcarver's activity in the above screenshot).

Other Limitations

The main loop of the service runs several times a minute and scrapes and de-duplicates activities from the forum. Any activity that happens on the forum between these samples is undetectable. So, if you visit a forum and then quickly click into a topic, and then back out, your activity will not be traceable.

The Event Loop

Using ReactPHP is conceptually quite simple, but there are a few things to keep note of. Here are the basics of the Watcher Service...
    <?php declare(strict_types=1);

    namespace Netcarver\ForumActivityMonitor;

    require_once 'vendor/autoload.php';

    use React\EventLoop\Factory;
    ...
    use Pusher\Pusher;
    use Dotenv\Dotenv;

    require_once __DIR__ . '/.format.php';  // output formatting helpers
    require_once __DIR__ . '/.storage.php'; // storage class

    $dotenv = Dotenv::createImmutable(__DIR__);
    $dotenv->load();

    // Create pusher publication connection
    $pusher = new Pusher(
        $_ENV['PUSHER_KEY'],
        ...
    );

    $sample_period_seconds = 20;
    $started_ts = time();

    $output = new ConsoleOutput();
    $output->write("\n>>> ProcessWire Forum Activity Monitor (sampling every $sample_period_seconds seconds) <<<\n\n");

    // Open the local SQLite DB for storage layer...
    $db = new \PDO('sqlite:path/to/database.sqlite');
    $storage = new Storage($db);

    $loop = Factory::create();

    // Fires every second; the forum is only sampled when the clock hits a multiple of the sample period.
    $loop->addPeriodicTimer(1, function () use (&$storage, $started_ts, $sample_period_seconds, &$pusher, &$output) {
        $now = time();
        $can_access_pw_forum = ($now % $sample_period_seconds === 0);
        $elapsed_time_seconds = $now - $started_ts;

        showStatusLine($output, $can_access_pw_forum, $storage->userCount(), $elapsed_time_seconds);

        if ($can_access_pw_forum) {
            $client = new Client();
            try {
                // Scrape, dedupe & store events
                ...

                // publish events via pusher...
                if (!empty($event_timeline)) {
                    ksort($event_timeline);

                    $table = new Table($output);
                    $table->setHeaders(['#', 'Time', 'UID', 'Username', 'Activity']);

                    $mem_use = memory_get_usage(true);
                    $runtime_str = formatElapsedTime($elapsed_time_seconds);
                    $pusher_events = [];

                    foreach ($event_timeline as $events_at_time) {
                        foreach ($events_at_time as $event) {
                            $uid = $event['uid'];
                            $user_activity_count = $storage->getUserActivityCount($uid);
                            $table->addRow([$user_activity_count, $event['time'], $uid, $event['user'], $event['activity']]);

                            $pusher_events[$event['time']][] = [
                                'uid'    => $uid,
                                'url'    => $event['url'],
                                'user'   => $event['user'],
                                'act'    => $event['activity'],
                                'mem'    => $mem_use,
                                'uptime' => $runtime_str,
                                'type'   => $event['type'],
                            ];
                        }
                    }

                    $pusher->trigger('activities', 'update', $pusher_events);
                    $table->render();
                    $output->write("\n");
                }
            } catch (\Throwable $e) {
                // Keep the loop alive on any PHP error or exception
                $output->write("\nCaught Throwable: ".$e->getMessage()."\n\n");
            }
        }
    });

    $loop->run();

Note that Pusher, the storage instance, and the console are all passed into the loop closure by reference so state can be maintained between each scheduled call to the loop function.

The loop closure uses a try {} catch (\Throwable) {} block to ensure it keeps running without systemd having to restart it in case of a PHP error. The catch block does occasionally run; so far, this has happened when DNS resolution fails while scraping the forum.

I've omitted the scraper and de-dupe code from the above as they are still a work in progress, but they populate an $event_timeline array if anything new is detected.

The unified array of events (if any) is published via Pusher and each entry includes information about the Service uptime and memory usage. The Index page simply console logs this meta-data from the first event in the array of activities it receives, so you can use your browser's console to track these (along with the number of subscribers to the channel)...
I made the event loop run every second, even if it's not time to sample the forum, so the CLI output can be updated regularly. This was especially useful when initially running the Watcher from the command line, and I could probably drop the status updates now that things are run via systemd. Running from the CLI on the server, or tailing the log file, gives a nicely formatted table of events as they occur thanks to Symfony's console library and table helper.

Systemd Integration

To allow automatic restart of the watcher when the VPS is restarted, I added this service definition file at /etc/systemd/system/forumwatch.service, which runs the watcher as an unprivileged user...

    [Unit]
    Description=ReactPHP Processwire Forum Watcher

    [Service]
    ExecStart=/usr/bin/php8.2 /home/pwfw/activity.pwgeeks.com/watcher/react.php
    WorkingDirectory=/home/pwfw/activity.pwgeeks.com/watcher/
    StandardOutput=file:/var/log/pwforumwatch.log
    StandardError=file:/var/log/pwforumwatch.log
    Restart=always
    User=pwfw
    Group=pwfw

    [Install]
    WantedBy=multi-user.target

A quick

    sudo systemctl enable forumwatch.service && sudo systemctl start forumwatch.service

is then all that's needed to get things running. As the output is logged to /var/log/pwforumwatch.log, I also gave it a logrotate.conf file to keep things under control.

The Index Generator

This is a single index.php file that takes care of reading the most recent user activities from the SQLite DB and generating the table of events from it for that point in time. JS is included (pusher-js) that subscribes to the application's activity channel. Pusher's free tier allows up to 100 simultaneous connections to the event activity channel, and you can see how many users are connected via your browser's console.

NB: The following feature is now behind a basic auth login.

If you click on a user's name, you'll reload the page with a filter that lists only that user's activities. Here are Bernhard's, as he has posted recently as I write this up.
Click on the User's name or avatar (1) to be taken to their page on the forum, or click on the "Everyone" or PW logo (2) to be taken back to the all-inclusive index.

Trying It Out

I recommend opening the activity tracker in one browser on one side of your screen, then opening the Forum (and logging in) in another browser on the other side. As you (and anyone else) visit pages on the forum, you should see things update in the tracker. Also open up the console on the tracker page to see uptime/memory and viewing user count data as they come in.

If you read this far, thank you for your time, and I hope this was of some interest or use to you.
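The scraper and de-dupe code are omitted from the post, so the following is only a rough sketch of how the "Just Now" rule described in the Online Users section might look. The function name collectNewEvents, the shape of the scraped rows and the $last_seen map are all assumptions made for illustration; this is not the Watcher Service's actual code.

```php
<?php
/**
 * Hypothetical sketch: keep only "Just Now" rows whose activity has not yet
 * been recorded for that user, and stamp them with the server's time.
 *
 * $rows      - scraped rows, assumed shape: ['uid', 'user', 'activity', 'when']
 * $last_seen - uid => last recorded activity string (updated in place)
 * $now       - server timestamp used for events that carry no time of their own
 */
function collectNewEvents(array $rows, array &$last_seen, int $now): array
{
    $event_timeline = [];

    foreach ($rows as $row) {
        // The Online Users page has no timestamps, so only trust "Just Now" rows.
        if (strcasecmp($row['when'], 'Just Now') !== 0) {
            continue;
        }

        // Skip anything already recorded for this user in an earlier sample.
        $uid = $row['uid'];
        if (($last_seen[$uid] ?? null) === $row['activity']) {
            continue;
        }
        $last_seen[$uid] = $row['activity'];

        // Group events by timestamp, as the published loop code expects.
        $event_timeline[$now][] = [
            'uid'      => $uid,
            'user'     => $row['user'],
            'activity' => $row['activity'],
            'time'     => $now,
        ];
    }

    return $event_timeline;
}

// Two consecutive samples of the same (made-up) page content.
$seen = [];
$rows = [
    ['uid' => 1, 'user' => 'alice', 'activity' => 'Viewing Topic: Example', 'when' => 'Just Now'],
    ['uid' => 2, 'user' => 'bob',   'activity' => 'Viewing Forum: Example', 'when' => '5 minutes ago'],
];
$first  = collectNewEvents($rows, $seen, 1700000000);
$second = collectNewEvents($rows, $seen, 1700000020);
```

In this sketch the first sample yields one event (alice's) and the second yields none, because nothing has changed for her since the previous sample. Passing $last_seen by reference mirrors the way the post keeps state between timer calls.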
  5. Nice site and write-up - thanks, @bernhard!
  6. Another option would be a 1 minute cron job on a VPS. It could call a script (bash or PHP maybe) that curls through to your API provider and sends you an email/SMS if it fails. That would give you more flexibility to test POST requests.
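A minimal PHP sketch of such a check script, under stated assumptions: the function names checkEndpoint and curlStatus, the example URL and the mail address are all hypothetical, and the fetch and notify steps are passed in as callables so the logic can be tried without network access or a mail setup.

```php
<?php
/**
 * Probe an endpoint and call $notify with a message if it looks unhealthy.
 * $fetch receives the URL and must return an HTTP status code (0 on failure).
 */
function checkEndpoint(string $url, callable $fetch, callable $notify): bool
{
    try {
        $status = (int) $fetch($url);
    } catch (\Throwable $e) {
        $status = 0; // treat any exception as "down"
    }

    $ok = $status >= 200 && $status < 300;
    if (!$ok) {
        $notify("API check failed for $url (HTTP status: $status)");
    }
    return $ok;
}

/**
 * One possible $fetch: a cURL request that returns the HTTP status code.
 * A non-empty $post_fields array turns the request into a POST.
 */
function curlStatus(string $url, array $post_fields = []): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10,
    ]);
    if ($post_fields !== []) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post_fields));
    }
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
}

// Example wiring (commented out so the sketch has no side effects):
// checkEndpoint(
//     'https://api.example.com/health',                           // hypothetical URL
//     fn (string $u) => curlStatus($u, ['ping' => '1']),          // POST probe
//     fn (string $m) => mail('you@example.com', 'API down', $m)   // hypothetical address
// );
```

A crontab entry along the lines of `* * * * * /usr/bin/php /path/to/check.php` (path hypothetical) would then run it every minute.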
  7. Hello @markus_blue_tomato If it's only GET requests you need to monitor, you could use Uptime Robot's free tier - it issues a request to the target address every 5 minutes and emails you if it is down. I've used it for years and it has been very reliable.
  8. Is that useragent string a constant across all requests, or just the "amazonbot" part? Either way, if you have access to WireRequestBlocker from the ProDevTools package, you could target the useragent with a block rule. If you don't have access to WireRequestBlocker, then an .htaccess rule could be used to reject those requests. There are some examples here.
  9. Just to add a caveat to blocking vendor/ - this sometimes leads to unservable asset issues (such as JS/fonts/images/styles) if the composer package includes such things. YMMV with this approach, but the inspector in the browser can help tracking these down.
  10. Hi @HakeshDigital Does it work if you remove, or comment out, the indicated line from your .htaccess file...

    RewriteCond %{REQUEST_URI} ^(.*)/([a-z]{2})$
    RewriteRule ^(.*/)(es|en|it|pt)$ $1$2/ [R=301,L]
    RewriteRule ^(.*/)es$ ^(.*/)es/ [R=301] ## <<< Remove this line, or comment out with # at the start.

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule (.+(?:es|en|it|pt))$ /$1/ [L,R=301]

    RewriteCond %{REQUEST_URI} ^/([a-z]{2})/
    RewriteRule ^ - [E=LANG:%1]

    RewriteCond %{ENV:LANG} !^$
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_URI} !(.*)/$
    RewriteRule ^(.*)$ /$1/ [L,R=301]

...?
  11. This...

    $d = strtotime('next monday +7 days');
    echo date('Y-m-d', $d);

Seems to work for me.
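To make the behaviour easier to check, here is a small sketch that pins the base date instead of using the current time. The helper name nextMondayPlusWeek and the chosen base date are just for illustration; the timezone is fixed to UTC so the result does not depend on server settings.

```php
<?php
date_default_timezone_set('UTC');

// Resolve "next monday +7 days" relative to a given base timestamp.
function nextMondayPlusWeek(int $base): string
{
    return date('Y-m-d', strtotime('next monday +7 days', $base));
}

// 2024-01-03 was a Wednesday: the next Monday is 2024-01-08, plus 7 days.
$base = strtotime('2024-01-03 12:00:00');
echo nextMondayPlusWeek($base), "\n"; // prints 2024-01-15
```

Passing the base timestamp as strtotime()'s second argument is what makes the result reproducible; without it the expression is evaluated relative to "now", as in the snippet above.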
  12. Thank you for this @TomPich always good to see people giving back to the community, much appreciated!
  13. I've not tried this one, but perhaps @bernhard's RockSkinUiKit module can do it?
  14. Interesting concept. I'm with Bernhard when he suggested making the help text fixed in the middle. Half of the time the instructions are upside down (being in the bottom half of the screen) and moving - so hard to read unless you wait for it to make it back up. Could also do with some more contrast for those of us with slightly compromised sight. Thanks for sharing though.
  15. @cwsoft How's this? https://github.com/processwire/processwire/blob/3cc76cc886a49313b4bfb9a1a904bd88d11b7cb7/wire/core/Password.php#L115
  16. I'm not sure of the context of your call to truncate(), but perhaps you could run the sanitizer on the output of strip_tags()?
  17. Would there be any merit in submitting your changes as a PR against the original, maybe with some suitable config settings to allow control of target templates/roles etc?
  18. Hi @Greg Lumley That's correct - you need to use "db" as the DBHost in the site/config.php file for ddev because MySQL/Maria is running in a separate container called "db" in the ddev stack. Both the webserver container and database container are attached to a dedicated virtual network when you bring up your stack. Docker's DNS resolver automatically makes the internal network IP of the database server available to other containers on the same virtual network using the container name - hence the need for "db" in the config file. In docker stacks where both the webserver and MySQL/Maria DB were running in the same container then you could use "127.0.0.1" or "localhost" as the DBHost - but that's not the case for ddev.
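As a sketch, the database block of site/config.php for a ddev project could look something like this. It assumes ddev's default database name, user and password (all "db"); adjust them if your project overrides those. The stand-in $config object on the first line is only there so the snippet runs on its own, since ProcessWire provides $config itself.

```php
<?php
// Stand-in for ProcessWire's $config so this excerpt runs standalone; omit in a real site/config.php.
if (!isset($config)) $config = new stdClass();

// "db" is the database container's name on ddev's internal Docker network.
$config->dbHost = 'db';
$config->dbName = 'db';   // ddev default (assumed)
$config->dbUser = 'db';   // ddev default (assumed)
$config->dbPass = 'db';   // ddev default (assumed)
$config->dbPort = '3306'; // port inside the ddev network
```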
  19. @SIERRA How are you processing the admin page submission in order to populate your SMS template and send it? Presumably you are using hooks, right? Take a look at AdminCustomFiles for inserting custom JS into admin pages.
  20. @hintraeger Have you taken a look at the pagefileSecure config option? It might not be suitable for you, but for file storage outside the site root, how about @Wanze's FieldtypeSecureFile module? I also remember there being a very old post from @Soma here in the forum about this topic.
  21. Just change this line...

    function visit(Page $parent, $enter, $exit=null)

To this...

    function visit($parent, $enter, $exit=null)
  22. Does this work... echo $page->lorum_ipsum->getLabel(); ? Also, take a look at Bernhard's post here...
  23. Hi @kathep Thanks for the detailed report. I've not used Yunohost, so I'd want to find out if Yunohost is simply using nginx as a proxy to your processwire app stack, and if the app itself is being served by apache2?