
FireWire

Members
  • Posts

    168
  • Joined

  • Last visited

  • Days Won

    3

FireWire last won the day on November 30 2021

FireWire had the most liked content!

About FireWire

  • Birthday January 1

Contact Methods

  • Website URL
    https://www.skylundy.io

Profile Information

  • Location
    California
  • Interests
    Writing code. Writing more code. Refactoring. Writing code.

Recent Profile Visitors

2,291 profile views

FireWire's Achievements

Sr. Member

Sr. Member (5/6)

171

Reputation

  1. The files you work with will always be on your local machine. And I see no foolishness here :) Working inside the container itself (using the shell script in the Devilbox directory) shouldn't be needed so no worries there.

     Some recommendations (and me thinking out loud):

     • If you're getting a 500 error with a page title before failure then PHP is booting properly. I would start with ProcessWire's logs first in case it's able to catch that error when it executes. Could be more info there.
     • Double check that $config->debug is set to true in config.php to see if you can squeeze out any more information from your 500.
     • Check your .env config to make sure you're running either PHP 7.4 or 8.0 (depending on your PW version) just in case Devilbox defaults to PHP 8.1, which PW isn't fully compatible with yet AFAIK. I think it might do that IIRC.

     Knock out those items and we can troubleshoot from there if it doesn't fix the problem.
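The PHP version check can be done from the terminal. This is a minimal sketch, assuming the Devilbox `.env` file selects PHP with an uncommented `PHP_SERVER=<version>` line (as in Devilbox's `env-example`); the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Print the PHP version selected in a Devilbox .env file.
# Assumption: the version is set by an uncommented PHP_SERVER=<version> line.
# If several uncommented lines exist, this prints the last one.
devilbox_php_version() {
  local env_file="${1:-.env}"
  grep -E '^PHP_SERVER=' "$env_file" | tail -n 1 | cut -d '=' -f 2
}

# Usage (run from the Devilbox directory):
#   devilbox_php_version .env
```

If this prints 8.1, switching the line to 7.4 or 8.0 and restarting the containers is the change the post suggests trying.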
  2. Nice! Very awesome seeing the concept worked on and a great contribution to the ecosystem. Working with time bends my brain, you have my respect and admiration haha. Looking forward to seeing your work. Haven't used Alpine yet personally but it looks fantastic.
  3. A couple months ago I reached out to the original author of Recurme and asked about the possibility of open sourcing the module and taking a role as maintainer. I've used this module previously and think a good calendar module is critical for the PW ecosystem. Had a great exchange with him and he was open and willing to turn over any assets needed and allow the code to be open sourced with no restrictions. As far as I know the module is still largely usable with the tweaks people have mentioned in this thread but I can't speak to more than that. I started working on refactoring the code and have a few ideas on how to improve/revamp it. I'm still enthusiastic about the idea of keeping development alive but unfortunately I just don't have the time right now to take it on with how busy I am with non-programming stuff as well as my commitment to getting the next version of my Fluency module released. I'd love to see this open sourced and would count myself as a contributor or help out where I can, but being a maintainer/lead is outside of my abilities right now. Hope there is some interest or capacity out there to make this happen.
  4. What the hell are "outdoors" and "offline time"? I've never heard of this.
  5. Back with more! Prepare for incoming wall of text...

     I mentioned adding custom directives to our .htaccess file and wanted to share some more detail on that as well as some other tips. I was reviewing our 404s as a matter of maintenance, so to speak, to ensure that we had redirects in place as necessary. While reviewing that I found a lot (a lot) of hits that were bogus, clearly bots and even web crawlers for engines we have no interest in being listed on. In just 48 hours we had 700 total 404s, and I imagine on some websites that number could be higher.

     By analyzing that log and writing custom directives I was able to take 700 404s logged by ProcessWire down to 200 which are "legitimate" in that it's traffic that needs to be redirected to a proper destination page.

     I'm sharing my additional directives here as an example. Again, ANY bot/security directives should be at the very top of your .htaccess file. As always, test test test, and modify for your use case.

         # Declare this at the top of your .htaccess file and remove or comment out all other instances of this directive elsewhere
         RewriteEngine On

         # Block known bad URLs
         # Directories including sub-directories
         RedirectMatch 404 "\/(wp-includes|wp-admin|wp-content|wordpress|wp|xxxss|cms|ALFA_DATA|functionRouter|rss|feed|feeds|TKVNP|QXXLZ|data\/admin)"

         # Top level directories only - There are no assets served from these directories in root, only from /site/assets & /site/templates
         RedirectMatch 404 "^/(js|scripts|css|styles|img|images|e|video|media|shwtv|assets|files|123|tvshowbiz)\/"

         # Explicit file matching
         RedirectMatch 404 "(1index|s_e|s_ne|media-admin|xmlrpc|trafficbot|FileZilla|app-ads|beence|defau1t|legion|system_log|olux|doc)\.(php|xml|life|txt)$"

         # Additional filetypes & extensions
         RedirectMatch 404 "(\.bak|inc\.)"

         # Additional User Agent blocking not present in 7G Firewall
         <IfModule mod_rewrite.c>
           # Chinese crawlers that cause significant traffic to bad URLs
           RewriteCond %{HTTP_USER_AGENT} Mb2345Browser|LieBaoFast|zh-CN|MicroMessenger|zh_CN|Kinza|Datanyze|serpstatbot|spaziodati|OPPO\sA33|AspiegelBot|aspiegel|PetalBot [NC]
           RewriteRule .* - [F,L]
         </IfModule>

     Details on this additional config:

     • It blocks some WP requests that get past 7G.
     • My added directives redirect to a 404 which tells the bot that it flat out doesn't exist rather than 403 forbidden which could indicate it may exist. I read somewhere that this is more likely to get cached as a URL not to be revisited (wish I could remember the source, it's not a major issue).
     • Blocks a lot of very specific URLs/files we were seeing.
     • Blocks Chinese search engine bots, because we don't operate in China. These amounted to a lot of traffic.
     • Blocks common dev files like .bak and .inc.* which aren't protected by default. Obvs you want to eliminate .bak altogether in production, but added safety fallback.
     • I have not seen this cause any issues in the Admin. Also consider if directives could cause problems in another language.
     • Customize by reviewing your logs.

     Additional measures

     7G and the directives I created are a healthy amount of prevention of malicious traffic. Another resource I use is a Bad Bot gist that blocks numerous crawlers that add traffic to your site but may or may not generate 400s-500s HTTP statuses. This expands on 7G's basic list.

     Bad Bot recommendations:

     • Comment out: SetEnvIfNoCase User-Agent "^AdsBot-Google.*" bad_bot
       There's not really a good reason to block a specific Google bot.
     • If you make Curl requests to your server then comment this line out: SetEnvIfNoCase User-Agent "^Curl.*" bad_bot
       Reason: this will block all Curl requests to your server, including those by your own code. Be sure that you don't need Curl available if leaving this active. This is included in the list to prevent some types of website scrapers. If you want to leave this active and still need to use Curl, then consider changing your User Agent.
     • Comment out: SetEnvIfNoCase User-Agent "^Mediapartners-Google.*" bad_bot
       Again, not necessary to block Google's bots, might even be a bad idea for SEO or exposure (only they know, right?).

     Testing

     There's no such thing as too much testing. These directives are powerful and while written well, may have edge cases (like 'null' mentioned previously). There's no replacement for manual testing, specifically it would be a good idea to test any marketing UTMs or URLs with GET strings you may have out there just in case.

     For automated testing I use broken-link-checker which can be called from the terminal or as a JS module. I prefer this method to using some random site scanning service. This will detect both 404s and 403s by scanning every link on your page and getting a response, which is useful for ensuring that your existing URLs have not been affected by your .htaccess directives.

     broken-link-checker recommendations:

     • Consider rate limiting your requests using the --requests flag to set the number of concurrent requests. If you don't you could run into rate limits that your managed hosting company, CDN, or you (if you're like me) have built into your own server. This terminal app runs fast so if you have a lot of links or pages those requests can stack up quickly.
     • Consider using the -e flag, at least initially while testing your directives. This excludes external URLs which will help your test complete faster and prevent any false positives if you have broken external links (which you can handle separately).
     • Consider using the -g flag which switches the request to GET which is what browsers do.

     Shortcut, just copy and paste my command:

         blc https://www.yoursite.com -roegv --requests 5

     If you have access to your Apache access log via a bash/terminal instance then you may consider watching that file for new 404/403 entries for a little bit.
     You can do this by navigating to the directory with your access log and executing the following command (switch out the name of your log as needed):

         tail -f apache.access.log | grep "404 "

     You may consider also checking for 403s by changing out the HTTP status in that command.

     "This seems excessive"

     I think this is good for every site and once you get it dialed in to your needs it can be replicated to others. There's no downside to increasing the security and performance of your hosting server. Consider that any undesirable traffic you block frees up resources for good traffic, and of course reduces your attack surface.

     If you need to think about scalability then this becomes even more important. The company I work for is looking to expand into 2 additional regions and I'd prefer my server was ready for it! If you get into high traffic circumstances then blocking this traffic may prevent you from needing to "throw money at the problem" by upgrading server specs if your server is running slower.

     Outside of that, it's just cool knowing that you have a deeper understanding of how this works and knowing you've expanded your developer expertise further.

     This isn't meant to be an exhaustive guide but I hope I've helped some people get some extra knowledge and save everyone a few hours on Google looking this up. If I've missed anything or presented inaccurate/incomplete information please let me know and I will update this comment to make it better.
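The log review described in this post (finding which bogus URLs generate the most 404s before writing directives for them) can be sketched in a few lines of shell. This assumes an Apache access log in the common/combined format, where the request path is the 7th whitespace-separated field and the status code is the 9th; the function name and defaults are hypothetical.

```shell
#!/usr/bin/env bash
# List the most frequently requested paths that returned a given status.
# Assumption: Apache common/combined log format (path = field 7, status = field 9).
# Arguments: log file, status code (default 404), number of rows (default 20).
top_paths_for_status() {
  local log_file="$1" status="${2:-404}" limit="${3:-20}"
  awk -v status="$status" '$9 == status { print $7 }' "$log_file" \
    | sort | uniq -c | sort -rn | head -n "$limit"
}

# Usage:
#   top_paths_for_status apache.access.log 404 50
#   top_paths_for_status apache.access.log 403
```

Each output row is a hit count followed by the path, so the paths at the top are the best candidates for a RedirectMatch rule or a legitimate redirect.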
  6. So @horst is correct about ProCache. ProCache renders HTML files to disk and uses some pretty clever directives in the .htaccess file to detect a URL that has a corresponding page on-disk. If one exists then that HTML file is returned and the PHP interpreter never boots. We're currently running 7G and ProCache on the current site at our company. Unless you are configuring specific caching directives for HTML files in your .htaccess then you shouldn't see any problems. If you plan to use 7G or any bot blocking, it should be at the very top of your .htaccess so that its directives are parsed before the rest that are there to serve legitimate traffic. The sooner the bots/malicious requests are deflected, the less impact on your server. While I've used 7G on many websites in production with no problems, it's important to test as noted by @nbcommunication in the comment above. He mentioned URLs with "null". I've never seen any issues pop up with that since "null" isn't too common in English, and I keep the directive unchanged because it is there to detect malicious GET strings and similar undesirable things. 7G blocks a good deal of WP related requests, however I wrote some additional directives that addressed requests that weren't caught. I'll share some of the additional customizations I've made that further filtered out traffic based on our 404s and web crawlers that we don't care for. If you're willing to get your hands dirty and write some custom htaccess directives you can dial it in even more.
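One way to spot-check that firewall directives are not blocking legitimate URLs (for example ones containing "null") is to request them and print the status codes. This is a rough sketch; the function name and the example URLs are hypothetical, and the `CURL` variable exists only so the command can be swapped out.

```shell
#!/usr/bin/env bash
# Print "<status> <url>" for every URL passed as an argument.
# CURL may be overridden (e.g. for testing); it defaults to the real curl.
check_urls() {
  local url code
  for url in "$@"; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code
    code="$("${CURL:-curl}" -s -o /dev/null -w '%{http_code}' "$url")"
    printf '%s %s\n' "$code" "$url"
  done
}

# Usage (hypothetical URLs):
#   check_urls "https://www.yoursite.com/nullable-page/" "https://www.yoursite.com/?q=null"
```

A 403 on a URL that should be public suggests one of the rules is too broad. If the server also blocks curl's default user agent, every result will be a 403; in that case pass a different agent with curl's `-A` option.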
  7. Dangit. Sorry to hear that. The inline CKEditor thing is on my list but won't be added until the next version which I'm working on right now. The way that field renders is much different from other fields. I'll be sure to add that to the priority list to see what can be done.
  8. I haven't experienced that. I don't often use tables and can't remember if I've seen that. Fluency wouldn't be able to change how a field is rendered.
  9. Translating only for missing content should be good for performance. I do think that there are some items to consider.

     • If this is used in a template when the page loads and caching is used, then this function isn't guaranteed to execute since a pre-rendered HTML document would be returned to the browser. Caching is a good idea for performance, so this would create a situation where you can't use caching and that would be a performance hit.
     • If the content has been changed but you're only checking for the existence of translated text then it wouldn't re-translate. I have a solution for this but it would take a little extra code (detailed below).
     • There is a possibility that using the function you wrote could mean additional calls to the database. Someone with a little more knowledge of the ProcessWire core could correct me if I'm wrong. If that's the case then there could be a performance hit that depends on how performant your DB and DB connection are.
     • If you are running this loop on page load then the performance hit would really come from the delay in response from DeepL. Translation can take a couple of seconds in some cases and that will slow your page load time down a lot. This would only happen as long as something needs to be translated. If not then it will skip over the field and the page will load.

     As for the translation return value: the module returns a passthrough payload directly from the DeepL API (I didn't develop the return data structure). This is good because it is predictable and unchanged from DeepL documentation, and it makes sense when you consider the ability to translate multiple separate texts at once. Take some time to review the README.md file in the Fluency module directory, it has documentation of the return data structure and details on using the module directly. I'll make a note to let module users know that information is there for review.
     Solution for tracking changed content:

     One way to verify content is to use hashing and WireCache. I am developing the next version of Fluency that will have a solution for this but in the meantime here is a modified solution that may fit your use case:

         <?php

         /**
          * Analyzes the content in a field and determines if it has changed since the last time it was checked
          * @param  Page   $thisPage  Page containing field to check
          * @param  string $fieldName Name of the field to check content for
          * @param  string $lang      Name of language to check content for, default is PW's default language
          * @return bool|null         Bool for content change, null if field doesn't exist on page
          */
         function fieldChangedSinceCheck(Page $thisPage, string $fieldName, string $lang = 'default'): ?bool {
           if (!$thisPage->hasField($fieldName)) {
             return null;
           }

           $thisPage->of(false);

           $pageField = $thisPage->fields($fieldName);
           $isMultilanguageField = $pageField->type instanceof FieldtypeLanguageInterface;

           // Handle multi-language field
           if ($isMultilanguageField) {
             // Get the language, content in that language, and create a unique tracking ID
             $language = wire('languages')->get($lang);
             $current_field_content = $thisPage->$pageField->getLanguageValue($language);
             $key = "{$thisPage->id}|{$pageField->id}|{$language->id}";
           }

           // Handle non multi-language data
           if (!$isMultilanguageField) {
             $current_field_content = $thisPage->$fieldName;
             $key = "{$thisPage->id}|{$pageField->id}";
           }

           // Create a unique hash for the current content in this field in this language on this page
           $current_field_content_hash = hash_hmac('sha256', $current_field_content, $key);

           // Search WireCache for a previously stored content hash under this key if it exists, otherwise null
           $cached_field_content_hash = wire('cache')->getFor('field_content_tracking', $key);

           $contentHasChanged = false;

           // Compare the hash we created for the current content with the hash previously stored.
           // If they do not match, either no hash was stored yet or the content has been changed
           // since it was last analyzed
           if ($current_field_content_hash !== $cached_field_content_hash) {
             // Store the current content hash which will be used later to compare if content has changed
             wire('cache')->saveFor('field_content_tracking', $key, $current_field_content_hash);

             $contentHasChanged = true;
           }

           return $contentHasChanged;
         }

     There is a caveat: this solution will only tell you if the content has been updated since the last time it checked. So if a field was changed and you check, the function will return true. If you check it again it will return false because it only tracks if it has changed since the last time it checked. I think this will still work for your use case as long as you act on it when it returns true.

     It works with multilanguage fields, regular fields, any language, and returns null if you try to check a value for a field that isn't on the page. Hope it helps!
  10. So I'm working on a new release and I just found this the other day myself. To fix this right away, change the following on line 172:

          <?php
          // From:
          wire('log')->save(self::ERROR_LOG, $message);
          // To:
          wire('log')->save(self::ERROR_LOG, $output['message']);

      This line means that there was a problem with DeepL. Either it had trouble connecting or DeepL returned an error when you tried to translate. In the ProcessWire admin take a look at the Fluency log, I think it may be named "deeplwire-api". It should contain the error to help you troubleshoot.
  11. PM me if I can share any info that would help get you up to speed. Can share some configs, code, or answer some questions if you need. I spent a lot (a lot) of time on tuning our setup and would be happy to share. Also, snapshot your "perfect setup" on DO before you host anything on it. I have one that is a template for our servers and I can spin one up in minutes. Also gives you some extra confidence with experimenting when you know you can nuke a server and start over with a machine built how you like it.
  12. I don't know why I wasn't thinking about NPM security issues... that was a dumb on my part haha.
  13. That's a pretty great strategy. I've thought about moving builds to the server, my approach will probably be updating the hook below to run a Gulp build script automatically.

      Question about your pre-push hook, does that make it possible to accidentally overwrite production code when the local branch is behind master? Asking since I haven't used a pre-push for deployment and I'm wondering if the files are being copied to the server before your local repo finds out that it could be behind the remote on Github.

      I'm going to describe our full setup for clarity because we don't use managed servers and that requires a bit more configuration. I included some details at the end to use this with managed hosting which is easier.

      On our servers there is a Linux user called 'deployment' which contains bare Git repositories for each site in '/home/deployment/sites' with this post-receive hook.

          #!/bin/bash
          while read oldrev newrev ref
          do
              if [[ $ref =~ .*/main$ ]]; then
                  echo "Main ref received. Deploying to production..."

                  sudo git --work-tree=/path/to/hosting/directory --git-dir=/path/to/deployment/repo checkout -f

                  # This shouldn't be required on managed hosting
                  setsid sudo chown -R www-data:www-data /path/to/hosting/directory > /dev/null 2>&1 < /dev/null &
                  setsid find /path/to/hosting/directory -type d ! -perm 755 -exec sudo chmod 755 {} \; >/dev/null 2>&1 < /dev/null &
                  setsid find /path/to/hosting/directory -type f ! -perm 644 -exec sudo chmod 644 {} \; >/dev/null 2>&1 < /dev/null &
              else
                  echo "Ref $ref received. Not deploying production: only the main branch may be deployed on this server."
              fi
          done

      Locally we have an additional remote called 'production'. We also use this for a deployment to staging where the remote only accepts pushes from the 'development' branch. So using 'git push' pushes to our Gitlab repo, and 'git push production' sends code live.
          production deployment@website.com:sites/website.com.deploy.git (push)

      Thinking about it now it would be a good idea to write a bash script that pushes to production only when the push to Gitlab is successful to further ensure all main branches match (writing myself a todo for this).

      Things I like about this approach:

      • Only files that have changed are copied to the public directory which is fast and efficient.
      • PW core, modules, and extensive application code we have in /site outside of the templates directory are included. Things like PW logs and translation files are excluded via .gitignore.
      • Config values are kept in a .env file so 'config.php' still lives in the repo and changes can be pushed.
      • It is not possible for anyone to overwrite work that was pushed because the local branch will be behind the production branch.
      • Server login passwords are disabled at the OS level so SSH keys are used. Pushes require no password.
      • I wrote an interactive bash script on the server to add new sites which automatically creates hosting directories, Apache virtual host file, and deployment repository all from pre-written templates. Keeps the setup predictable, error free, easy to use consistently with very little work.

      When I complete the testing suite I'm going to add a pre-push hook locally and modify the post-receive hook to execute tests and require that all pass before deploying. Eventually I'll be putting all of this on a CI/CD pipeline but for now this smaller scale approach is just fine. I don't have the time to revamp our deployment strategy at the moment haha.

      Differences in hosting environments:

      For un-managed hosting the lines that begin with 'setsid' are required to change ownership to the Apache user and set file permissions in the hosting directory after copy. If you're managing the web server you probably already know what to do as far as user/permission management for 'deployment'.
      For managed hosting (I use Dreamhost for some projects) no user/permission configs are required so all the 'setsid' lines can be deleted. Only SSH access and Git on the managed hosting server are needed. Just create a sister directory to your website directory, initialize a bare repo with 'git init --bare', add the post-receive hook with the proper directory locations, and remember to 'chmod +x' your post-receive hook file.

      This can probably be optimized more but I've been using it for years and it works ¯\_(ツ)_/¯
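The todo mentioned in this post, pushing to production only when the push to the main repository succeeds, could look roughly like the sketch below. It is not the author's script: the function name is hypothetical, 'origin' is an assumed name for the Gitlab remote, and the `GIT` variable exists only so the command can be swapped out.

```shell
#!/usr/bin/env bash
# Push a branch to the primary remote first, then to production only if
# the first push succeeded.
# Assumptions: the primary remote is named 'origin' and the deployment remote
# is named 'production'. GIT may be overridden (e.g. for testing).
push_then_deploy() {
  local branch="${1:-main}" primary="${2:-origin}" production="${3:-production}"

  if ! "${GIT:-git}" push "$primary" "$branch"; then
    echo "Push to $primary failed. Not deploying to $production." >&2
    return 1
  fi

  "${GIT:-git}" push "$production" "$branch"
}

# Usage:
#   push_then_deploy main origin production
```

Because the first push is rejected when the local branch is behind the remote, production never receives code that the main repository refused.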
  14. I've been interested in sharing my setup since it's radically changed over the last year for the better. Wish I could open the repo for the code of my flagship project, but it's the site for the company I work for and isn't mine, www.renovaenergy.com

      Local Dev:

      • Code editor is Sublime Text tuned for my preferences/workflow.
      • OS is Ubuntu Linux, will probably distro-hop at some point like Linux users do.
      • Environment is provided by Devilbox, which I can't recommend enough. It's a fast (mostly) pre-configured yet customizable Docker tool with outstanding documentation. A ProcessWire ready container is available.
      • CSS/JS compiled by Gulp/Babel/Browserify for local dev and production builds. ES6 modules. Zero frameworks, no jQuery. Focus on lightweight JS and code splitting for better load times. CSS is compiled and split into separate files by media queries which are loaded by browsers on demand based on screen size.
      • Currently building out website unit/integration tests using Codeception. This is becoming increasingly necessary as the site becomes more complex.
      • Firefox Developer Edition
      • Tilix terminal emulator, Quake mode is awesome
      • Cacher stores code/scripts/configs in the cloud for easy sharing across machines. IDE integration is solid
      • Meld for fast diffs
      • WakaTime because who doesn't like programming metrics for yourself?
      • DevDocs but locally in a Nativefier app. REQUEST: Star ProcessWire on Github. If a project has 7k+ stars it is a candidate to have its documentation added to DevDocs.

      Production:

      • Code editor is Vim on server
      • Deployment is via Git. Local repositories have a secondary remote that pushes code to production via a bare Git repo which updates assets on the server using hooks.
      • Access to server via SSH only. Changes to files only made locally and pushed.
      • Hosting by DigitalOcean with servers custom built from OS up for performance/security.
      • Custom PageSpeed module implementation. Automatic image conversion to webp, file system asset caching, code inlining, delivery optimization, cache control, etc. Driven down TTFB <=500ms on most pages with load times around 2 seconds, sometimes less if I'm lucky haha
      • StatusCake monitors uptime, automated speed tests, server resources, and HTTPS cert checking.
      • PagerDuty is integrated with StatusCake so issues like servers going down, server resources (ram/disk/memory) low, and whatever else get notifications on all your devices.
      • 7G Firewall rules are added to the PW .htaccess file to block a ton of bots and malicious automated page visits. Highly recommended.
      • Mailgun for transactional email

      ProcessWire Modules & Features:

      • Modules (most used): CronjobDatabaseBackup, ProFields, Fluency, ImageBlurHash, MarkupSitemap, PageListShowPageId, ProDevTools, TracyDebugger, ListerPro, ProDrafts
      • Template cache. We used ProCache initially but saw some redundancies/conflicts between it and PageSpeed tools on the server. Would absolutely recommend ProCache if your hosting environment isn't self-managed.
      • All configurations are saved in .env files which are unique to local/staging/production environments with contents stored as secure notes in our password manager. This is achieved using the phpdotenv module loaded on boot in config.php where sensitive configurations and environment-dependent values are made securely available application-wide.
      • Extensive use of ProcessWire image resizing and responsive srcset images in HTML for better performance across devices.
      • URL Hooks. Use case: We rolled out a Web API so external platforms can make RESTful JSON requests to the site at dedicated endpoints. The syntax resembles application frameworks which made development really enjoyable and productive. The code is organized separately from the templates directory and allowed for clean separation of responsibilities without dummy pages or having to enable URL segments on the root page. Also allowed for easily building pure endpoints to receive form submissions.
      • Page Classes. My usage: This was a gamechanger. Removing business logic from templates (only loops, variables, and if statements allowed) and using an OOP approach has been fantastic. Not sure if many people are using this but it's made the code much more DRY, predictable, and well organized. Implementing custom rendering methods in DefaultPage allowed for easily "componentizing" common elements (video galleries, page previews, forms, etc) so that they are rendered from one source. Helped achieve no HTML in PHP and no PHP in HTML (with the exceptions above). Also allows for using things like PHP Traits to share behavior between specific Page Classes. I completely fell in love all over again with PW over this and now I couldn't live without it. This literally restructured the entire site for the better.

      Probably other stuff but this post is too long anyway haha.
  15. That was something I considered after I posted that message. I use Docker for development and for some reason if I am connected to a VPN then DeepL fails to connect and it causes PW to not load admin pages. It is probably a networking configuration in the Docker image. This may be an issue that could exist with some dev environments, but that's just a semi-educated guess haha. Glad you are enjoying the module! Please let me know if you experience any issues. The next version is coming out soon and it will be going from alpha to beta version and have a bunch of new features. Any feedback is greatly appreciated!