Robin S Posted March 7

Verify Links

Periodically verifies that external links are working and not leading to an error page.

Requires the following core modules to be installed:

- PagePaths
- LazyCron

How it works

The module identifies links on a page when the page is saved and stores the URLs in a database table.

For the purposes of this module a "link" is an external URL in any of the following...

- FieldtypeURL fields, and fields whose Fieldtype extends it (e.g. ProFields Verified URL)
- URL columns in a ProFields Table field
- URL subfields in a ProFields Combo field
- URL subfields in a ProFields Multiplier field

...and external href attributes from <a> tags in any of the following...

- Textarea fields where the "Content Type" is "Markup/HTML" (e.g. CKEditor and TinyMCE fields)
- CKEditor and TinyMCE columns in a ProFields Table field
- CKEditor and TinyMCE subfields in a ProFields Combo field

The link URLs stored in the database table are then checked in batches via LazyCron and the response code for each URL is recorded.

Configuration

On the main module config screen you can define settings that determine the link verification rate. You can choose how frequently the LazyCron task executes and the number of links that are verified with each LazyCron execution. The description line in this section tells you approximately how often all links in the site will be verified, based on the number of links currently detected and the settings you have chosen.

The module verifies links using curl_multi_exec, which is pretty fast in most cases, so if your site has a lot of links you can experiment with increasing the number of links to verify during each LazyCron execution.

You can also set the timeout for each link verification and customise the list of user agents if needed.

In the Process module config there's a field allowing you to exclude URLs that start with a given string.
This only applies to the "Error responses only" listing, and can be useful to avoid seeing false-positive error statuses for domains that you know provide inaccurate responses (more about this below).

Usage

Visit Setup > Verify Links to view a paginated table showing the status of the links that have been identified in your site.

The table rows are colour-coded according to the response code:

- Potentially problematic response = red background
- Redirect response = orange background
- OK response = green background
- Link has not yet been checked = white background

Where you see a 403 response code it's recommended to manually verify the link by clicking the URL to see if the page loads or not before treating it as a broken link. That's because some servers have anti-scraping firewalls that issue a 403 Forbidden response to requests from IP ranges that correspond to datacentres rather than to individual ISP customers, and this will cause a "false positive" as a broken link.

For each link the "Page" column contains a link to edit the page and the "View" column contains a link to view the page on the front-end.

You can use the "Column visibility" dropdown to include a "Redirect" column in the table, which shows the redirect URL where this is available.

You can use the "Custom Search Builder" to filter the table by particular column values, e.g. for a particular response code.

To see only links that have an error response code (400 or higher, or 0), use the flyout menu to visit Setup > Verify Links > Error responses only.

For those who can't wait

The module identifies links as pages are saved and verifies links on a LazyCron schedule. If you've installed the module on an existing site and you don't want to wait for this process to happen organically you can use the ProcessWire API to save pages and verify links en masse.
    // Save all non-admin, non-trashed pages in the site
    // If your site has a very large number of pages you may need to split this into batches
    $items = $pages->find("has_parent!=2|7, template!=admin, include=all");
    foreach($items as $item) {
        $item->of(false);
        $item->save();
    }

    // Verify the given number of links from those that VerifyLinks has identified
    // Execute this repeatedly until there are no more white rows in the Verify Links table
    // You can try increasing $number_of_links if you like
    $vl = $modules->get('VerifyLinks');
    $number_of_links = 20;
    $vl->verifyLinks($number_of_links);

Advanced

There are hookable methods but most users won't need to bother with these:

- VerifyLinks::allowForField($field, $page) - Allow link URLs to be extracted from this field on this page?
- VerifyLinks::isValidLink($url) - Is this a valid link URL to be saved by this module?
- VerifyLinks::extractHtmlLinks($html) - Extract an array of external link URLs from the supplied HTML string

https://github.com/Toutouwai/VerifyLinks
https://processwire.com/modules/verify-links/
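To make the idea behind "external href attributes from <a> tags" more concrete, here is a minimal standalone sketch of that kind of extraction. It is not the module's own code: the function name extractExternalLinks and the $siteHost parameter are hypothetical, and it assumes that "external" means an http(s) URL whose host differs from the site's own host.

```php
<?php
/**
 * Hypothetical helper (NOT part of VerifyLinks): collect unique external
 * href values from <a> tags in an HTML string.
 * Assumption: a link is "external" when it is http(s) and its host
 * is not the site's own host.
 */
function extractExternalLinks(string $html, string $siteHost): array {
    // DOMDocument::loadHTML() rejects an empty string, so bail out early
    if (trim($html) === '') return [];

    $doc = new DOMDocument();
    // Collect parser warnings about imperfect markup instead of emitting them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        // parse_url() may return null or false, so cast to string before comparing
        $scheme = strtolower((string) parse_url($href, PHP_URL_SCHEME));
        $host = strtolower((string) parse_url($href, PHP_URL_HOST));
        // Skip relative URLs, mailto:, tel:, anchors and so on
        if (!in_array($scheme, ['http', 'https'], true)) continue;
        // Skip links that point back to the site itself
        if ($host === '' || $host === strtolower($siteHost)) continue;
        // Use the URL as array key so duplicates collapse to one entry
        $urls[$href] = true;
    }
    return array_keys($urls);
}
```

Called on a CKEditor field value, for example extractExternalLinks($page->body, 'mysite.test'), this would return only the URLs worth sending to a link checker. The real module may define "external" differently.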
Reeno Posted September 2

Hello, I have an issue installing the module, as the required module "PagePaths" doesn't exist 😕

When installing, it says it needs the module "PagePaths":

But apparently, there is no module named PagePaths:

Is this module really needed?
bernhard Posted September 2

PagePaths is part of the core, just not installed. In your ProcessWire backend go to Modules > Core and install the "Page Paths" module from the list.

@Robin S I guess your module could install Page Paths automatically, no?
Robin S (Author) Posted September 2

bernhard said: "I guess your module could install Page Paths automatically, no?"

I don't think it's good to automatically install modules that aren't bundled as part of the module repository. I believe PagePaths does impact some core functionality when installed (PageFinder results and possibly the $page->path/$page->url properties?) and the fact that it includes a feature to manually rebuild the paths table makes me think that it can potentially get out of sync with the page structure. So I think users need to consciously decide to install it if they want to use the Verify Links module.

I've updated the readme to mention the required modules and that they are core modules.
Reeno Posted September 3

Thanks! I was now able to install it.

It seems it only checks textareas that are not multilingual. I created two fields, one Textarea/CKEditor, the other TextareaLanguage/CKEditor. I added a template which just holds these two:

Added one link each (link text is the same as the href attribute):

But VerifyLinks only catches one of the links:

Database says it's the one from the non-multilingual field:
Robin S (Author) Posted September 3

@Reeno, I've added multi-language support in v0.2.0. Please upgrade and let me know if you strike any problems.
Reeno Posted September 4

Thanks for the quick response! I upgraded the module and it now works perfectly with my multilingual site. Thanks!
Robin S (Author) Posted September 18

v0.3.0 released, which adds some new features: Custom Search Builder in the datatable, a new "Error responses only" view, and the ability to exclude links from this view by defining URL prefixes, so that domains known to give false-positive error responses can be left out. See the updated readme for more.
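As a rough illustration of how an "Error responses only" listing with prefix exclusions could behave, here is a minimal standalone sketch. It is not taken from the module: the function names and the array shape (url and code keys) are hypothetical. The only parts taken from the readme are the error rule (a response code of 400 or higher, or 0) and the idea of excluding URLs that start with a given string.

```php
<?php
// Hypothetical helpers, NOT the module's code.

// Readme rule: an error response is a code of 400 or higher, or 0
function isErrorResponse(int $code): bool {
    return $code === 0 || $code >= 400;
}

// True if the URL starts with any of the configured prefixes
function isExcludedUrl(string $url, array $prefixes): bool {
    foreach ($prefixes as $prefix) {
        // Ignore empty prefixes, which would otherwise match every URL
        if ($prefix === '') continue;
        if (strncmp($url, $prefix, strlen($prefix)) === 0) return true;
    }
    return false;
}

// Keep only error responses whose URL is not excluded by a prefix
function errorResponsesOnly(array $links, array $prefixes): array {
    return array_values(array_filter(
        $links,
        function (array $link) use ($prefixes): bool {
            return isErrorResponse($link['code'])
                && !isExcludedUrl($link['url'], $prefixes);
        }
    ));
}

// Example data: the 403 from a domain known to block datacentre IPs is hidden
$links = [
    ['url' => 'https://example.org/ok', 'code' => 200],
    ['url' => 'https://example.org/missing', 'code' => 404],
    ['url' => 'https://blocked.example.com/page', 'code' => 403],
    ['url' => 'https://timeout.example.net/', 'code' => 0],
];
$errors = errorResponsesOnly($links, ['https://blocked.example.com/']);
```

With this data $errors holds the 404 and the 0 entries only. Matching on a prefix rather than a bare domain keeps the rule simple and lets a single path on a domain be excluded without hiding the rest of it.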
mel47 Posted September 29

Hi @Robin S

I have a problem with your module. It detected links but the table is completely empty. Checking the console, I got these errors:

    JQMIGRATE: Migrate is installed, version 1.4.1  jquery-migrate-quiet-1.4.1.min.js:2:552
    Layout was forced before the page was fully loaded. If stylesheets are not yet loaded this may cause a flash of unstyled content.  JqueryCore.js:1:91295
    Loading failed for the <script> with source "https://XXX/site/modules/VerifyLinks/datatables/datatables.min.js?v=0.2.2".  verify-links:93:102
    Uncaught TypeError: $(...).DataTable is not a function
        <anonymous> https://XXX/site/modules/VerifyLinks/ProcessVerifyLinks.js?v=0.2.2-1727405204:7
        jQuery 4
    ProcessVerifyLinks.js:7:21
        <anonymous> https://XXX/site/modules/VerifyLinks/ProcessVerifyLinks.js?v=0.2.2-1727405204:7
        jQuery 4

Am I missing something that needs to be installed? PW 3.0.241, VerifyLinks 0.2.2. It didn't work either on my computer (localhost) or on the online server.

Mel
Robin S (Author) Posted September 29

mel47 said: "VerifyLinks 0.2.2"

Please upgrade the module to the latest version and let me know if there is still a problem.
mel47 Posted September 29

Hmm... Is it possible for the same module to be at two different versions? When I go to Upgrades, for the first one it says it's not tracked by the modules directory. For the second, it's the latest version.

BTW, I downloaded this module just yesterday.
Robin S (Author) Posted September 29

@mel47, my fault, there are two bundled modules and I forgot to update the version number of one of them. Plus there was a problem where GitHub didn't pick up the casing change of a subdirectory. If you upgrade to the newly released v0.3.2 it should be fixed.
mel47 Posted September 29

Thanks a lot! Works now!