Posts posted by teppo

  1. This is a beta release, so some extra caution is recommended. So far the module has been successfully tested on at least ProcessWire 2.7.2 and 3.0.18, but at least in theory it should work for 2.4/2.5 versions of ProcessWire too.
     
    GitHub repo: https://github.com/teppokoivula/ProcessLinkChecker (see README.md for more techy details, settings etc.)
     
    What you see is ...
     
    This is a module that adds back-end tools for tracking down broken links and unnecessary redirects. That's pretty much all there is to these views right now; I'm still contemplating whether it should also provide a link text section (for SEO purposes etc.) and/or other features.
     
    The magic behind the scenes
     
    The admin tool (Process module) is about half of Link Checker; the other half is a PHP class called Link Crawler. This is a tool for collecting links from a ProcessWire site, analysing them and storing the outcome to custom database tables.
     
    Link Crawler is intended to be triggered via a cron task, but there's also a GUI tool for running the checker. This is a slow process and can result in issues, but for smaller sites and debugging purposes the GUI method works just fine. Just be patient; the data will be there once you wait long enough :)
     
    Now what?
     
    For the time being I'd appreciate any comments about the way this is heading and/or whether it's useful to you at all. What would you add to make it more useful for your own use cases? I'm going to continue working on this for sure (it's been a really fun project), but wouldn't mind being pushed in the right direction early on.
     
    This module is already in active use on two relatively big sites I manage. Lately I haven't had any issues with the module, but please consider this a beta release nevertheless; it hasn't been widely tested, and that alone is a reason to avoid calling it "stable" quite yet.

    Screenshots

    Dashboard:

    link-checker-dashboard.png

    List of broken links:

    link-checker-broken-links.png

    List of redirects:

    link-checker-redirects.png

    Check now tool/tab:

    link-checker-check-now.png

    • Like 18
  2. @Matthew: no one still seems to have a clear idea, so I guess we can only assume that something big (that they've got no control over) broke down. Network failure or something like that, perhaps?

    Happens to the best of us, but admittedly they could've explained things in a bit more detail. I was trying to write something here when the downtime started. Kept tabs on it for the first hour or so in hopes it'd be fixed soon, but well..

  3. For your first scenario I can't quite see what the big benefit of session would be, though that depends on the structure of your page and the way you build this thing in the first place. Some possible solutions I'd consider:

    • If the edit view opened in modal box is an external page (not part of this page or put together on the fly) then you'll have to somehow let it know what it is that you want to edit.. and in that case I'd rather provide it with page ID and field ID (or name) as GET params.
    • If the edit view is generated and filled out on the fly (and submitted by AJAX, I'd assume) you can just grab the content from page using JS without the need to store anything in session variables.

    In your second scenario I'd also consider using GET params (specify comment ID to quote from) or storing either the comment ID or even the whole quoted text (unless you really need to support storing *a lot* of data) in JS cookie. This is an example of situation where you probably don't need to make sure that quoted text is somehow protected (in most cases user can edit it upon posting anyway), so storing that data in session provides very little extra value.

    --

    Considering sessions vs. other methods of storing run-time data in general:

    Biggest reason I tend to avoid sessions is that as the amount of data stored in session grows in size, so do the memory requirements (and, in extreme cases, also the disk space requirements) of your site. At least some PHP versions load whole session data to memory when session is started, so you can imagine what having tens of kilobytes, hundreds of kilobytes or even megabytes of data (a worst case scenario, but still) does on a site with a lot of simultaneous users.

    With sessions you'll also want to make sure that you're cleaning stored session files properly (which, by the way, isn't always as trivial as it may sound) and preferably clearing stored values run-time too, especially if you're storing a lot of content. GET params, on the other hand, don't consume any extra memory.. and storing stuff in JS cookies only consumes client resources.

    As for why one would prefer to use sessions, biggest benefits are that a) session variables make it possible to "hide" the mechanism behind this all from the user and b) session data is much harder (practically impossible, unless you've made it possible yourself) to tamper with (it's easy to try out different values for GET params or alter data stored in JS cookies).

    Ease of use is a very real benefit too, of course. Storing data in session variables is often the least painful route.

    Tradeoffs, tradeoffs.. but then again, isn't that what web / software development is always about? :)

    • Like 4
  4. @tobaco: have you tried setting publish_until to current date -- or any other date between now and publish_from? That should do the trick and is, at least in my opinion, preferable to altering existing values (those could be useful for audit purposes etc.)

  5. @LostKobrakai: right, the polling frequency was explained on the site after all -- 15 minutes "or even faster". I guess I didn't read the site carefully enough.. and the details being so scarce could be a result of each channel being so different from others. Might have to take another look :)

    One thing that's still unclear to me is whether using the service costs anything and whether there are any limitations to its use. Also, is it possible to add things like new channels, or are these provided by the developers themselves and/or their official partners?

    Just noticed that they've got specific channel for WordPress, but couldn't find a channel that would allow me to do something less specific via web, i.e. trigger an action on a ProcessWire site based on email. I might be misunderstanding something here, though, so perhaps I don't even need a channel for that? :)

    I checked my database, and the fields were created as a table. This is confusing. I am not sure of the terms because of that. I was told that a table is a template, but phpMyAdmin revealed that a field was a table

    You're right. Each field requires a table of its own. "Templates = tables" is a figure of speech -- that's just roughly how they function from the developer's point of view.

    The main point here is that each template is a collection of fields and in this way resembles table in database. Each page is connected to one template, and thus that template defines what data this particular page can store, and that's also why some people prefer to think that "pages = rows".

    Does this make any more sense to you?

    One noteworthy difference from database concept is that each field can belong to multiple templates. This is done so that fields with identical configurations (such as "body" field with same tools) can be used in many templates without having to add and configure new field for each template.

    As a side note, most developers using ProcessWire never dive into the real database structure. You don't have to do that to work effectively with ProcessWire. Don't get stuck at those things since that's something that the system handles for you (and, in fact, it's never recommended to perform direct SQL operations on existing tables).

    • Like 6
  7. Great thing about Pushover for my use case is that you can configure critical alerts to disregard things like phone being on silent mode.. and, of course, those can also trigger an alarm-sound that's sure to wake you up in no time. That combined with it being real-time (push, not pull) and a common hub (account) that all messages can go through (making it possible for one email or API request to trigger notifications on multiple devices simultaneously) make it pretty awesome :)

    It would seem that IFTTT might support something similar, but from a different point of view: receiving an email (or text message or any other supported trigger) can trigger events, which are apparently provided as applications of sorts. Interestingly one of those events is Pushover, so I'm guessing that "Pushover-like" functionality probably isn't built into IFTTT. @Diogo (or anyone else with experience using this), am I getting this right? :)

    IFTTT website is so incomprehensible ("marketing-oriented" might be the correct word here -- a teaser with "join now" buttons everywhere) that I couldn't be really sure about anything. Being more than a bit stubborn, I refuse to install an app just to see what it does (or, better yet, "join" something that doesn't even explain what the heck it really is that I'm joining..)

    • Like 2
  8. As a quick heads-up I've just updated the module (twice, as it turned out that Joss' problem was related to certain regexp not expecting .co.uk TLDs..) and it should now work with the new Google Maps URL format.

    The problem is that this new format is more difficult to work with than earlier one, so I'm doing some ugly tricks (such as relying on old maps URLs) to keep things working. This is not a good solution in the long term, which is why an extensive rewrite of this module is going to be required soon, unless there's some solution I just haven't thought about yet.

    I've added a new checkbox to module config, titled "use coordinates". If this is checked, new maps URLs will be embedded based on coordinates, which is accurate but also replaces the place name with those same coordinates. If it's not checked, the location name will be used instead, which is far less accurate and, for Finnish place names for example, practically didn't work at all.

    Like @Hani pointed out above, old Maps URLs still work -- which is a very good thing, since at least it means that old content should (for the time being) remain intact :)

    • Like 2
  9. I hope there's someone here that can help me.. I've set up this site (www.sejero-festival.dk) on my localhost and Hanna Code works excellent.

    However, and I've tried two different hosts now, when I deploy the site, the Hanna Code module fails miserably when I try to go to the setup part :

    One of the hosts has the following specs:

    PHP: 5.3.28

    MySQL: 5.5.37

    Does anyone have a clue what might cause this error?

    Is this still a problem? Does anything else work, i.e. is this the only place that's broken on this site?

    Sounds like your database wasn't reachable at this point, but I'd imagine that causing issues elsewhere too. Unless, perhaps, it's somehow related to PHP's mysqli extension on that server (just about all core features have already been migrated to PDO instead).

    For some reason I cannot get hanna code to work with php. I can get html to work just fine but php will not.

    Here is my code

    <?php
     
    foreach($page->children as $product) {
     
        if($product->image) {
           // $image = $product->image->width(150);
            $img = "<img src='{$image->url}' alt='{$product->title}' class='flt-left-img' />";
        } 
        
        else {
         
            $img = "<span class='image_placeholder'>Image not available</span>";
        } 
     
        echo "
            $img
            <h3>{$product->title}</h3>
            <p>{$product->product_description}</p>
            <hr />
            ";
    }
    ?>
     
     
     
    The only thing that will output on the page is the product description but all other content from the body disappears.

    Only weird thing I can spot in your code right away is that you've got $image = ... commented out, yet on the next line you try to use $image->url, which obviously won't be available here. I'm assuming that this was just something you did for testing and forgot there.

    When you're saying that only output is the product description, it's coming from this Hanna Code snippet, right? Are you seeing empty <h3></h3> tags and then <p></p> tags with product description inside them and nothing else, or..?

    I'm not really sure what you're referring to when you say that all other content from the "body" disappears; <body> element of your page, $page->body, or something entirely different? If you view the source of the page, is there anything weird there -- or in your log files, either in site/assets/logs/errors.txt or Apache logs?

  10. Thanks! There was an issue in PageSnapshot.module preventing it from working properly unless FieldtypeRepeater was installed too.

    I've just pushed a fix for that to GitHub, so could you fetch the latest version and see if it works any better?

  11. Not really a ProcessWire wish, but related:

    Currently it's only possible to state a supported ProcessWire version for a module in the modules directory, and that has to be 2.0, 2.1, 2.2, 2.3 or 2.4. Anything beyond that has to go to comments (where it's rather easy to miss).

    The most basic thing I could think of would be the ability to provide required ProcessWire version as '2.4.1' etc. in case that nearest minor version (2.4 in this case) isn't supported. I'd also like to specify any other modules that are required -- and perhaps even more specific things than that, such as PHP version or specific extension (Imagick comes to mind first).

    • Like 4
  12. @sid: sounds kind of like you could have an earlier version of Page Snapshot module installed. Could you please check that you're running version 1.1.18 of it?

    I'm sorry for the confusion about supported ProcessWire version. Dev branch is really needed to run this module. My original post mentioned that the module requires "2.4.1 or later, i.e. current dev branch", but since modules directory doesn't allow submissions unless you define a version ranging from 2.0 to 2.4 as compatible.. well, I can see how that could happen.

  13. If you just need to remove those specific parts (i.e. all URLs start with index.php/news/11-latest-news/) then try this:

    RewriteRule ^index\.php/news/11-latest-news/(?:[0-9]*-)([^\/\.]*)\.html$ http://www.domain.com/news/latest-news/$1/ [L,R=301]

    If that's just one example and you're looking to remove index.php and all sequences of [0-9]*- from beginning of pages, you might have to do something like this instead:

    RewriteRule ^index\.php/(?:[0-9]*-)?([^\/\.]*)/(?:[0-9]*-)?([^\/\.]*)\.html$ http://www.domain.com/$1/$2/ [L,R=301]

    RewriteRule ^index\.php/(?:[0-9]*-)?([^\/\.]*)/(?:[0-9]*-)?([^\/\.]*)/(?:[0-9]*-)?([^\/\.]*)\.html$ http://www.domain.com/$1/$2/$3/ [L,R=301]

    Note: that last one is pretty ugly and requires a new RewriteRule for each level of content. I'm not aware of any sensible way of implementing string replacement via RewriteRules, so if this really is the case you might benefit from actually doing this with PHP -- just point all index.php requests to a custom PHP script (or page or whatever) and do a 301 redirect from there after a bit of preg_replace() magic.
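    A minimal sketch of that preg_replace() approach, in plain PHP. The function name legacyPathToNew() is made up for illustration, and the sketch assumes that every path segment may carry a numeric "NN-" prefix and that old URLs always end in .html; adjust the patterns if your URLs differ.

```php
<?php
// Hypothetical helper: converts an old-style path such as
// "index.php/news/11-latest-news/42-some-title.html" to "/news/latest-news/some-title/".
function legacyPathToNew(string $path): string {
    // Drop the leading "index.php/" (with or without a leading slash).
    $path = preg_replace('#^/?index\.php/#', '', $path);
    // Drop the trailing ".html".
    $path = preg_replace('#\.html$#', '', $path);
    // Remove a leading "123-" style prefix from every path segment.
    $segments = array_map(function ($segment) {
        return preg_replace('/^[0-9]+-/', '', $segment);
    }, explode('/', $path));
    return '/' . implode('/', $segments) . '/';
}

// In the actual redirect script you would then do something like this
// (left commented out so the sketch has no side effects):
// $old = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
// header('Location: http://www.domain.com' . legacyPathToNew($old), true, 301);
// exit;
```

    Unlike the RewriteRules above, this handles any number of levels with the same code, since each segment is cleaned separately.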

    • Like 1
  14. @muzzer: markup cache is definitely a great tool if you've got otherwise highly dynamic pages that can't be cached as a whole. If, on the other hand, the page stays pretty much same on each page load, I'd suggest looking more into template level caching -- that way you'll have entire page cached instead of caching separate regions and then combining those on the fly.

    If you're really interested in speeding up your sites, ProCache is the ultimate solution. For the most part it'll be serving static pages without involving PHP or MySQL at all, which will always beat the speed of any page built on the fly (regardless of what CMS is used).

    Just saying.

    • Like 1
  15. [...] while Lister is clearly an uber powerful Admin tool, as a by-product piece of functionality, could a Lister 'view' (a preconstructed 'find') be made visible on a public page (read-only, fixed results, no controls available to the public)?

    It would be a sort of query and results builder for custom finds.

    What Ryan and Antti are saying here definitely makes sense (and they're the ones to know the module best anyway) but this was exactly what we were wondering too when that first Lister screencast hit our office. For 90% or more of all product, news, event etc. lists and/or tables Lister views would be more than enough.

    No custom, site-specific code and even being able to allow customer decide exactly what is visible and where.. and then modify that whenever needed -- how damn cool would that be?

    In case you ever decide to take Lister in that direction, it'd be a killer feature for a lot of sites (and an awesome time-saver for people building those sites), but I totally understand that this was never your intention and it would probably require a ton of extra work. Perhaps even so much that building another tool just for that would make more sense. Still, +1 for this idea from me/us :)

    • Like 5
  16. While testing I had mostly similar results to those of Soma for the basic site profile out-of-the-box (on localhost), but some actual production sites took a bit longer to load -- around 200-300ms. If we really wanted some comparable results, we'd need all the details of the aforementioned tests (including a test site to run those tests on) just to rule out different measuring methods and side-effects of installed modules :)

    • Like 2
  17. Sorry if I'm making this even more confusing, but it's really not that difficult, once you grasp the general concept:

    Consider all data coming from the user dirty. In PW, that means anything that comes from $input. It has to be sanitised, and it's always better to be too strict than too lenient about it; don't worry about being overly cautious, that very rarely causes any issues, while not being cautious enough.. well, that's another story entirely.

    Also, there's no such thing as "general sanitizing". It depends on what kind of values are valid in this specific use case. If possible, compare to an array of valid values, but if/when that's not feasible ...

    • if you only want integers, typecast value to int first: $value = (int) $input->post->value;
    • if you only want plain text, use $sanitizer->text(): $value = $sanitizer->text($input->post->value);
    • if a sanitizer feature matching your use case exists, use that; if you want to check for valid page names, use $sanitizer->pageName(), and if you want to check for valid emails then use $sanitizer->email() etc.
    • if you're inserting user data in HTML, make sure it doesn't contain anything that could break the markup: <input type="text" value="<?php echo $sanitizer->entities($input->get->value); ?>" /> to convert all applicable characters to entities (such as " => &quot;) or at least <input type="text" value="<?php echo str_replace('"', '', $input->get->value); ?>" /> to remove double quotes, which would obviously cause problems here etc.
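    The list above, sketched in plain PHP so that it runs without ProcessWire. The $post array and its keys are made up and stand in for $input->post; htmlspecialchars() is used here as a plain PHP stand-in for the escaping step, not as a claim about what $sanitizer->entities() does internally.

```php
<?php
// Made-up stand-in for user input; in a template this would come from $input.
$post = [
    'sort'  => 'price"><script>',
    'limit' => '25 OR 1=1',
    'q'     => 'say "hi" <b>',
];

// Compare to an array of valid values whenever possible; fall back to a default.
$validSorts = ['title', 'price', 'date'];
$sort = in_array($post['sort'], $validSorts, true) ? $post['sort'] : 'title';

// If only integers are valid, typecast first.
$limit = (int) $post['limit'];

// Escape when inserting into HTML, so that quotes and tags can't break the markup.
$q = htmlspecialchars($post['q'], ENT_QUOTES, 'UTF-8');
echo '<input type="text" name="q" value="' . $q . '" />';
```

    Note that the whitelist check rejects the bogus sort value outright instead of trying to clean it up, which is the "too strict rather than too lenient" idea in practice.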

    If you're still worried that you don't know enough of this, try Google; there are a lot of tutorials about validating, filtering, escaping and encoding data (the terms are related but have slightly different meanings, by the way). This Smashing Magazine article, for example, explains the basics pretty well.

    Another resource I'd highly recommend is the SlideShare presentation from Chris Shiflett, "Evolution of Web Security". The scope of this is much wider than just sanitizing user data, but that's all stuff that any decent web developer should be aware of anyway, so it definitely won't hurt you :)

    • Like 8
  18. Hear me out, guys!

    Based on extensive user surveys and after tremendous amounts of solo brainstorming (and other proven methods, such as wearing all of the six thinking hats simultaneously) I've just come up with a new marketing strategy (and slogan) that will most definitely make us unbeatable:

    post-175-0-61030300-1402470595_thumb.png

    How's that for a slice of fried gold?

    .. and on a more serious note, I've also got tremendous amounts of respect for Kongondo and his work here. Never visited MODx boards and still don't know what the heck Wayfinder is, but he's done some pretty awesome stuff here too :)

    In my case it was Antti who ~~threatened to break my legs~~ brought ProcessWire to the company we both worked at back then. Ryan's video was my first contact with the system itself and the thing that really convinced me that Antti wasn't just delirious -- this thing actually looked great!

    • Like 11
  19. @yellowled: that sounds just about right. ProcessWire uses Blowfish algorithm for passwords whenever possible (PHP 5.3.0 onwards) and a stronger version of it if PHP version is 5.3.7 or newer.

    Passwords created in earlier versions will get the update notice and there's at least a chance of problems arising if you go from PHP 5.3.0-5.3.6 to 5.3.7 or newer -- or vice versa.

    If I'm reading you correctly and the same site can be accessed with multiple PHP versions, I'd assume there being quite a bit of weirdness. That's a problematic situation in many ways, and this is just one of those :)
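    To illustrate why the PHP version matters here, a rough sketch using PHP's own crypt(). PHP 5.3.7 added the "$2y$" Blowfish prefix, while older versions only know "$2a$". Picking the prefix with version_compare() like this is an assumption made for the example, not ProcessWire's actual code, and the fixed salt is for demonstration only.

```php
<?php
// Assumption for illustration: choose the Blowfish prefix by PHP version.
$prefix = version_compare(PHP_VERSION, '5.3.7', '>=') ? '$2y$' : '$2a$';

// Demo-only fixed salt; real code needs a random salt from the ./0-9A-Za-z alphabet.
$salt = 'usesomesillystringforsalt';
$hash = crypt('secret', $prefix . '07$' . $salt);

// Verification re-hashes the candidate password using the stored hash as the salt,
// so the stored prefix has to be one that the running PHP version understands.
$ok  = hash_equals($hash, crypt('secret', $hash));
$bad = hash_equals($hash, crypt('wrong', $hash));
```

    Since the prefix is stored as part of the hash, a hash created on 5.3.7 or newer can't be expected to verify on 5.3.0-5.3.6, which fits the "or vice versa" warning above.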

  20. Couldn't find a really clean way to do this at the moment, but since the view used in pwlink TinyMCE plugin is essentially ProcessPageEditLink and its execute() method is hookable, you could try tapping into that and altering the resulting markup (return value of said method).

    This is the first request I've seen for such a thing, but if this sounds like something that would make sense in more cases I'd suggest asking Ryan (by creating a GitHub issue for it) if adding a better way to do this, i.e. a new hookable method somewhere before the form markup is generated, would be possible.

    Edit: almost forgot: welcome to the forum! :)
