
nik

PW-Moderators
  • Posts

    294
  • Days Won

    4

Posts posted by nik

  1. We encountered a performance problem with a custom search today and found that one particular database query took several seconds. Digging deeper revealed that it has to do with repeaters when there are enough of them.

    The site has pages for companies (about 1500 of them), each of which has repeater items attached. Most of them have a single item, and the total number of repeater items is under 2000. This leads to about 3500 extra pages in the tree as a penalty for using repeaters (yes, there would have been other possible approaches as well). So, the number of pages isn't anything massive.

    Now, when a find("parent=/companies/, title%=keyword") is executed, the repeater fieldtype catches the use of title field (used in the repeaters as well) and tries to ensure no repeater pages get returned by adding !has_parent=<repeaters-page-id> to the selector string. While this is logically right, the resulting SQL query becomes slow due to the extra join used to filter out repeater pages.

    With those 1500 + 1500 + 2000 = 5000 pages there are about 9000 rows in the pages_parents table (which is used for implementing has_parent selector). This isn't that much either, but it becomes a problem when there's a join between pages and pages_parents.

    I'm not saying it wouldn't be possible to optimize the query, but as I'm certainly not skilled enough to do it, I went down another road. And the solution is dead simple - actually it's possible to catch it by just looking at the selector strings above ;).

    Ready? Ok, if we're searching for pages whose parent is /companies/, why should pages under /processwire/repeaters/ be explicitly filtered out? There's no way they'll match, so we'll just skip the extra repeater filter. And if you were actually trying to find pages under /processwire/repeaters/, you wouldn't want the filter to kick in either (well, don't do that anyway, they're internals).

    Here's a little diff showing the change I made to fix this thing for us (there's a pull request on the way, Ryan):

    
    $ git diff
    diff --git a/wire/modules/Fieldtype/FieldtypeRepeater/FieldtypeRepeater.module b/wire/modules/Fieldtype/FieldtypeRepeater/FieldtypeRepeater.module
    index 20bb52c..ab9db28 100644
    --- a/wire/modules/Fieldtype/FieldtypeRepeater/FieldtypeRepeater.module
    +++ b/wire/modules/Fieldtype/FieldtypeRepeater/FieldtypeRepeater.module
    @@ -138,6 +138,10 @@ class FieldtypeRepeater extends Fieldtype implements ConfigurableModule {
    // ensure that the repeaters own queries work since they specify a templates_id
    if($name == 'templates_id' && in_array($selector->value, $templatesUsedByRepeaters)) $includeAll = true;
    
    + // optimization: if parent (or parent_id) is given, there's no need to explicitly exclude repeaters
    + // TODO: has_parent with values other than parents of repeaters-page could be catched as well
    + if($name == 'parent' || $name == 'parent_id') $includeAll = true;
    +
    if($includeAll) break;
    }
    }
    

    Another performance issue solved. Not a very common one I think, but still worth a fix.
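
    The decision made in the diff can be sketched in plain PHP. This is a simplified model and not the module's actual code: selectors are represented as hypothetical name/value arrays, only the two conditions visible in the diff are covered, and the ids 1015 and 44 are made up for the example.

    ```php
    <?php
    // Simplified model of the check in the diff; not ProcessWire's actual code.
    // Each selector is a hypothetical ['name' => ..., 'value' => ...] array.
    function needsRepeaterExclusion(array $selectors, array $templatesUsedByRepeaters): bool
    {
        foreach ($selectors as $selector) {
            $name = $selector['name'];
            // The repeater's own queries specify a templates_id and must see repeater pages.
            if ($name === 'templates_id' && in_array($selector['value'], $templatesUsedByRepeaters)) {
                return false;
            }
            // With parent (or parent_id) given, repeater pages are not expected to match,
            // so the extra has_parent filter (and its join) is skipped.
            if ($name === 'parent' || $name === 'parent_id') {
                return false;
            }
        }
        return true;
    }

    // Appends the exclusion only when it is needed.
    function buildSelectorString(string $selectorString, array $selectors, int $repeatersPageId, array $repeaterTemplates): string
    {
        if (needsRepeaterExclusion($selectors, $repeaterTemplates)) {
            $selectorString .= ", !has_parent=$repeatersPageId";
        }
        return $selectorString;
    }

    $selectors = [
        ['name' => 'parent', 'value' => '/companies/'],
        ['name' => 'title', 'value' => 'keyword'],
    ];
    // 1015 and 44 are made-up ids for the repeaters page and a repeater template.
    echo buildSelectorString('parent=/companies/, title%=keyword', $selectors, 1015, [44]), "\n";
    ```

    With the parent selector present the string comes back unchanged; a search on title alone would still get the filter appended.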

    Edit: Ryan, no pull request this time. Didn't manage to do it right. Next time there'll be a branch before any changes, maybe that'll do it.

    • Like 3
  2. One way would be to use a separate file for the view. I tried not to use any HTML in my .module file and instead ended up with a TemplateFile instance. The view is created during init() and different variables are populated depending on the given input. Then the view's render() method is called to output it all.

    Hmm, the explanation isn't very clear, but if this was what you were trying to achieve then you might want to take a look at my Selector Test module's implementation at https://github.com/n...essSelectorTest. The lines with "$this->view" in the module-file and the view itself (view.php) should give an idea of what I was trying to say above. :)

    If you're building a module, even a process one, then this could be helpful. If this came up somewhere else, you'll probably get much better suggestions later on from others.
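
    To make the idea a bit more concrete, here is a minimal sketch in plain PHP. SimpleView is a hypothetical stand-in written for this example and is not ProcessWire's TemplateFile class; the view file is written to a temp location only so that the snippet runs on its own.

    ```php
    <?php
    // Hypothetical stand-in for a view object; ProcessWire's TemplateFile is not used here.
    class SimpleView
    {
        private $filename;
        private $vars = [];

        public function __construct(string $filename)
        {
            $this->filename = $filename;
        }

        public function set(string $key, $value): void
        {
            $this->vars[$key] = $value;
        }

        // Includes the view file with the variables in scope and returns its output.
        public function render(): string
        {
            extract($this->vars);
            ob_start();
            include $this->filename;
            return ob_get_clean();
        }
    }

    // For the sketch, the view file (all of the markup) is created on the fly.
    $viewFile = tempnam(sys_get_temp_dir(), 'view');
    file_put_contents($viewFile, '<h2><?= htmlspecialchars($heading) ?></h2>');

    // The "module" side only populates variables and asks for the output.
    $view = new SimpleView($viewFile);
    $view->set('heading', 'Selector test');
    $html = $view->render();
    unlink($viewFile);
    echo $html, "\n";
    ```

    The module code stays free of markup, and the view file can be edited without touching the logic.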

    • Like 1
  3. Looking at Antti's example of templates/home.php (in his post), he's setting $page->layout to "frontpage". That is the name of a file located in /templates/markup/layouts/, without the ".php" suffix. Then on the very last line in that same file, templates/markup/index.php is included. And that file spots the assigned value "frontpage" at $page->layout, causing it to include templates/markup/layouts/frontpage.php. Finally frontpage.php uses $page->masthead and $page->main to output a page with a frontpage layout. By the way, "masthead" and "main" probably aren't fields in the template either.

    Hmm, I wonder if I got you wrong... (Looking at Antti's last response to this thread - he's got some fast fingers on that mobile..) In the example I was trying to lay out, the template "home" would have just a name defined as there is a file at templates/home.php.

    Well, you're saying you got it already, so maybe I don't have to continue anymore :).

  4. (Ah, Antti was faster, but I'm trying to explain things from another point of view I think. Here goes.)

    Apeisa is using "first level" of template files as controllers. By "first level" I mean the files that are named after the templates themselves. Then the controllers can specify which layout should be used, giving the possibility to construct output according to some common layouts (frontpage, threeColumns, etc). The file /markup/index.php checks whether $page->layout has been set and uses either the given layout or the default one to render the output.
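
    The check done in index.php could look roughly like this. It is only a sketch of the idea, not Antti's actual file: the function name and the "default" layout name are assumptions, and a plain stdClass object stands in for the page.

    ```php
    <?php
    // Sketch of the decision index.php makes; the 'default' layout name is an assumption.
    function resolveLayoutFile(object $page, string $layoutDir, string $default = 'default'): string
    {
        // Use the layout the controller asked for, or fall back to the default one.
        $layout = empty($page->layout) ? $default : $page->layout;
        // basename() keeps the value from pointing outside the layouts directory.
        return rtrim($layoutDir, '/') . '/' . basename($layout) . '.php';
    }

    $page = new stdClass();
    $page->layout = 'frontpage'; // what a controller such as home.php would set
    echo resolveLayoutFile($page, '/templates/markup/layouts'), "\n";

    // No layout set by the controller: the default layout is used.
    echo resolveLayoutFile(new stdClass(), '/templates/markup/layouts'), "\n";
    ```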

    "layout" does not have to be a field in any template as you can set any key to a value in a Page object by saying $page->key = "value". That is equivalent to saying $page->set("key", "value"). Fields in the template assigned to this page can be used in the same way, but as I said, there does not have to be such field defined beforehand.

    Of course, values assigned to keys that don't map to any field in the template don't get saved to the database, even when $page->save() is called.
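
    A toy model may help to picture this. ToyPage below is written just for this example and has nothing to do with the real Page class internals: any key can be set and read back, but only keys that belong to the template's fields are included in what gets stored.

    ```php
    <?php
    // Toy model, not ProcessWire's Page class: values live in one array,
    // but only keys that are fields of the template are persisted.
    class ToyPage
    {
        private $data = [];
        private $templateFields;

        public function __construct(array $templateFields)
        {
            $this->templateFields = $templateFields;
        }

        public function set(string $key, $value): self
        {
            $this->data[$key] = $value;
            return $this;
        }

        // $page->key = "value" ends up in set(), so both forms are equivalent.
        public function __set($key, $value) { $this->set($key, $value); }
        public function __get($key) { return $this->data[$key] ?? null; }

        // Returns what would be written to storage: template fields only.
        public function save(): array
        {
            return array_intersect_key($this->data, array_flip($this->templateFields));
        }
    }

    $page = new ToyPage(['title', 'body']);
    $page->title = 'Home';
    $page->layout = 'frontpage';   // same as $page->set('layout', 'frontpage')
    $saved = $page->save();        // 'layout' is readable at runtime but not stored
    print_r($saved);
    ```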

    Hope this clarifies it a bit.

    • Like 2
  5. This module creates a page in the ProcessWire admin where you can test selectors and browse page data and properties without editing a template file or a bootstrapped script.

    The given selector string is used as is to find pages - only a limit is added, if given. Errors are caught and displayed, so anything can be tested. Pages found with a valid selector are listed (id, title) with links to the page.

    I was thinking this would be useful for someone new to ProcessWire, but it turns out I'm using it myself all the time. Maybe someone else finds it useful as well. :)

    Module can be downloaded here: https://github.com/n...essSelectorTest

    Modules directory: http://modules.processwire.com/modules/process-selector-test/

    Features

    • Edit selector string and display results (and possible errors as reported by ProcessWire)
    • Explore properties and data of matching pages in a tree view
      • Language aware: multi-language and language-alternate fields supported
      • Repeater fields and values
      • Images and their variations on disk
      • More data is loaded on-demand as the tree is traversed deeper
    • Quick links to edit/view pages, edit templates and run new selectors (select pages with the same template or children of a page)
    • Page statuses visualized like in default admin theme
    • Add pagination


    • Like 15
  6. I think it's the way PW handles selector values, see my comment on the issue at GitHub (for some reason I thought it was better to comment there at first - could have said something here as well).

    That field (pages.name) is a varchar(128) but whenever a page name that can be interpreted as an integer is given, the value in the SQL query is not quoted and thus not interpreted as a string as it should be when the field is declared as varchar.

    This may get tricky so I didn't want to go any deeper myself - I'll let Ryan do his magic :).

  7. A recent post on PageArray find() made me try and fix a couple of things discovered there (and in some earlier posts, I think). What I came up with is a few modifications to WireArray class (base class for PageArrays):

    • sort() accepts multiple fields to sort by. Fields can be given as a comma separated string "-modified, title" or an array of strings array("-modified", "title")
    • find() supports start=n and multiple sort=field selectors
    • some optimizations for sorting and limiting even quite big arrays without losing performance in certain conditions
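
    For anyone wondering what sorting by multiple fields means in practice, here is a minimal sketch using plain arrays. It is not the code from the attached Array.php; the function name sortByFields() is made up, and stability is obtained by using the original position as the last tie-breaker.

    ```php
    <?php
    // Sketch only, not the attached Array.php. Sorts rows (associative arrays)
    // by several fields; a leading "-" means descending.
    function sortByFields(array $rows, $fields): array
    {
        // Accept both "-modified, title" and array("-modified", "title").
        if (is_string($fields)) {
            $fields = array_map('trim', explode(',', $fields));
        }
        // Remember original positions so that ties keep their order (stable sort).
        $indexed = [];
        foreach (array_values($rows) as $i => $row) {
            $indexed[] = [$i, $row];
        }
        usort($indexed, function ($a, $b) use ($fields) {
            foreach ($fields as $field) {
                $desc = strncmp($field, '-', 1) === 0;
                $key = ltrim($field, '-');
                $cmp = $a[1][$key] <=> $b[1][$key];
                if ($cmp !== 0) {
                    return $desc ? -$cmp : $cmp;
                }
            }
            return $a[0] <=> $b[0];
        });
        return array_column($indexed, 1);
    }

    $pages = [
        ['title' => 'B', 'modified' => 2],
        ['title' => 'A', 'modified' => 2],
        ['title' => 'C', 'modified' => 5],
    ];
    $sorted = sortByFields($pages, '-modified, title');
    echo implode(', ', array_column($sorted, 'title')), "\n";
    ```

    The newest row comes first, and the two rows with the same modified value are ordered by title.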

    The new implementation seems to give the same results as the old one where applicable (given only one field to sort by and no start selector for find()), but more tests are definitely needed. I've tested this on a site with more than 15000 pages, find()'ing 3000 of them at once for sorting/finding with the API calls. Performance is about the same as it was before. Even giving more than one field to sort by doesn't make it noticeably slower if limit is being used as well (those optimizations actually work).

    The replacement for wire/core/Array.php is attached if anyone's interested in trying it out.

    Array.php

    I tried some different methods of sorting as well, but just to find out Ryan had a good reason for all the little things I couldn't get a hold of at first :). PHP sorting methods being unstable must have turned much of my hair gray in the last couple of days...

    While testing this I found out that sorting by a field with empty values at some records would act differently when done at initial $pages->find() (database query that is) compared to $aPageArray->find(). While this isn't affected by the WireArray code, I may dig deeper into it later.

    Giving sort=random works as it did before, but the code doesn't handle the situation where it's not the only sort field given. Left a TODO note in the comments on this (and on the missing trackChange() call as well).

    @ryan: I will try to make a pull request as well just for the fun of it (being new to git). Do whatever you like with the code if you see any potential in it. You may want to rename function stableSort() and its last argument to something else at least. I know I'm not too happy with them.

    • Like 6
  8. That second one was reported by a fellow worker yesterday as well. I'm able to reproduce it simply by dragging a page and dropping it back to its original place. Javascript error is: "Uncaught TypeError: Cannot read property 'error' of null" at ProcessPageList.js:655. At stopMove() ajax post callback gets a null for the data parameter. Error can be fixed simply by replacing "if(data.error)" with "if(data && data.error)". Not sure what has changed though (if anything) and where so that isn't necessarily the right fix either.

    The first one then. The destination parent page has to be selected first (before clicking 'move') for you to be able to move a page there. That new parent page should be highlighted as well. Could that have been the cause for the first problem? Still, it seems to be a bit challenging to find the right place where to drop the page so that it would become a child for that new parent. I guess my fingers aren't working well enough either :).

    I wonder if there was something to be done for this to be easier? (I have a feeling this has been discussed earlier, anyone?)

    • Like 1
  9. I've got a habit of first reading the "definitive documentation", source code that is ;). That's where that filenameExists() came up.

    I mentioned viewable() as it's definitely something that's supposed to be used (as is everything else in the documentation) rather than going for something undocumented. Still, it seemed to me that it's ok to use filenameExists() if that would solve your problem. I'm sure Ryan will tell if I'm wrong.

    • Like 1
  10. I just wrote one answer but erased everything before posting it as I found almost everything in it dead wrong. Another try.

    And then I started to write another explanation, but after checking if I'm reading the latest code decided to give this even one more try. Maybe I shouldn't have tried to answer this now :).

    And here's the answer, finally: you're right in the sense that there actually is a bug in the code you're running. But this issue has been fixed by Ryan a month ago, so updating to the latest version from GitHub should fix your problem. I was reading the old version myself at first too...

    Edit: I did have something right in the first version of my answer. Here goes, again.

    Only one sort selector is supported at the moment, not even "sort=-rank,-created" works. There's a comment in the code saying "Currently only sorts by one field at a time".

    And find() for arrays does not support "start" at all, but this can be achieved by leaving "limit" (and "start") out of the selector and calling $array->slice($start, $limit) on the sorted array instead.
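
    The same sort-first, slice-afterwards idea shown with plain PHP arrays, where array_slice() stands in for $array->slice($start, $limit). The data and the field name "rank" are only examples.

    ```php
    <?php
    // Plain-array stand-in for the $array->slice($start, $limit) approach.
    $items = [
        ['title' => 'a', 'rank' => 3],
        ['title' => 'b', 'rank' => 9],
        ['title' => 'c', 'rank' => 5],
        ['title' => 'd', 'rank' => 7],
    ];

    // Sort first (one field, descending rank), without start or limit.
    usort($items, function ($x, $y) {
        return $y['rank'] <=> $x['rank'];
    });

    // Then take the wanted window: skip $start items, keep at most $limit.
    $start = 1;
    $limit = 2;
    $window = array_slice($items, $start, $limit);
    echo implode(', ', array_column($window, 'title')), "\n";
    ```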

    • Like 1
  11. @everfreecreative: I think Ryan meant what does that given echo output - did you try that exact thing Ryan wrote? If $templates->get() succeeds, it returns a template object and echoing that object should give an id. If it echoes nothing, then the problem is that a template can't be found. Then again, if it does echo a number (I think that's what it resolves to), the problem is somewhere else :). You could also try the same echo with hardcoded template name and tell us results from both (at least this one should output a template id).

    • Like 2
  12. Ryan,

    Found a tiny slip when reading through the source (ServicePages.module). At line 119 given limit is checked against hardcoded 100 instead of configuration variable maxLimit. maxLimit is used correctly to form the error string though. ;)
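
    Just to illustrate the point, a hypothetical helper (not the code in ServicePages.module, and the error text is made up) where the check and the error string both read the same configured value:

    ```php
    <?php
    // Hypothetical helper, not the module's code: the comparison and the
    // message both use the configured maximum instead of a hardcoded 100.
    function checkLimit(int $limit, int $maxLimit): ?string
    {
        if ($limit > $maxLimit) {
            return "Max allowed limit is $maxLimit";
        }
        return null; // no error
    }

    // With maxLimit configured to 200, a limit of 150 is fine;
    // a check against a hardcoded 100 would have rejected it.
    var_dump(checkLimit(150, 200));
    var_dump(checkLimit(150, 100));
    ```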

    Nice module, once again! Haven't used it anywhere yet, just studying to get an upcoming module of mine on the right track (ProcessHello skeleton appeared right on time too).

  13. The error you're getting does not come from the 2nd line, but from the last. $form->render() checks whether a template has been set or not, and now it apparently isn't. As the code looks all right and you're saying that echoing the content "works fine" (I'm presuming something is actually displayed), the problem must be that no template can be found with the given name.

    So, double check that the template name in $page->form_name actually matches an existing template - I'm betting it doesn't. A simple test checking that the chosen template was successfully retrieved would be a good thing to have here. Even if you changed from a text field to some kind of selection from existing templates, there's always the possibility that a previously selected template does not exist anymore.
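
    Such a test could be as small as the sketch below. The $lookup callable stands in for a call like $templates->get($name) and is assumed to return a falsy value when nothing is found; requireTemplate() and the template name are made up for the example.

    ```php
    <?php
    // $lookup stands in for something like $templates->get($name); it is assumed
    // to return a falsy value when no template with that name exists.
    function requireTemplate(callable $lookup, string $name)
    {
        $template = $lookup($name);
        if (!$template) {
            throw new RuntimeException("Template '$name' not found");
        }
        return $template;
    }

    // Fake lookup for the sketch: one known template.
    $known = ['contact-form' => (object) ['name' => 'contact-form']];
    $lookup = function (string $name) use ($known) {
        return $known[$name] ?? null;
    };

    echo requireTemplate($lookup, 'contact-form')->name, "\n";

    try {
        requireTemplate($lookup, 'old-form'); // e.g. a template that was removed later
    } catch (RuntimeException $e) {
        echo $e->getMessage(), "\n";
    }
    ```

    Failing early with a clear message is easier to track down than an error raised later by render().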

  14. I've got an almost similar issue as evan did - except for that I'd need this applied to parent page as well (parent.title instead of referenced_pages.title). So, Ryan, do you think there was a way to achieve this?

    I think so. It was a pretty simple matter supporting the subfields of page references, so I think the same would go for parent. I will look into it here more and hopefully get this added soon. Seems like a very nice thing to have.

    This isn't probably at the top of your todo list, but you wouldn't happen to have any news on this, would you? :)

    I'm still trying to avoid replicating fields from the parent page as there are multiple languages involved as well. Still that may be a better solution than the almost-fixed-sql-statement-thingie-module I currently have...

  15. Sounds like a possible RewriteBase problem to me, so I guess that's what onjegolders is referring to with the "~". Site has been developed at the document root on the localhost and then it's migrated to another server where it's run under a user account, for example. So RewriteBase setting would need to be changed to /~someusername/siteroot.

    • Like 1
  16. Your problem got solved a while ago, but a bit of information was left missing: the actual reason your code didn't work as expected. The original version is a bit complex, yes, but it's actually only a single ampersand away from working. (Well, I didn't read through the output carefully, so there could be some other flaws as well.)

    The problem is that $output is being passed by value to sub() when it needs to be passed by reference for sub() to be able to change it. So changing the function definition to "function sub($page, &$output)" does the trick. More on passing by reference can be found, for example, in the php.net article (not so good a reference though).
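
    A small self-contained illustration of the difference. The tree of nested arrays is a made-up stand-in for pages and their children, and the body of sub() is simplified compared to the original code.

    ```php
    <?php
    // Hypothetical tree of nested arrays standing in for pages and their children.
    $tree = ['title' => 'Home', 'children' => [
        ['title' => 'About', 'children' => []],
        ['title' => 'Blog', 'children' => [
            ['title' => 'First post', 'children' => []],
        ]],
    ]];

    // Without the ampersand each call would append to its own copy of $output,
    // and the caller's variable would stay empty.
    function sub(array $page, string &$output): void
    {
        $output .= $page['title'] . "\n";
        foreach ($page['children'] as $child) {
            sub($child, $output);
        }
    }

    $output = '';
    sub($tree, $output);
    echo $output;
    ```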

    But this is just to point out what was wrong in the first place in case someone was left wondering, as the code may seem right at the first glance. Of course you could modify it a bit in another way as well (returning the $output and assigning it back to where it belongs would work too) but let's leave it here as better solutions were already given by Soma more than a week ago :).

    • Like 2
  17. Thanks Gary, I'm sure your explanation will come in handy when trying to set things up!

    But actually this caught my eye for another reason:

    I was stepping through the code and occasionally, it would just time out. This was using PHP 5.4.4 and XDebug V2.2.0, the latter known to be "buggy". In this case, maybe there was issues with PW running in PHP 5.4.4 or maybe it was XDebug.

    To test, I backed of PHP back to the latest 5.2 release and an appropriate XDebug version. This works like a charm.

    I'm running ProcessWire locally on a Mac (OS X 10.6.8.) with PHP 5.3.8 and it seems to stall mysteriously every once in a while. It doesn't exactly time out, but loading a page just takes long, say 20 seconds or more. No errors or warnings can be found from error logs and everything else seems to be responding normally. I was just wondering could this be the same thing you were experiencing with the debugger? I've seen this behavior only on my local machine but with clean installations as well as with a bigger customer project. I think I'll have to look into this a bit deeper to find out what might be causing this. It does seem a bit odd though this hasn't come up before (or has it? anyone?), as there seems to be quite a few other PW users on a Mac as well.

    I was amazed that debugging seemed so alien to PW users. I really hate coding and hoping for the best so give me the ability to debug every time because when there's a problem, there's nothing easier than sticking a break-point in the code and seeing exactly what is going on.

    As Pete just said, web developers quite often tend to use methods other than debuggers or even IDEs. It's often just easier to try things out and add a few lines of echos to debug than to run a debugger - especially if that means having to run a not-so-lightweight IDE as well. And when it comes to features with Javascript and/or strange CSS behavior, we're out of a PHP debugger's scope anyway.

    But I'm not trying to say one way is any better than the other. I guess it all comes down to what you're accustomed to and what makes you more productive. And of course it's about choosing from a set of tools depending on the problem you're solving. For me, being far more a programmer (or even a software architect if I may choose ;) ) than a web designer, it makes sense to use a debugger every once in a while. Having said that, there isn't a single web site I've ever built from scratch all by myself and that probably explains a lot :). My cup of tea is the things under the surface: making things possible (been building a CMS for some years), integrations, imports/exports, (web) applications, and so on. And with these kinds of things, using a debugger would most probably have saved me a few hours every now and then if I had ever learned to use one properly.

    Anyway, my point must have been that it's great that ProcessWire doesn't make it impossible or even hard to use a debugger if needed. Being able to use the tools and processes you're familiar with may be crucial for anyone new to PW.

  18. Hmm, this sounds interesting. I haven't been using a debugger recently, actually ever with PHP, but I think I would find it a very useful tool for some situations even in this context.

    @Gazley: You might be in the minority with this kind of workflow, especially when it comes to everyday tasks, but I'm sure you're not alone.

    Could you maybe give a little explanation on which steps you went through to set it up? I'm guessing there were some parts you got right with trial and error, so it could be useful for someone else (like myself maybe ;) ) trying to get there to know where there may be some pitfalls along the way.

    • Like 1
  19. Nice work Nik and a quick fix / improvement by Ryan. @Ryan - maybe you should change Niks' forum status from "Newbie" to something more reflective of his skill. A manual override. ;)

    Well thank you Jeff. I think I'll just have to write a couple of more posts here at the forum to get rid of that status - but not only nonsense like this one :).

    • Like 1
  20. body|title|referenced_pages.title%=$query

    I've got an almost similar issue as evan did - except for that I'd need this applied to parent page as well (parent.title instead of referenced_pages.title). So, Ryan, do you think there was a way to achieve this?

    I've got a page structure of this kind: /countries/<country>[/<state>]/<city>/<data> (yes, the same again =) and would like to list data pages having search terms in their title OR in their parents (city page) title, sorted by the titles of the found data pages. Having thousands of data pages a lightweight pagination support is a must (one implemented at the database level that is) and making two different finds and joining the results afterwards would leave me without such.

    I think the proper way around this would be to use either-or selectors (as evan had tried in the first place) but they're only mentioned in the roadmap for PW versions 2.4+.

    My current solution is a module with a horrible hack and plenty of copied core code letting me do a find operation using plain SQL (UNION of the two find statements). It sure does work at the moment but as the site evolves, my mostly hardcoded SQL statements will most probably fail sooner or later. So, what I'm working on now is a more generic module implementing one variation of either-or selectors. It seems to be possible to get working, but has even more copied core code.

    But now this recent development on Page fields left me thinking there may be some other solution to my problem, an easier one I hope :). And at least it's time to get a second opinion on how to try to solve this. I'm eager to make this into a module (or even core code) if there happens to be a suitable solution available - one that I'm able to implement with a little guidance to the right direction.

    So what do you think, Ryan and others as well?
