Webrocker Posted November 26, 2015 Share Posted November 26, 2015 Hi, consider this setup: - "Shows" having a certain "Event"; stored in a page field "event_relation" - "Events" having multiple "Shows"; stored in a page field "show_relation" There are about 86,000 shows and 17,000 events in the database, together with the normal cms pages there are over 100,000 rows in the pages table. I now query first the shows matching a certain criteria, and then query the events having these shows, in order to use their data for display. I had the server timing out several times; even with memory settings maxed up. I created the following bit for testing, already using a raw sql query to get to the page ids of the shows, then using this in a "pw-style" selector for the events. echo '<h1>RAW mySQL</h1>'; $start = microtime(true); $sql = 'SELECT pages_id FROM field_date WHERE data>"2015-11-28" AND data<"2016-01-15"'; $query = $database->prepare($sql); $query->execute(); $res = $query->fetchAll(PDO::FETCH_COLUMN); $result = count($res); $end = microtime(true); $diff = $end - $start; $res1 = implode('|',$res); echo '<p>Result-Count: ' . $result . '</p>'; echo '<p>Duration: ' . $diff . ' Seconds.</p>'; echo '<h1>PW</h1>'; // process-wire: $start = microtime(true); $res = $pages->find('template=event_item,shows_relation='.$res1); $result = count($res); $end = microtime(true); $diff = $end - $start; echo '<p>Result-Count: ' . $result . '</p>'; echo '<p>Duration: ' . $diff . ' Seconds.</p>'; # RAW mySQL Result-Count: 26770 Duration: 0,021031141281128 Seconds. # PW Result-Count: 7229 Duration: 40,812306880951 Seconds. Look at this -- 40 seconds, on a root server in a testing environment, no "real" traffic interfering. O_o I had a look into mySQLs log files and processlist, and I noticed that the query that PW is sending contains LOTS of OR statements like this: ... OR ((field_shows_relation.data='103472') ) OR ((field_shows_relation.data='103473') ) OR ((field_shows_relation.data='103477') ) OR ((field_shows_relation.data='103478') ) OR ((field_shows_relation.data='103479') ) OR ((field_shows_relation.data='103480') ) OR ((field_shows_relation.data='103481') ) OR ((field_shows_relation.data='103482') ) OR ((field_shows_relation.data='103483') ) OR ((field_shows_relation.data='103486') ) OR ((field_shows_relation.data='103490') ) OR ((field_shows_relation.data='103491') ) OR ((field_shows_relation.data='103492') ) OR ((field_shows_relation.data='103496') ) OR ((field_shows_relation.data='103497') ) OR ((field_shows_relation.data='103515') ) OR ((field_shows_relation.data='103518') ) OR ((field_shows_relation.data='103526') ) OR ((field_shows_relation.data='103527') ) OR ((field_shows_relation.data='103528') ) OR ((field_shows_relation.data='103529') ) OR ((field_shows_relation.data='103530') ) OR ((field_shows_relation.data='103531') ) OR ((field_shows_relation.data='103532') ) OR ((field_shows_relation.data='103533') ) OR ((field_shows_relation.data='103534') ) OR ((field_shows_relation.data='103535') ) OR ((field_shows_relation.data='103536') ) OR ((field_shows_relation.data='103537') ) OR ((field_shows_relation.data='103539') ) .... in the example above like 26,770 rows of these "OR" statements. I don't know if this is the reason why those queries take a long time, and I am by no means skilled in query-crafting, but wouldn't it be easier to have an "IN( a,b,.....,n )" statement for that? The problem is, we are using quite a lot of these "page" fields to reference lots of shared data like venues and locations - and some of these use other page relations and so the resulting queries fired by the PW selectors are very very long and complex, and so the percieved performance of the page using this is rather sluggish. As long as the data queried stays under like 1000 results, everything is ok, but the it starts to increasingly getting slower. Has someone else had similar experiences? Should we stay away from the API selectors for such an amount of data or have we shot ourselfs in the foot architecture-wise by using all those page-fields? thanks, Tom Link to comment Share on other sites More sharing options...
Wanze Posted November 26, 2015 Share Posted November 26, 2015 Hi, The problem is that $pages->find() returns Page objects, it takes a lot of memory and time to build > 7000 objects. The solution is to paginate your results. Are you displaying over 7000 results on your site? 1 Link to comment Share on other sites More sharing options...
Webrocker Posted November 26, 2015 Author Share Posted November 26, 2015 Hi Wanze, thx for the fast reply … the problem is, the queries are timing out even before the end result (which is paginated) will be returned -- it seems that the selectors that are querying the page-relation fields try to look up several thousand page objects in between - even if the result in the end is only a handful of matching pages… :-/ but again, maybe there is a problem in the way we are trying to get to the information, not in the way pw returns it. For the moment I was wondering wether the repeated "OR" lines in the sql statement can make a difference… thx Tom Link to comment Share on other sites More sharing options...
Webrocker Posted November 26, 2015 Author Share Posted November 26, 2015 to illustrate the above; this is with $res = $pages->find('template=event_item,shows_relation='.$res1.',limit=50'); # RAW mySQL Result-Count: 26770 Duration: 0,030179023742676 Seconds. # PW Result-Count: 50 Duration: 48,704106092453 Seconds. Tom Link to comment Share on other sites More sharing options...
Craig Posted November 26, 2015 Share Posted November 26, 2015 Is there a specific reason you're running a MySQL query first? Could you rewrite this part to use one PW selector? Such as... template=event_item, shows_relation.date>2015-11-28, shows_relation.date<2016-01-15 5 Link to comment Share on other sites More sharing options...
Webrocker Posted November 26, 2015 Author Share Posted November 26, 2015 Hi Craig, wow, this simple modification sped up the query quite a bit O_o ) $res = $pages->find('template=event_item, shows_relation.date>2015-11-28, shows_relation.date<2016-01-15'); # RAW mySQL Result-Count: 26770 Duration: 0,023936033248901 Seconds. // not used in the following: # PW Result-Count: 7342 Duration: 5,9315440654755 Seconds. THANK YOU. Link to comment Share on other sites More sharing options...
Craig Posted November 26, 2015 Share Posted November 26, 2015 No problem Tom That's definitely an improvement, but personally I think nearly 6 seconds is still a bit long for a query to execute that would be displayed on a page request. I would suggest limiting it to X number of items first to see if that improves it. There may be other optimisations that can be made as well, or caching could be introduced depending on the use case of the data, how frequently it changes or how dynamic it all needs to be. 4 Link to comment Share on other sites More sharing options...
Webrocker Posted November 26, 2015 Author Share Posted November 26, 2015 hi craig, yes, the "real" one uses a limit. if set to 20, this takes around two seconds. do you happen to know of a way to "see" the mysql queries PW is sending? if I introduce a sort=shows_relation.date to the above, it again will time out… thx Tom Link to comment Share on other sites More sharing options...
bernhard Posted November 29, 2015 Share Posted November 29, 2015 If you turn on debugging you can see the MySQL queries on the bottom of the admin Link to comment Share on other sites More sharing options...
Webrocker Posted November 29, 2015 Author Share Posted November 29, 2015 Hi BernhardB, is it possible to have this enabled for the front end as well? thxTom Link to comment Share on other sites More sharing options...
kongondo Posted November 29, 2015 Share Posted November 29, 2015 Yes. Use getQueryLog() (will only work with debug on) See example code here: https://processwire.com/talk/topic/11483-duplication-in-query-results/?p=106994 2 Link to comment Share on other sites More sharing options...
Webrocker Posted November 29, 2015 Author Share Posted November 29, 2015 thank you, @kongondo, this will be very useful! Link to comment Share on other sites More sharing options...
MindFull Posted November 30, 2015 Share Posted November 30, 2015 You could also turn on mySQL's general_log and then tail -f the log file while you step through your code during debug. 1 Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now