
Paginating 5 million pages.


elabx

Hi everyone! I've been tasked with paginating 5 million pages, and things are getting a bit slow here lol

I found out that you can disable counting within the selector, and that brings huge performance benefits, basically making the queries instant.

I'd still like to have pagination, though, especially because the data sets won't change often, so I only need to count once in a while. As it stands, the find() call seems to hit the database and count on every request.

Maybe I'm missing something and there is a way for the count value to stay cached? Would you recommend hacking a bit into the pagination module?


I guess you could disable counting in your selector and get the count of matching pages separately via $pages->count(), on the first page only. Then pass the count along in the query string (or store it in $session) and use PaginatedArray::setTotal() to set the total on the PageArray for each paginated request. And if necessary, you can fake the pagination entirely, as shown by Ryan here:
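For anyone who wants to see the count-once idea in isolation, here is a rough, language-neutral sketch in Python with sqlite3. It is not ProcessWire code and not the example from the linked topic. The `pages` table, `make_demo_db`, `CachedCount` and `fetch_page` are all made-up names for illustration, and the one-hour default lifetime is just an assumption. The point is only that the expensive COUNT(*) runs once and gets reused, while each page request runs a cheap LIMIT query.

```python
import sqlite3
import time


def make_demo_db(n=1000):
    """Build a small in-memory table standing in for a large page set (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO pages (id, title) VALUES (?, ?)",
        ((i, f"Page {i}") for i in range(1, n + 1)),
    )
    return conn


class CachedCount:
    """Remember the result of COUNT(*) for `ttl` seconds instead of recounting per request."""

    def __init__(self, conn, ttl=3600.0, clock=time.monotonic):
        self.conn = conn
        self.ttl = ttl
        self.clock = clock
        self._cached = None      # last known total, or None if never counted
        self._expires_at = 0.0   # clock value after which the total is stale
        self.db_hits = 0         # how many times the real count query ran

    def total(self):
        now = self.clock()
        if self._cached is not None and self._expires_at > now:
            return self._cached
        # The expensive part: only reached on the first call or after expiry.
        self._cached = self.conn.execute("SELECT COUNT(*) FROM pages").fetchone()[0]
        self._expires_at = now + self.ttl
        self.db_hits += 1
        return self._cached


def fetch_page(conn, page_num, per_page=25):
    """Fetch one page of rows without counting anything."""
    offset = (page_num - 1) * per_page
    return conn.execute(
        "SELECT id, title FROM pages ORDER BY id LIMIT ? OFFSET ?",
        (per_page, offset),
    ).fetchall()
```

The pager would then be built from `counter.total()` and the rows from `fetch_page()`, which is roughly the role the post gives to $pages->count() plus PaginatedArray::setTotal().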

 


With 5 million pages, do people actually click through however many pages of results you get? Generally, offset/limit pagination scales quite badly because the database still has to step through all the rows skipped by the offset. So the higher the page number (offset), the slower the query gets. For big datasets, cursor-based pagination is usually advised, but in ProcessWire you'd need custom SQL for that. It also no longer gives you pagination in terms of "you're on page 6043 of 20383"; you can only do next/prev. But from a UX point of view, page numbers that big aren't useful in the first place. Having a means of filtering down to a more manageable result set is what I would strive for instead.
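To illustrate the difference between the two approaches, here is a minimal sketch in Python with sqlite3. It is a generic illustration, not ProcessWire code and not the custom SQL you would actually need there. The `pages` table and the function names are invented for the example, and it assumes the results are ordered by a unique, indexed `id` column.

```python
import sqlite3


def make_demo_db(n=100):
    """Small in-memory stand-in for a big table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO pages (id, title) VALUES (?, ?)",
        ((i, f"Page {i}") for i in range(1, n + 1)),
    )
    return conn


def page_by_offset(conn, page_num, per_page=25):
    """Offset/limit: the database walks past every skipped row before returning any."""
    offset = (page_num - 1) * per_page
    return conn.execute(
        "SELECT id, title FROM pages ORDER BY id LIMIT ? OFFSET ?",
        (per_page, offset),
    ).fetchall()


def page_after(conn, last_id=0, per_page=25):
    """Cursor (keyset): seek via the index straight to the rows after the last id seen."""
    rows = conn.execute(
        "SELECT id, title FROM pages WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, per_page),
    ).fetchall()
    # The cursor handed to the next request is simply the last id on this page.
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor
```

Both functions return the same rows for a given page, but `page_after` only needs the last id from the previous page, which is why it can offer next/prev and not "page 6043 of 20383".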


7 hours ago, LostKobrakai said:

With 5 million pages, do people actually click through however many pages of results you get? Generally, offset/limit pagination scales quite badly because the database still has to step through all the rows skipped by the offset. So the higher the page number (offset), the slower the query gets. For big datasets, cursor-based pagination is usually advised, but in ProcessWire you'd need custom SQL for that. It also no longer gives you pagination in terms of "you're on page 6043 of 20383"; you can only do next/prev. But from a UX point of view, page numbers that big aren't useful in the first place. Having a means of filtering down to a more manageable result set is what I would strive for instead.

Agree on all your points! Thanks for the feedback!


  • 2 weeks later...

I'm basically removing pagination because I feel @LostKobrakai's comments on UX make total sense, and I'd rather prioritize search, which is actually fast enough with simple selectors. As a replacement for pagination, I will just go with a "Next" button.
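One common way to drive a "Next" button without any counting is to ask for one row more than you display. A rough sketch in Python with sqlite3, just to show the trick (the `pages` table and the function names are hypothetical, and this is not ProcessWire code):

```python
import sqlite3


def make_demo_db(n=50):
    """Small in-memory stand-in for the real data (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO pages (id, title) VALUES (?, ?)",
        ((i, f"Page {i}") for i in range(1, n + 1)),
    )
    return conn


def fetch_with_next(conn, offset=0, per_page=25):
    """Return (rows, has_next) without running a COUNT query."""
    # Ask for one extra row; if it comes back, there is at least one more page.
    rows = conn.execute(
        "SELECT id, title FROM pages ORDER BY id LIMIT ? OFFSET ?",
        (per_page + 1, offset),
    ).fetchall()
    has_next = len(rows) > per_page
    # Only the first per_page rows are shown; the extra one is just a probe.
    return rows[:per_page], has_next
```

The template would render the "Next" link only when `has_next` is true, so the last page never shows a dead link.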

How is search speed looking for you (if you are doing it)?


On 8/28/2020 at 10:25 AM, Erik Richter said:

Search is fast, as long as one picks at least one filter/selector. Displaying all 1.2 million results takes about 14-16 seconds, even with the limit and pagination. I guess it's because of the count?

You can add get_total=0 to the selector so that it doesn't do the counting; that gives the find() call much better performance.

Check this topic: 

 

I still wonder whether there is a way MySQL could cache that count. Just out of curiosity.
