elabx

Paginating 5 million pages.


Hi everyone! I've been handed the task of paginating 5 million pages and things are getting a bit slow here, lol.

I found out that you can disable counting within the selector, and that brings huge performance benefits, basically making the queries instant.

I'm just wondering because I'd still like to have pagination, especially since the sets of data won't change often, so I really only need the count once in a while, yet the find() call seems to hit the database and run the count on every request.

Maybe I'm missing something and there is a way for the count value to stay cached? Would you recommend hacking a bit into the pagination module?
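
For reference, this is roughly what I mean (a minimal sketch; "item" is a placeholder template and get_count=0 is the selector option I'm referring to):

// Counts the total matches on every request, which gets slow with millions of pages:
$items = $pages->find("template=item, sort=-created, limit=50");

// Skips the counting, so the query comes back almost instantly:
$items = $pages->find("template=item, sort=-created, limit=50, get_count=0");

// ...but then the pager no longer knows the real total:
echo $items->renderPager();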


I guess you could disable counting in your selector and get the count of matching pages separately via $pages->count() on the first page only. Then pass the count in the query string (or store it in $session) and use PaginatedArray::setTotal() to set the total count on the PageArray on each pagination. And if necessary you can fake the pagination entirely, as shown by Ryan here:
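
In code it could look something like this (an untested sketch; the "item" template and the session key are placeholders, and get_count=0 is the counting option discussed in this thread):

// Skip the automatic total count on the find() itself
$results = $pages->find("template=item, sort=-created, limit=50, get_count=0");

// Count once (on the first page, or whenever the session doesn't have it yet)
if($input->pageNum == 1 || !$session->get('itemTotal')) {
    $session->set('itemTotal', $pages->count("template=item"));
}

// Put the real total back on the PageArray so MarkupPagerNav renders the page numbers
$results->setTotal((int) $session->get('itemTotal'));

echo $results->renderPager();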

 


With 5 million pages, do people actually click through who knows how many pages you get? Generally, offset/limit pagination scales quite badly because the database always has to scan past the offset rows as well, so the higher the page number (offset), the slower the query gets. For big datasets, cursor-based pagination is usually advised, but in ProcessWire you'd need custom SQL for that. It also no longer gives you pagination in the sense of "you're on page 6043 of 20383"; you can only do next/prev. But from a UX point of view, page numbers that big aren't useful in the first place. Having a means of filtering down to a more manageable result set is what I would rather strive for.
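
Very roughly, cursor-based pagination with custom SQL could look something like this (just a sketch, not tested; the templates_id value and the "after" GET parameter are placeholders, and it assumes the page id is an acceptable sort order):

$limit  = 50;
$lastId = (int) $input->get('after'); // id of the last item on the previous page, 0 on the first request

$sql = "SELECT id FROM pages WHERE templates_id=:tpl " .
       ($lastId ? "AND id < :after " : "") .
       "ORDER BY id DESC LIMIT $limit"; // cursor on the primary key instead of OFFSET

$query = $database->prepare($sql);
$query->bindValue(':tpl', 29, \PDO::PARAM_INT); // 29 = placeholder template id
if($lastId) $query->bindValue(':after', $lastId, \PDO::PARAM_INT);
$query->execute();

$ids   = $query->fetchAll(\PDO::FETCH_COLUMN);
$items = $pages->getById($ids); // turn the ids back into Page objects

// Cursor for the "Next" link: ?after=<id of the last item on this page>
$nextCursor = count($ids) ? (int) end($ids) : null;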

7 hours ago, LostKobrakai said:

With 5 million pages, do people actually click through who knows how many pages you get? [...] Having a means of filtering down to a more manageable result set is what I would rather strive for.

Agree on all your points! Thanks for the feedback!


Hey @elabx, what solution did you come up with in the end?

I am also paginating 1.2 million pages (when no filters have been selected), and the response time is about 15 seconds, which is not usable.

Let me know if you found a good solution - thank you!!


I'm basically removing pagination, because I feel @LostKobrakai's comments on UX make a lot of sense, and I'll instead prioritize search, which is actually fast enough with simple selectors. As a pagination replacement I'll just go with a "Next" button.

How is search speed looking for you (if you're doing it)?


@elabx thanks for the fast response!

Search is fast, as long as one picks at least one filter/selector. Displaying all 1.2 million results takes about 14-16 seconds, even with the limit and pagination. I guess it's because of the count?

5 minutes ago, Erik Richter said:

Displaying all 1.2 million results takes about 14-16 seconds, even with the limit and pagination. I guess it's because of the count?

You can pass an option in the selector like this: get_count=0, so it doesn't do the counting; that will get the find() to perform much better.

Check this topic: 

 

I'd still wonder if there isn't a way MySQL could cache that count? Just out of curiosity.

