Jump to content
michelangelo

Displaying thousands of records

Recommended Posts

Hello guys, I am building a sort of an archive. Relatively simple, although I have about 8000 records, each with 15 fields (text, int, images, url). I created a crude search system with a form (emulating the famous Skyscrapper example) to filter through the system. Everything works but it is quite slow... I have 2 questions which are related:


1. How can I search through the database?
2. What is a good practice to display many records like these?

-----------------------------------------

1. I am retrieving the results with

$songs = $pages->findMany('template=nk-song');

Then I do a foreach to render them all. I am unsure if that is a good way. If I render all of them on the page, it creates thousands of divs with a bit of text, and this can take a while (10s-15s).

 

2. This one is even worse :D as every time I retrieve my desired records with something like this:

$page->find("field_to_search_through~=my_query_string")

I get between 20 and 200, but when I render them I am creating iframes with YouTube videos and that can take up to 10s to finish. I "solved" it by only loading the iframes if they are in view with IntersectionObserver on the client-side. But I feel there is a more precise PHP / ProcessWire approach.
 

Just to clarify, I started doing all of this custom rendering and querying because tools like ElasticSearch or SearchEngine were a bit complicated and I needed a simple to retrieve information and then display it in my own way.

Thank you!

Share this post


Link to post
Share on other sites

Hi @michelangelo,

what's about pagination? As you mentioned the skyscraper site, if I remember right, it also uses pagination.

And besides pagination you should use caching. If possible, you should use ProCache. But there is also template caching and more available.

  • Like 2

Share this post


Link to post
Share on other sites

Hi @elabx, this looks great! I will see if I can implement it, it seems like just the right thing I need. Do I understand correctly that PW find() is not a similar to an SQL query?
Thank you, @horst for the suggestion, my design doesn't really take pagination into account. I am thinking of something more like Instagram scrolling, just simple results after each other.

And the caching... I am really overwhelmed how to use this. Do I cache every possible search result so it can be served quickly :D ? I really cannot understand what caching does. If it is a static page, I imagine that the server renders it and it saves it so it can serve it easily, but if I don't have a 'fixed' search result, how do I use caching? ProCache seems like a great tool, but I don't want to purchase it if I cannot make it work. What workflow would that entail?

Share this post


Link to post
Share on other sites
4 hours ago, michelangelo said:

Do I understand correctly that PW find() is not a similar to an SQL query?Do I understand correctly that PW find() is not a similar to an SQL query?

A ProcessWire find will perform an SQL query. The important thing (in terms of performance and tools like RockFinder) is what happens next. SQL queries are normally very fast. However, if you then load the results into Page objects, you might get a performance hit depending on the number of results you fetched. RockFinder does not load Page objects, hence remains very fast and performant. It is very much suited to handling large datasets.

  • Like 5

Share this post


Link to post
Share on other sites

Hi @michelangelo,

The fastest way to achieve the effect of Instagram feed, is to use Processwire Pagination with Ajax Infinity Scroll, for example https://infiniteajaxscroll.com

There is no silver bullet to take thousands of results form any kind of database. You should take it part by part and Processwire Selectors with Limits are good solution or prerender them as static data and keep them cached, as guys above said.

BTW, 1,000 divs on one page is realy bad idea in terms of browser capacability. Just open Instagram, scroll hounded posts and take a look how many resources it will takes from your console.

Good luck
Alexander

 

 

  • Like 6

Share this post


Link to post
Share on other sites

Thank you lads, sounds great. Now I got a better understanding. @Alexander yeah, 1000 divs is not the best :| I will make use of the template caching and just save the output of this page. And I maybe misrepresented the concept. It's just a list of all the 1000s of songs I have in my db. So technically it's more like a div with 1000s of spans or a's. Thanks for the ajax idea :))

Does the cache option on the templates relate to the $cache()? I enabled it for my templates and it seems to make everything faster, but I am randomly picking options (e.g. saving it for 1 hour!? ) and I am not sure how to continue...

  • Like 1

Share this post


Link to post
Share on other sites

Update on my process, @elabx I am trying to use RockFinder3 but I can't make it work. It seems like it's great for visualising results, but what I need is just an array or PageArray of my results which I can visualise. Also the SQL queries didn't work the way I intended... It's just too unclear to me how to use the module.

Share this post


Link to post
Share on other sites
6 hours ago, Alexander said:

The fastest way to achieve the effect of Instagram feed, is to use Processwire Pagination with Ajax Infinity Scroll, for example https://infiniteajaxscroll.com

Reading more on your project I think this is the right way to do it, RockFinder and RockGrid are very oriented in display the data in one shot using the browser for what I understand. You shouldn't really have that much issues using the built in pagination. I have a project where I paginate about 2 million pages and works (not blazing fast) but it does work at a reasonable speed. As @horst suggests, ProCache makes a whole world of difference too. 

 

  • Like 2

Share this post


Link to post
Share on other sites

RockFinder can massively reduce search times. I've used it to reduce complex searches from several seconds to fractions of a second.

However, if your searches aren't too complex, even on a fairly large number of pages the difference could be between part of a second using PW selectors and a very small part of a second using RockFinder. This isn't really going to solve your problem.

So I'd suggest sorting out the rendering first, as suggested by others, with RockFinder as a possible later refinement.

  • Like 4

Share this post


Link to post
Share on other sites

Thank you, @billH! That's sounds like a good plan! I think I get how PW selectors like find() works. I also tried direct SQL search, but I don't know where to direct my search to as the architecture of PW is not clear to me. Usually you select a table from the database and extract certain column or row, etc. but PW creates some crazy relations that really mess up my understanding.

 

Share this post


Link to post
Share on other sites
Just now, Sergio said:

There's also this cool module by @teppo that let you create a search index to speed up things immensely on some cases.

https://modules.processwire.com/modules/search-engine/

In the near future I plan to study on using Teppo's module to build an index directly to a Meilisearch instance running on the same server. I used it on a recent Laravel project and, although I only scratched the surface of its capabilities, the partial match and typo toleration is worth a try IMHO. Also it was a breeze configuring it.

  • Like 3

Share this post


Link to post
Share on other sites

Hey @Sergio Search Engine and Meilisearch sound amazing, but they always render a form in a particular way. My design is simple, but quite specific and I want to render the results right away (imagine Google Image search or YouTube video search). Those intermediate "results page" are in the way for my project. I can try again Search Engine but it was a bit hard for me...

Also because I just starting getting into PW PageArrays and those kind of data structures and Seach Engine gives me some JSON that I have to render 😝

Share this post


Link to post
Share on other sites

One thing is to load the search result in memory on PHP side on a $pages->find, another is render the HTML in the browser.

There's no way to render a HTML node tree of that size without some solutions like image lazy loading through a script, especially on mobile browsers. Nowadays it's partially supported natively, but for most cases you need to rely on Javascript. Google Images loads more images as you scroll the page, you can see this better if you slow down your connection using Chrome Devtools, for instance.

  • Like 3

Share this post


Link to post
Share on other sites
11 hours ago, michelangelo said:

Hey @Sergio Search Engine and Meilisearch sound amazing, but they always render a form in a particular way. My design is simple, but quite specific and I want to render the results right away (imagine Google Image search or YouTube video search). Those intermediate "results page" are in the way for my project. I can try again Search Engine but it was a bit hard for me...

Also because I just starting getting into PW PageArrays and those kind of data structures and Seach Engine gives me some JSON that I have to render 😝

Hey there! Just wanted to drop a quick comment 🙂

SearchEngine provides two built-in methods for search form rendering:

  • Highly customizable "basic form". You can probably get the best idea of this by taking a look at these lines in SearchEngine.module.php. I'd actually be surprised if there's something you can't customize.
  • Less customizable (unless you really enjoy hooking into form rendering) "native ProcessWire form". I know that some folks prefer these (I don't), so I thought it would be a nice addition.

... and, of course, you don't have to use either one. Bring your own markup. SearchEngine doesn't really care about the form — there's no CSRF or anything involved, it's just a search feature 🙂

As for the results list, there's the "easy way" where you let SearchEngine handle everything for you. Again, this is quite customizable via config settings and hooks, if you've got something specific in mind. JSON is also an option, but this is mostly intended for cases where you want to utilize SearchEngine as a part of an API, or perhaps create a fully customized JS search form feature.

The biggest things lacking form SearchEngine (in comparison to tools like Meilisearch, or more heavy-weight indexers such as Solr, Elasticsearch, RediSearch, etc.) are the ability to normalize queries, perform complex lookups from the index (SE is basically just using ProcessWire's selectors and that's it), smart weighting of results based on relevance (this is on my to-do list), and perhaps really snappy real-time results (you can perform asynchronous queries, but they go through your site, which means that they have some overhead; tools like Meilisearch can do this much more efficiently).

Anyway, just wanted to let you know that there are different options there. I don't actually know your case well and can't say if SearchEngine is the correct solution, but if it's UI level customization you're after, that shouldn't be an issue. And if it is, please let me know what's missing and I'll be happy to take a closer look; perhaps it's something that others might find useful as well 😛

  • Like 4

Share this post


Link to post
Share on other sites

@teppo thank you so much for the comment 🙏 I actually really enjoyed the module, I spoke too soon because I am a bit levitating between options at the moment. Now with your suggestions in mind, hopefully I can manage it better. I imagine that I can do a query like this, for example?

$pages->find('template=song, year>1990, year<=2020, search_index~=my_query_string')

 

Share this post


Link to post
Share on other sites
On 6/15/2020 at 9:05 PM, michelangelo said:

Thank you, @billH! That's sounds like a good plan! I think I get how PW selectors like find() works. I also tried direct SQL search, but I don't know where to direct my search to as the architecture of PW is not clear to me. Usually you select a table from the database and extract certain column or row, etc. but PW creates some crazy relations that really mess up my understanding.

 

There's no need at all to understand the PW architecture or database structure and relations - or even to think about these things.

Stick to using PW selectors (https://processwire.com/docs/selectors/) and all the hard work is done for you. Selectors are one of the really great features of PW!

(Note that if you later decide to use RockFinder for the extra speed, it also understands the PW database structure so that you don't have to.)

  • Like 4

Share this post


Link to post
Share on other sites
On 6/16/2020 at 12:36 PM, michelangelo said:

@teppo thank you so much for the comment 🙏 I actually really enjoyed the module, I spoke too soon because I am a bit levitating between options at the moment. Now with your suggestions in mind, hopefully I can manage it better. I imagine that I can do a query like this, for example?
 


$pages->find('template=song, year>1990, year<=2020, search_index~=my_query_string')

No worries, I get what you're saying 🙂

The way SearchEngine is built makes it (in my opinion) really simple to use when you basically just want a search feature and are happy with some visual modifications to the form and list of results etc. But once your needs get more complex, there's a bit of a learning curve there. I've tried to keep the complexity to bare minimum, but it can still be a bit overwhelming at the beginning, especially if you're not particularly familiar with the way ProcessWire itself works.

You can definitely do the kind of query you mentioned above with SearchEngine. At least for now the index field is just a regular FieldtypeTextarea (or FieldtypeTextareaLanguage for multi-lingual sites), which means that you can perform queries against it just like with your own custom fields.

On 6/16/2020 at 2:49 PM, BillH said:

There's no need at all to understand the PW architecture or database structure and relations - or even to think about these things.

Stick to using PW selectors (https://processwire.com/docs/selectors/) and all the hard work is done for you. Selectors are one of the really great features of PW!

+1 to this right there. Selectors are one of the biggest arguments for ProcessWire: not only do they make complex queries a breeze, but by sticking with them you can easily avoid some of the gotchas you'd have to worry about working with raw SQL (injection attacks, etc.)

Behind the scenes almost every field you see in the admin is what one might call a "custom field", and each of those has a database table of its own. ProcessWire makes extensive use of JOIN queries, which do indeed make things more complex if you're trying to write raw SQL, but again: this is something you should almost never do anyway 🙂

  • Like 1

Share this post


Link to post
Share on other sites
On 6/15/2020 at 3:10 PM, michelangelo said:

Update on my process, @elabx I am trying to use RockFinder3 but I can't make it work. It seems like it's great for visualising results, but what I need is just an array or PageArray of my results which I can visualise. Also the SQL queries didn't work the way I intended... It's just too unclear to me how to use the module.

Could you please elaborate what you mean by "can't make it work"?

Also the statement "seems like it's great for visualising results" shows that you may have misunderstood what it is or does, because it is really not for visualising data but for FINDING data 😉 It does exactly what you state you need: Finding data and returning a plain array in a performant way.

Visualization can then be done however you want. RockTabulator is a little cumbersome, but tabulator.info is great and you can easily build a custom integration that renders thousands of rows. You don't even have to use RockFinder as data source, you can also use a custom ajax endpoint to get junks of data: http://tabulator.info/docs/4.6/data#ajax-progressive

Many options, but I'd really be interested in the part "can't make RockFinder3 work" 😉 

  • Like 2

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By snobjorn
      I have a website with multiple content types that I want to be accessible through search. I really like the live search on processwire.com, that sorts content types while typing. I tried to find the code to recreate this, with no luck. Does anyone know if this is jquery, specific jquery plugins, json/xml cached files, and what kind of PHP code is used? Any tip that point me in the right direction would be much apperciated.
      The search result listing seems fairly easy to create with sorting through parameters.
    • By Anders
      I want to allow full text search on my site. There is a very nice solution that comes right out of the box:
      $selector = "title|body~=$q, limit=50"; This works, but to make it even better I would want to give higher weight to pages where the search term occurs in the title, than if it just occurs in the body. After all, a page with the title "Wine from France" is probably the best match for the search "france wine". How do I accomplish this in ProcessWire?
      I can see three possible paths, but I am not very fond of any of them:
      Do a direct SQL query, circumventing the API, along these lines. But I would prefer to abstract away the database layout if at all possible. Use something like ElasticSearch, but to be honest that would be to complicated to set up and maintain in the long run. Make multiple lookups, first for matches in the title, then for matches in the body, and merge and sort in PHP. My suspicion is that this would get complicated quite quickly. For instance, how do you deal with a page that has two of the three search terms in the title and the third in the body? Is there a magic option four I should look into? Or are any of the above options better than the others? Any input is welcome!
    • By jds43
      Hello,
      I have a search page loosely based on Skyscrapers where I'm parsing a selector with options 'beds', 'bathrooms', 'size' fields. It is working well until I select 'Any' after I've run a search. This is where no results are returned (/?beds=&bathrooms=&size=&submit=). I want it to reset and show all results.
      I hope this isn't too vague.
       
    • By sww
      Hey there,
      i am trying to add a kind of "intelligent" search.
      Is there any way to ignore punctuation and extra chars.
      e.g.
      When I type "somebody elses" that I still get the result "SOMEBODY ELSE´S CAR, 2005"
      The problem is the extra ´ … if I don't type it I don't get the result.
      So far I am using %=query (which isn't enough, I know)
      So the idea would be something like that:
      $selector = "sanitize(title|text)%=$q, limit=50";
      I know, it's not gonna work like that … just to illustrate.
      Best,
      Stefan
    • By dandeckr
      Hello!
      I'm making my way through my first encounter with ProcessWire, and I'm very near the end of my tasks! I've searched Google, these forums, and for add-ons, but I haven't found any documentation or reference work for implementing boolean searching in the native CMS? Am I missing a thread, add-on, or docs that can point the way? I'm sure I'm not the first to have a need like this? Thanks in advance! 
×
×
  • Create New...