Guy Verville

Over 50000 products to deal with

Hi,

We have a large project: a national website that will present over 50,000 products. We are studying the possibility of using PW, following the success we had with a much smaller project. While we know that PW has great search capabilities, we are also looking at Elastic Search, since there will be many angles to search from and we want the most natural, swift rendering.

The products will be distributed over six main categories, and certainly three subcategories for each. Those categories will have their own templates, and the calls for the associated SKUs (50,000 products) will be made the ProcessWire way. The website will be bilingual and, in a far-future phase 2, there might be some e-commerce features (not for all products).

I would like to know whether any of you have experience dealing with that amount of information on a PW website, and whether you could share some tips or describe some pitfalls to avoid...

 


Exciting times! Congrats on the success of your previous project and on starting this new one. I don't know if I can point to any pitfalls that are specific to ProcessWire; mine relate more to site structure, as ProcessWire is extremely forgiving when it comes to adapting to various uses. I suppose that I would mainly focus on your layout as you would normally but take note on how you want to relate all of your items to one another and how you are going to present these items to your editors in a way that they can keep the site up to date reliably. In dealing with a product site you will want to be mindful of the items that you have on your site making sure that your structure doesn't create any bad habits in the people that update the site, like allowing editors to create duplicate items in different parts of your site and such.

I have to say, though, that I thought much as you do when I started a smaller site: that I might need something like Elastic Search to search it. I quickly abandoned that idea after I found the cache field. With this field, you add all the fields that you want to be searchable in a general search; if you then want a more granular product search, you can use the bread and butter of ProcessWire, searching specific fields and sorting the items as needed. I power my whole site off just the cache field and text searches. You can build a couple of these fields to handle different item types and then search a single field directly without having to join all of the fields, which can really speed up your searches.
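The idea behind a cache field can be sketched outside of ProcessWire. The following is a minimal Python illustration, not ProcessWire code: the field names, the `save` helper and the `search_cache` key are all hypothetical. It only shows the principle of concatenating several fields into one text blob at save time so that a general search touches a single field.

```python
# Hypothetical field names; in practice these would be the fields
# you want covered by the general site search.
SEARCH_FIELDS = ("title", "summary", "brand")


def build_search_cache(product):
    """Join the chosen field values into one lowercase text blob."""
    return " ".join(str(product.get(name, "")) for name in SEARCH_FIELDS).lower()


def save(product):
    """Refresh the combined blob whenever a product is saved."""
    product["search_cache"] = build_search_cache(product)
    return product


def search(products, term):
    """General search: one substring test against one field per product."""
    needle = term.lower()
    return [p["sku"] for p in products if needle in p["search_cache"]]


products = [
    save({"sku": "A-100", "title": "Cordless Drill", "summary": "18V compact", "brand": "Acme"}),
    save({"sku": "B-200", "title": "Hammer", "summary": "Steel claw hammer", "brand": "Bolt"}),
]

print(search(products, "acme"))
print(search(products, "steel"))
```

The trade-off is the one raised elsewhere in this thread: the combined blob duplicates content that is already stored in the individual fields.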

Interested to see how this progresses and hope you keep us updated on your progress :) 


I can second what @MuchDev said, but want to add that, for me, the cache field is/was yesterday. Today I only use Textareas and Repeater Matrix from ProFields.

Quote

ProFields are a group of ProcessWire modules that help you to manage more data with fewer fields,

I try to keep the number of fields as low as possible. This results in quick full-text search without the need for a cache field, which would double the content size in the DB. Really worth using. It saves time too.


Searching larger page sets can be critical in PW. In my experience with 100,000+ articles, searching 3-5 text fields suddenly takes 5-8 seconds (and this is on a very fast server). Reducing the search to one field only would take 0.1 s. But what do you do if you want to search multiple attributes? For a product site, I only have experience with 2,000-3,000 articles, and that already takes 2-3 seconds to search; an index isn't going to help here. So PW reaches its limit very quickly when searching large sets across multiple fields: it creates lots of joins, which can hurt your search.

Importing and handling such large sets can also be very time-consuming. I have a site where 45,000 members need to be synced every day. Doing that with the API takes 45+ minutes and quickly uses a lot of memory, because it has to load, compare, and create objects, which may produce a lot of work for the DB. By comparison, importing a CSV directly into a table would take a couple of seconds.
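To illustrate the "CSV straight into a table" approach mentioned above, here is a minimal Python sketch using the standard `sqlite3` and `csv` modules. It is an assumption-laden stand-in, not the poster's setup: the `members` table, its columns and the sample rows are hypothetical, and a real site would target its own database rather than an in-memory SQLite one. The point is only that all rows go in as one batch inside one transaction, with no per-record objects being loaded and compared.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for the daily export file.
CSV_DATA = """id,name,email
1,Alice,alice@example.com
2,Bob,bob@example.com
3,Carol,carol@example.com
"""


def bulk_import(conn, csv_text):
    """Insert or overwrite every CSV row in a single transaction."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(int(r["id"]), r["name"], r["email"]) for r in reader]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO members (id, name, email) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

imported = bulk_import(conn, CSV_DATA)
total = conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]
print(imported, total)
```

Because the statement is `INSERT OR REPLACE` keyed on `id`, running the same import again updates existing rows instead of duplicating them, which is the behaviour a daily sync would typically want.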


Thank you all for your input. We will certainly continue our work with Elastic Search, which is quite impressive (and RESTful). My programmer colleague has already found that querying PW has its limits. I have asked him to look at your comments, and maybe he will share his assessment of the topic.

9 hours ago, MuchDev said:

 I suppose that I would mainly focus on your layout as you would normally but take note on how you want to relate all of your items to one another and how you are going to present these items to your editors in a way that they can keep the site up to date reliably. In dealing with a product site you will want to be mindful of the items that you have on your site making sure that your structure doesn't create any bad habits in the people that update the site, like allowing editors to create duplicate items in different parts of your site and such.

I completely agree with you and that's why we are choosing PW for its UI. Structuring the data is also crucial as we will have to deal with a synchronization of data coming from another external DB.


(sorry for my English)

I believe that PW can handle a product project with a very large number of items. As @horst said, I think the most important part is to try to reduce the number of product fields (fewer tables to deal with).

In case it helps, I read about and ran a few tests with Elastic Search and the PW module elastic-search. This is what I learned:

  1. Elastic Search should only be used as a search engine solution and not as a project database.
  2. Do not use Elastic Search if the product information changes regularly; this avoids the headache of managing data across PW, Elastic Search and some third-party system.
  3. Elastic Search provides a lot of tools to improve the relevance of search results. I never had the chance to try the FieldtypeCache module, but I think it is not that simple to control which items should be found in a search.
  4. A queue system and the elastic-search module hooks (create, update and delete) can help a lot with synchronizing data coming from an external DB.
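The queue-plus-hooks idea from point 4 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the elastic-search module's API: the class, its method names and the in-memory `index` dictionary are hypothetical stand-ins. Save and delete hooks only record what changed, and a separate step later drains the queue and applies the changes to the search index.

```python
from collections import deque


class SearchIndexSync:
    """Hypothetical helper: hooks enqueue changes, flush() applies them."""

    def __init__(self):
        self.pending = deque()  # queued (action, sku, document) events
        self.index = {}         # in-memory stand-in for the search engine

    def on_save(self, sku, document):
        """Called from a create/update hook; cheap, does no indexing itself."""
        self.pending.append(("upsert", sku, document))

    def on_delete(self, sku):
        """Called from a delete hook."""
        self.pending.append(("delete", sku, None))

    def flush(self):
        """Apply queued events in order; returns how many were processed."""
        processed = 0
        while self.pending:
            action, sku, document = self.pending.popleft()
            if action == "upsert":
                self.index[sku] = document
            else:
                self.index.pop(sku, None)
            processed += 1
        return processed


sync = SearchIndexSync()
sync.on_save("A-100", {"title": "Cordless Drill"})
sync.on_save("B-200", {"title": "Hammer"})
sync.on_delete("A-100")

processed = sync.flush()
print(processed, sync.index)
```

Decoupling the hooks from the indexing step means a large import from the external DB does not have to wait on the search engine for every record; the queue can be drained in the background or in batches.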

 

This is a really interesting topic. Thanks for sharing your experiences.

