How to handle large volumes?


eincande

Hello everyone,

I have a few questions about the best way to handle a large amount of data in PW.

I have to store about 50,000 rows and build a search engine on top of them (something pretty similar to your skyscrapers demo).

I am considering using the PW metamodel to store my data structure and manage it within the admin section, to stick to the PW philosophy, rather than keeping it in a separate schema and accessing it directly in PHP.

- Do you have any ideas or feedback on how the metamodel underlying PW would perform with a large amount of data and concurrent searches?

- Is it possible to create an admin section that looks like the default user management, with a tabular data display and filter options? If so, can you point me in the right direction?

- How can I keep these numerous "pages" from appearing in the pages admin section? That would be redundant with a dedicated admin module, IMO.

- I think I read in another thread that when the PW API is used to access or search a table, it loads the whole table into memory. Did I get that wrong?

Thanks in advance for any help :)

Emmanuel

PW Newbie

Link to comment
Share on other sites

Hi Emmanuel, it's very easy to convert data to PW if you start thinking in terms of a tree instead of a table structure. Think JSON/Mongo vs. MySQL/Excel. One important thing to keep in mind is that, besides the parent/child structure, you can also connect the branches of the tree with page fields, which makes it very flexible. I think the best way to structure data is to convert 1 to 1 relations into parent/child connections, and 1 to many relations into page field connections.

When you query the database with PW, the "pages" are loaded into a PageArray object. Obviously, if you have lots of data, this will exhaust your memory very quickly. That's why you should always limit your queries ("limit=50") and paginate the results when needed.
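
The limit-and-paginate advice above can be sketched roughly as follows. This is only an illustration under stated assumptions: paginatedSelector() is a hypothetical helper, not part of ProcessWire, and the "skyscraper" template and "height" field are made-up names. It relies on the "start" and "limit" keywords that ProcessWire selectors accept.

```php
<?php
// Hypothetical helper (not part of ProcessWire): builds a selector string
// that requests a single page of results instead of the whole result set.
function paginatedSelector(string $base, int $pageNum, int $limit = 50): string
{
    $pageNum = max(1, $pageNum);         // page numbers are 1-based here
    $limit   = max(1, $limit);           // never ask for fewer than one result
    $start   = ($pageNum - 1) * $limit;  // zero-based offset of the first result
    return "$base, start=$start, limit=$limit";
}

// Assumed usage inside a ProcessWire template file, where $pages is provided:
// $results = $pages->find(paginatedSelector("template=skyscraper, height>100", 3));
// foreach ($results as $item) { echo "<li>{$item->title}</li>"; }

echo paginatedSelector("template=skyscraper", 3), "\n";
// prints: template=skyscraper, start=100, limit=50
```

With a selector like this, only one page of results (50 pages here) is loaded into the PageArray per request, however many rows match in total.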


Thank you for that quick answer :)

Maybe I didn't express myself clearly: I have no problem translating my data to the PW data model.

My concern is the performance of it all on a potentially high-traffic website.
Generally, metamodels are great in design but lack performance.


 

diogo wrote:
When you query the database with PW, the "pages" are loaded into a PageArray object. Obviously, if you have lots of data, this will exhaust your memory very quickly. That's why you should always limit your queries ("limit=50") and paginate the results when needed.

You are right! :) But I read somewhere (I can't find it anymore) that some API requests load the whole dataset instead of limiting it. I must have misunderstood. ;)


Ok, maybe I misread your question.

There is one thing that I didn't answer. You can hide these pages from editors by putting them as children of a page under the "admin" page. They will also be hidden from searches, but you can still retrieve them if you query them through their parent, like this:

-admin
--datapages (id:123)
----somedata
----moredata

$pages->get(123)->find("selector");

diogo wrote:
Hi Emmanuel, it's very easy to convert data to PW if you start thinking in terms of a tree instead of a table structure. Think JSON/Mongo vs. MySQL/Excel. One important thing to keep in mind is that, besides the parent/child structure, you can also connect the branches of the tree with page fields, which makes it very flexible. I think the best way to structure data is to convert 1 to 1 relations into parent/child connections, and 1 to many relations into page field connections.

Hey diogo!

Great way to explain the ProcessWire data structure by making analogies to universal data-relations concepts!

Thanks,

Matthew


Emmanuel wrote:
My concern is the performance of it all on a potentially high-traffic website. Generally, metamodels are great in design but lack performance.

It just depends on what your expectations are. I think most agree PW provides quite good performance in this area, and perhaps better than the other CMS platforms out there. But if performance is the primary factor, you can do no better than to work directly with the database and well defined indexes. Abstracting the database is always a compromise for performance, but definitely a worthwhile one here, in my opinion. So long as you use scalability best practices (especially use of pagination) I think you'll find PW to provide very good performance regardless of scale. 

Emmanuel wrote:
You are right! :) But I read somewhere (I can't find it anymore) that some API requests load the whole dataset instead of limiting it. I must have misunderstood.

No, when it comes to pages or any data connected with pages, PW does not load the whole dataset unless you tell it to (by omitting a limit=n in your selector). PW is designed to be scalable regardless of page quantity. Though for most installations, this is generally under 100k pages. However, I do know of at least one person using PW with more than 100k pages, and all is good. We're now talking about using PW for millions of pages, but I don't think that's been tested to date. However, I fully expect PW can support that scale. 
