formulate Posted April 17, 2020

I'm developing a web app in PW and have a lot of PW experience. However, this particular project will be very large scale and I have some questions.

The "app" will create approximately 3000 pages per day at launch and keep growing, with an expected 300,000 pages per day after a few months. As you can see, even after half a year I will be at over 100 million pages. Frankly, this seems ridiculous, but it's the case.

1. Can ProcessWire even handle this?
2. Does it just come down to server capabilities?
3. Should I consider breaking this down into multiple separate databases?
4. Alternatively, instead of pages, should I look at using fields? Maybe storing JSON in a text field? This would reduce the number of pages to fewer than 100 per day, even after half a year. I presume the database itself would still get very large.

Thoughts? Thanks.
Pixrael Posted April 17, 2020

For storing data that no longer changes after it is saved (I mean data that doesn't need to be edited and saved constantly), I use the Fieldtype YAML module. It's very easy to read the data and to save it too. You can save simple objects or an entire structure in the field, however you like, using only an associative array. For the saving part, check this post:
elabx Posted April 17, 2020

I think it depends a bit on how you display the data. I'm working on a site with 2+ million pages and I'm struggling a lot with queries. Displaying paginated data is not that bad, though. Searching on one field, such as a title field, is somewhat fast (5-10 seconds). Trying to do something like the following can be really slow:

    $pages->find("template=blog_post, page_reference_field1.title|title|page_reference_field2.title%='something I want to match'");

I'm using a Lightsail server with 2 vCPUs and 4 GB of RAM. I don't really know if using a dedicated DB server would help.
formulate Posted April 17, 2020

Thanks to both of you for the feedback. Maybe I'm approaching this wrong. All I'm storing is time stamps. There's a hierarchical organization of pages and, within these, a need to store time stamps. I considered JSON at a lower sub-level of pages, but even then the JSON would get too large for the MySQL character limit of the text field. The JSON would also become time-consuming to process and work with. Is there a better way of storing millions of time stamps that I'm not thinking of?
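To put a rough number on the JSON size concern above, here is a small Python sketch. It assumes the field maps to a plain MySQL TEXT column (65,535 bytes) and that each time stamp is a 10-digit Unix timestamp stored in a compact JSON array. Both are assumptions, not details from the thread, and a MEDIUMTEXT or LONGTEXT column would raise the ceiling considerably. The helper names are made up for illustration.

```python
import json

TEXT_LIMIT_BYTES = 65_535  # assumed: maximum size of a MySQL TEXT column, in bytes


def json_size(count: int, start: int = 1_587_081_600) -> int:
    """Bytes needed to store `count` 10-digit Unix timestamps as a compact JSON array."""
    stamps = [start + i for i in range(count)]
    return len(json.dumps(stamps, separators=(",", ":")).encode("utf-8"))


def max_timestamps(limit: int = TEXT_LIMIT_BYTES) -> int:
    """Largest number of 10-digit timestamps whose JSON array fits in `limit` bytes."""
    # Each entry costs 10 digits; n entries need n - 1 commas and 2 brackets,
    # so the whole array is 11 * n + 1 bytes long.
    return (limit - 1) // 11


n = max_timestamps()
print(n, json_size(n), json_size(n + 1))  # prints: 5957 65528 65539
```

Under these assumptions a single TEXT field holds a little under 6,000 time stamps, which matches the worry that per-page JSON runs out of room quickly at this volume.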
Pixrael Posted April 17, 2020

Uf! I don't know, but you could implement a module for this: https://cloud.google.com/bigquery and then share it here!
Robin S Posted April 17, 2020

4 hours ago, formulate said: "All I'm storing is time stamps. There's a hierarchical organization of pages and within these, a need to store time stamps."

I think custom database tables and SQL queries would be the way to go. This is a good read on working with hierarchies in MySQL: http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/
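As a rough illustration of the custom-table idea, here is a minimal Python sketch that uses an in-memory SQLite database as a stand-in for MySQL. The table names (`node`, `stamp`), the columns, and the `stamps_under` helper are all hypothetical. The hierarchy is stored as a simple adjacency list (a `parent_id` column), and a recursive common table expression collects a subtree; MySQL supports this syntax from version 8.0. This is one possible layout under those assumptions, not the schema anyone in the thread actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL in this sketch
conn.executescript("""
    CREATE TABLE node (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES node(id),  -- NULL for the root
        name      TEXT NOT NULL
    );
    CREATE TABLE stamp (
        node_id    INTEGER NOT NULL REFERENCES node(id),
        created_at INTEGER NOT NULL             -- Unix timestamp
    );
    -- Composite index so per-node, time-ordered lookups avoid full scans.
    CREATE INDEX stamp_node_time ON stamp (node_id, created_at);
""")

# Hypothetical sample data: a small hierarchy with a few time stamps.
conn.executemany("INSERT INTO node (id, parent_id, name) VALUES (?, ?, ?)", [
    (1, None, "root"), (2, 1, "group-a"), (3, 2, "item-a1"), (4, 1, "group-b"),
])
conn.executemany("INSERT INTO stamp (node_id, created_at) VALUES (?, ?)", [
    (3, 1587081600), (3, 1587085200), (2, 1587088800), (4, 1587092400),
])


def stamps_under(conn, node_id, since=0):
    """Return the time stamps stored on `node_id` or any of its descendants."""
    rows = conn.execute("""
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM node WHERE id = ?
            UNION ALL
            SELECT node.id FROM node JOIN subtree ON node.parent_id = subtree.id
        )
        SELECT created_at FROM stamp
        WHERE node_id IN (SELECT id FROM subtree) AND created_at >= ?
        ORDER BY created_at
    """, (node_id, since)).fetchall()
    return [row[0] for row in rows]


print(stamps_under(conn, 2))                    # [1587081600, 1587085200, 1587088800]
print(stamps_under(conn, 1, since=1587088800))  # [1587088800, 1587092400]
```

Each time stamp becomes one narrow row instead of one page, so the row count can grow into the millions while the hierarchy table stays small.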
bernhard Posted April 18, 2020

21 hours ago, elabx said: "I'm doing a 2+ million pages and I'm struggling a lot with queries."

Not sure if that would help, but I'm curious: have you ever tried RockFinder2?

+1 for custom DB tables and custom SQL. RockFinder2 makes combining custom SQL with PW magic (like access control, hidden/published pages, pagination, etc.) really easy!
elabx Posted April 18, 2020

1 hour ago, bernhard said: "+1 for custom db tables and custom SQL. RockFinder2 makes combining custom SQL + PW magic (like access control, hidden/published pages, pagination etc) really easy!"

Gonna try this ASAP. Thanks!