double (posted November 26, 2020, edited):
Can I import 8 million posts? Is it possible? What are the hardware requirements? I have 300 XML files (30k posts per file). Each post has these fields: SEO title, meta description, H1 title, category, and post content. What is the fastest way to import them?
elabx (posted November 26, 2020):
You can do this through a CLI script bootstrapping ProcessWire. It should be possible: I've imported millions of rows from a CSV using very little memory, although it took quite a bit of time. I'd also recommend using database transactions when saving batches of new pages.
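The pattern elabx describes (stream the input, save in batches, one database transaction per batch) can be sketched in a language-neutral way. The example below is only an illustration of that pattern in Python, with SQLite standing in for the database; it is not the ProcessWire API, and a real import script would be PHP that bootstraps ProcessWire. The `<post>`, `<title>`, and `<content>` element names, the `posts` table, and the batch size are all hypothetical.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET

BATCH_SIZE = 1000  # hypothetical value; tune it by measuring


def iter_posts(source):
    """Yield (title, content) one <post> at a time instead of loading the whole file."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "post":  # hypothetical element name
            yield (
                elem.findtext("title", default=""),
                elem.findtext("content", default=""),
            )
            elem.clear()  # drop the element's children and text once it is consumed


def flush(conn, batch):
    """Write one batch inside a single transaction and return its size."""
    with conn:  # commits on success, rolls back if the insert raises
        conn.executemany(
            "INSERT INTO posts (title, content) VALUES (?, ?)", batch
        )
    return len(batch)


def import_posts(conn, source, batch_size=BATCH_SIZE):
    """Import all posts from source in batches; return the number of rows inserted."""
    conn.execute("CREATE TABLE IF NOT EXISTS posts (title TEXT, content TEXT)")
    total = 0
    batch = []
    for row in iter_posts(source):
        batch.append(row)
        if len(batch) >= batch_size:
            total += flush(conn, batch)
            batch = []
    if batch:  # final partial batch
        total += flush(conn, batch)
    return total
```

Only one batch is held in memory at a time, and a failure rolls back at most one batch rather than the whole import. With 300 input files, the same function would simply be called once per file.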
double (posted November 26, 2020, edited):
3 minutes ago, elabx said: "You can do this through a CLI script bootstrapping ProcessWire. It should be possible: I've imported millions of rows from a CSV using very little memory, although it took quite a bit of time. I'd also recommend using database transactions when saving batches of new pages."
How fast will the site be after importing 8 million posts? I tried WordPress, but it needs serious hardware to handle 8 million posts. I have a VDS with 4 cores, 4 GB of memory, and NVMe storage.
netcarver (posted November 26, 2020):
1 hour ago, double said: "How fast will be site after importing 8 million posts?"
That's basically impossible for anyone to answer, as there are so many variables involved beyond the row count and your machine specs. It will also depend on how much of that data needs to be loaded per page view; how many requests per second you expect to handle; whether you'll be using caching; whether background updates are happening; whether the tables are correctly indexed and use the most suitable storage engine; how many sessions will be active at peak; whether page views trigger external API calls; whether all assets are optimised for loading; how often you'll need to update rows in the DB; whether the pages render anything with JS on the frontend; and so on.
I think you'd be better off setting a target for acceptable page loading times and then asking, "What do I need to do to get 80% of my page loads to this time or better?" You also need to consider whether PW's API is a good fit for your programming needs, and whether the Admin interface is suitable for you and any users who may need access to it.
I'd suggest setting your speed goals, then importing a subset of your data, seeing how your resource needs and page speeds scale going from, say, 100 thousand to 200 thousand rows, and extrapolating from that. If you do try out PW, please keep us updated with your results.
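netcarver's two suggestions (check what share of page loads meets a target, and extrapolate from two measured data sizes) come down to simple arithmetic. The Python sketch below assumes response time grows linearly with row count, which is only an assumption: real scaling may be better or worse depending on indexes and caching. The function names are hypothetical, and any numbers fed in must come from your own measurements.

```python
def share_within_target(load_times, target_seconds):
    """Return the fraction of measured page loads at or under the target time."""
    if not load_times:
        raise ValueError("need at least one measurement")
    hits = sum(1 for t in load_times if t <= target_seconds)
    return hits / len(load_times)


def extrapolate_linear(rows_a, time_a, rows_b, time_b, target_rows):
    """Project response time at target_rows from two measurements.

    Assumes a straight line through (rows_a, time_a) and (rows_b, time_b);
    treat the result as a rough first guess, not a prediction.
    """
    if rows_a == rows_b:
        raise ValueError("the two measurements need different row counts")
    slope = (time_b - time_a) / (rows_b - rows_a)
    return time_b + slope * (target_rows - rows_b)
```

For example, after timing the same page at 100 thousand and 200 thousand rows, `extrapolate_linear(100_000, t_100k, 200_000, t_200k, 8_000_000)` gives a first estimate for the full data set, and `share_within_target(times, goal) >= 0.8` checks the 80% goal. A third measurement at a larger size is a cheap way to test whether the linear assumption holds.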
elabx (posted November 26, 2020):
1 hour ago, double said: "How fast will be site after importing 8 million posts?"
@netcarver's answer pretty much covers this. My advice would be to watch out for anything that involves counting, such as pagination, and for complex selectors using the Selector API.
BillH (posted November 26, 2020):
One other suggestion: if any PW selectors turn out to be slow across so many records, try using the RockFinder3 module (https://github.com/baumrock/rockfinder3).
netcarver (posted November 26, 2020):
Ryan's ProCache module is the other obvious candidate to mention here.
bernhard (posted November 27, 2020):
This thread/post might also be interesting for you:
flydev (posted November 27, 2020):
On 11/26/2020 at 4:12 PM, double said: "Can I import 8 million posts? Is it possible? Hardware requirements?"
Yes, yes, and 🤷‍♂️. This should be a good candidate task for the following: give the DataSet and Tasker modules developed by @mtwebit a try. Configure them and let the import run overnight or during a good nap.
mtwebit (posted November 29, 2020):
I haven't used the DataSet XML import for a while. Let me know if it needs some polishing.
double (posted January 17, 2021):
On 11/26/2020 at 7:25 PM, netcarver said: "If you do try out PW, please keep us updated with your results."
I've added 1.5 million pages. It works faster than WordPress, but server response time is still high: 0.8 to 2 s. I have 4 cores, 4 GB of RAM, and an SSD, with MySQL 8 and PHP 7. I'll keep you updated.
elabx (posted January 18, 2021):
On 1/17/2021 at 6:42 AM, double said: "It works faster than Wordpress, but server response time is still high - 0,8-2 s."
Are you calling $pages->find() or get() on the pages that show this performance?