Jump to content

CSV page import performance (35k pages, testcase included)


Beluga
 Share

Recommended Posts

I have a problem with the performance of importing CSV data with ~35k rows and 17 columns as pages. This is all on my local dev environment, PW 3.0.42, Ubuntu 16.10 virtual machine with SSD, 16GB memory and 4 CPU cores allocated (6th gen i7). PHP is 7.1 and MariaDB is 10.1.21. So the setup if beefier compared to the server environment this project would be living in, at least in the CPU department.

So far I have tried with Ryan's CSV import module. I started the import at 14:30, left it running as I went to sleep. When I woke up and checked it, Apache server had somehow borked. 14k pages existed. I tried to restart apache2, but it coredumped. After a restart of the VM, Apache worked again.

If we assume the import ran for 10 hours (might have run longer), that's about 23 pages per minute. Seems it could go faster.

Next I will test with @adrian's Batch Child Editor.

If someone wants to conduct their own testing, attached is a zip file with

- real world CSV data, from a proverb research database, Tab separated and enclosed with ^ characters (because Ryan's module doesn't work with pure Tab separated). I have added dummy "title" field data, so it can be used with any of the importers without hacks.

- the needed fields exported from PW

- my php.ini

- my my.cnf

csv-testcase.zip

Edited by cstevensjr
Moved from Modules/Plugins Section
Link to comment
Share on other sites

Fantastic work, guys. Zeka's result proves that my local setup is somehow silly. Maybe it is too primitive: I don't even use php-fpm.

However, you can imagine my surprise, when I returned back home just now after leaving the BCE import running: it had imported all the 35k pages successfully!

I should set my local environment up like I have on the server, with php-fpm and Caddy.

Thank you for restoring my sanity and trust in PW.

Link to comment
Share on other sites

Glad BCE worked for you :)

I am curious though what happened for you with Ryan's module. My local dev setup is definitely nothing special. A 2011 MacBook Pro with 8GB RAM and PHP 7.1 all setup via HomeBrew. I feel like something else must be amiss. I have never used MariaDB - I know it is supposed to be faster if anything and completely compatible with MySQL, but who knows?

  • Like 1
Link to comment
Share on other sites

Ok, I set up a Docker dev environment with Caddy, php-fpm and MariaDB and now I could import the whole 35k rows with Ryan's module in about 2 hours. Still, that is much worse than Zeka's result. Maybe I have to tweak some php-fpm and MariaDB settings.. I just don't know how exactly.

Link to comment
Share on other sites

I realized tuning php-fpm is useless as it only helps when dealing with multiple users. I tuned MariaDB instead. I had created the PW install with InnoDB.

Sadly, it took an equivalent amount of time to import with Ryan's module. Even emptying the trash with 35k pages took 30 minutes :( and I always get "Connection to server lost" while doing either import or trashing. It then does not allow me in the admin until it has finished the operation.

Next I will try with BCE.

Here is my override for my.cnf:

Spoiler

[mysqld]
innodb_buffer_pool_size = 16G # (adjust value here, 50%-70% of total RAM)
innodb_additional_mem_pool_size = 16M
innodb_log_file_size = 256M
innodb_flush_log_at_trx_commit = 1 # may change to 2 or 0
innodb_flush_method = O_DIRECT
innodb_file_per_table=ON
innodb_stats_on_metadata = OFF
innodb_buffer_pool_instances = 8 # (or 1 if innodb_buffer_pool_size < 1GB)
query_cache_type = 0
query_cache_size = 0
innodb_autoinc_lock_mode = 2
innodb_io_capacity_max=6000
innodb_io_capacity=3000
key_buffer_size = 1024M
max_allowed_packet = 1024M
table_open_cache = 2000

 

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...