NorbertH Posted September 6, 2018

I had imported about 10,000 pages (really simple ones, only two text fields) into PW using Batch Child Editor, and it took about an hour with multiple script timeouts. I moved them to the trash, but emptying the trash ran into a timeout again after about 10 minutes. So I started a PHP script from bash ($pages->emptyTrash()), but after half an hour it is still running. The page tree shows only about 4,000 removed from the trash right now.

ProcessWire 3.0.98, running on InnoDB (MariaDB 10). Any ideas?
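For reference, a minimal sketch of such a bootstrap script (assuming it sits in the site root next to index.php, and that emptyTrash() returns a count, as it does in recent PW 3.x versions):

<?php namespace ProcessWire;

include "index.php"; // bootstrap ProcessWire

set_time_limit(0); // a CLI run shouldn't time out, but just in case

// emptyTrash() lives on the Pages API object, not on an individual Page
$count = wire('pages')->emptyTrash();
echo "Removed $count pages from the trash\n";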
bernhard Posted September 6, 2018

Tracy Debugger can help a lot with monitoring such things. You have the log function, where you can dump something to files, and can then follow in real time what's happening. E.g. l("removing page $page->id from trash"); and then monitor the logfile, e.g. tail -f site/assets/logs/tracy/info.log (not sure if that path is correct). You might also flush the pages cache: https://processwire.com/api/ref/pages/uncache-all/ or something like https://processwire.com/talk/topic/7572-processwire-recipes/?do=findComment&comment=88505
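Roughly, those two ideas combined might look like this in a long-running delete loop (a sketch only; the 500-page batch size and the trash selector are assumptions for illustration):

<?php namespace ProcessWire;

include "index.php"; // bootstrap ProcessWire

$pages = wire('pages');
$config = wire('config');
$i = 0;

// Delete trashed pages one by one, logging progress and clearing PW's
// in-memory page cache every 500 pages so memory doesn't grow unbounded.
foreach ($pages->find("parent=$config->trashPageID, include=all") as $p) {
    // l("removing page $p->id from trash"); // the Tracy log call from above
    echo "removing page $p->id from trash\n";
    $pages->delete($p, true); // true = also delete children recursively
    if (++$i % 500 === 0) $pages->uncacheAll();
}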
flydev Posted September 7, 2018

Not helping here, but just saying: I recently imported more than 800k pages, and it took about 16 hours ✌️ I can't remember how many pages I deleted from the trash, but it took all night.
bernhard Posted September 7, 2018

Wouldn't it have been easier in that case to first get all the IDs of the pages that are in the trash, and then loop over all available tables in the database and remove all rows that have a pages_id of one of the trashed pages? Maybe it's not that easy, though. Not sure... I'd do a backup first.
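A rough, untested sketch of that idea (it assumes a default PW schema, where the trash page has id 7 and each field's data lives in its own field_* table keyed by pages_id; it only covers direct children of the trash, and a backup first is essential):

<?php namespace ProcessWire;

include "index.php"; // bootstrap ProcessWire

$database = wire('database');
$fields = wire('fields');

// Collect the ids of pages sitting directly in the trash (trash page id = 7)
$ids = $database->query("SELECT id FROM pages WHERE parent_id=7")
                ->fetchAll(\PDO::FETCH_COLUMN);

if (count($ids)) {
    $in = implode(',', array_map('intval', $ids));
    // Remove the field data rows for those pages from every field table
    foreach ($fields as $field) {
        $table = $field->getTable(); // e.g. "field_title"
        try {
            $database->exec("DELETE FROM $table WHERE pages_id IN ($in)");
        } catch (\Exception $e) {
            // some fieldtypes (e.g. fieldsets) have no table; skip those
        }
    }
    // Finally remove the pages rows themselves
    $database->exec("DELETE FROM pages WHERE id IN ($in)");
}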
Soma Posted September 7, 2018

I have a ~40k-row CSV import/sync that runs for almost an hour, and it's not even importing, just syncing. If you use the API to load and cycle through lots of pages, it is slow; there's no real way around it except doing it in SQL directly.
netcarver Posted September 7, 2018

@Soma I would expect a 40k row -> 40k page import to take a while, but a simple sync with only a low percentage of rows changed should be pretty fast. Of those 40k rows, approximately how many change between each cycle? I'd be tempted to have a script do a diff between the last version and the current version, and then only process the rows that have some kind of diff. It should be much faster that way if only a small set of rows changes each cycle.
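One way to sketch that (the file names and the id column are made up for illustration): hash every row, compare against the hashes stored on the previous run, and hand only the changed rows to the PW sync.

<?php

// Detect changed CSV rows between runs by comparing per-row hashes.
// "data.csv" and "hashes.json" are hypothetical file names, and the
// first CSV column is assumed to be a stable row id.
$old = is_file('hashes.json')
    ? json_decode(file_get_contents('hashes.json'), true)
    : [];
$new = [];
$changed = [];

if (($fh = fopen('data.csv', 'r')) !== false) {
    while (($row = fgetcsv($fh)) !== false) {
        $id = $row[0];
        $hash = md5(implode('|', $row));
        $new[$id] = $hash;
        if (!isset($old[$id]) || $old[$id] !== $hash) {
            $changed[] = $row; // only these rows need syncing into PW
        }
    }
    fclose($fh);
}

file_put_contents('hashes.json', json_encode($new));
// ...now run the PW sync over $changed only...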
Beluga Posted September 7, 2018

On 9/6/2018 at 7:45 PM, NorbertH said:
"ProcessWire 3.0.98, running on InnoDB (MariaDB 10). Any ideas?"

InnoDB is your (really PW's) problem.
Beluga Posted September 10, 2018

Issue created.
NorbertH Posted September 10, 2018

Made a test installation with MyISAM; it was about 20% faster.
NorbertH Posted September 10, 2018

If I find some time, I'll give importing the pages with transactions a try, maybe in blocks of 1000 entries, along the lines of the sketch below.
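A sketch of what that batching could look like (untested; the point is to commit and reopen the transaction every 1000 saves, so a failure only rolls back the current block):

<?php namespace ProcessWire;

include "index.php";  // Include PW
include "berufe.php"; // the job-names array described in the next post

$batchSize = 1000;
$i = 0;

$database->beginTransaction();
foreach ($berufe as $beruf) {
    $p = new Page();
    $p->template = 'berufsbezeichnung';
    $p->parent = '/einstellungen/berufi/';
    $p->title = $beruf['berufsbezeichnung'];
    $p->schluessel = $beruf['kldb2010'];
    $p->save();
    if (++$i % $batchSize === 0) {
        $database->commit();           // flush this block of 1000
        $database->beginTransaction(); // start the next block
    }
}
$database->commit(); // commit the final, possibly partial, block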
NorbertH Posted September 18, 2018

Hmm, OK, did some testing. The test server is a Synology DiskStation with 8 GB RAM. The import script is pretty simple:

<?php namespace ProcessWire;

include "index.php";  // Include PW
include "berufe.php"; // Simple array with about 25000 job names
/*
$berufe = array(
    array('id_beruf' => '1', 'berufsbezeichnung' => 'Aalbrutzüchter/in', 'kldb2010' => '11412'),
    array('id_beruf' => '2', 'berufsbezeichnung' => 'Aalfischer/in', 'kldb2010' => '11422'),
    array('id_beruf' => '3', 'berufsbezeichnung' => 'Aalräucherer/-räucherin', 'kldb2010' => '29242'),
    ...
*/

$i = 0;
foreach ($berufe as $beruf) {
    echo $i++ . " " . $beruf['kldb2010'] . " " . $beruf['berufsbezeichnung'] . "\n";
    $p = new Page();
    $p->template = 'berufsbezeichnung';
    $p->parent = '/einstellungen/berufi/';
    $p->title = $beruf['berufsbezeichnung'];
    $p->schluessel = $beruf['kldb2010'];
    $p->save();
}

It took about 60 minutes to import on a PW installation with InnoDB, and about 40 minutes on an installation with MyISAM. Then I changed the script to use transactions:

<?php namespace ProcessWire;

include "index.php";  // Include PW
include "berufe.php"; // Simple array with about 25000 job names

$i = 0;
try {
    $database->beginTransaction();
    foreach ($berufe as $beruf) {
        echo $i++ . " " . $beruf['kldb2010'] . " " . $beruf['berufsbezeichnung'] . "\n";
        $p = new Page();
        $p->template = 'berufsbezeichnung';
        $p->parent = '/einstellungen/berufi/';
        $p->title = $beruf['berufsbezeichnung'];
        $p->schluessel = $beruf['kldb2010'];
        $p->save();
    }
    $database->commit();
} catch (\Exception $e) {
    $database->rollBack();
}

On InnoDB with transactions, the script finished in about 10 minutes. The actual DB operation after the transaction was created seemed to take only a few seconds, as the script ended almost immediately after the last echo. Transactions may be worth a closer look whenever we do massive data handling.
NorbertH Posted September 18, 2018

Found a nice link about this topic: https://stackoverflow.com/questions/14675147/why-does-transaction-commit-improve-performance-so-much-with-php-mysql-innodb
Beluga Posted September 18, 2018

Yep, been discussing transactions in the issue.
netcarver Posted September 18, 2018

@NorbertH Out of curiosity, how long does your script take if you don't actually save the page?

<?php namespace ProcessWire;

include "index.php";  // Include PW
include "berufe.php"; // Simple array with about 25000 job names
/*
$berufe = array(
    array('id_beruf' => '1', 'berufsbezeichnung' => 'Aalbrutzüchter/in', 'kldb2010' => '11412'),
    array('id_beruf' => '2', 'berufsbezeichnung' => 'Aalfischer/in', 'kldb2010' => '11422'),
    array('id_beruf' => '3', 'berufsbezeichnung' => 'Aalräucherer/-räucherin', 'kldb2010' => '29242'),
    ...
*/

$i = 0;
foreach ($berufe as $beruf) {
    echo $i++ . " " . $beruf['kldb2010'] . " " . $beruf['berufsbezeichnung'] . "\n";
    $p = new Page();
    $p->template = 'berufsbezeichnung';
    $p->parent = '/einstellungen/berufi/';
    $p->title = $beruf['berufsbezeichnung'];
    $p->schluessel = $beruf['kldb2010'];
    //$p->save();
}

...and...

<?php namespace ProcessWire;

include "index.php";  // Include PW
include "berufe.php"; // Simple array with about 25000 job names

$i = 0;
try {
    $database->beginTransaction();
    foreach ($berufe as $beruf) {
        echo $i++ . " " . $beruf['kldb2010'] . " " . $beruf['berufsbezeichnung'] . "\n";
        $p = new Page();
        $p->template = 'berufsbezeichnung';
        $p->parent = '/einstellungen/berufi/';
        $p->title = $beruf['berufsbezeichnung'];
        $p->schluessel = $beruf['kldb2010'];
        //$p->save();
    }
    $database->commit();
} catch (\Exception $e) {
    $database->rollBack();
}

...? That should give us a handle on the overhead of creating 25000 pages and setting the data on your test system.
NorbertH Posted September 20, 2018

The variant without transactions (and without save()) took about 45 seconds to run. Using transactions (not really using them, as save() is commented out) also took about 45 seconds.

Both throw an error:

PHP Warning: touch(): Utime failed: Permission denied in /volume1/web/personalverwaltung/wire/core/FileCompiler.php on line 1063

Edit: The error message has nothing to do with the scripts; it's just an issue with the test server.
NorbertH Posted September 20, 2018

Finally deleted all those pages using transactions:

<?php namespace ProcessWire;

include "index.php";

$dePages = wire('pages')->find("parent=/einstellungen/berufi/");
echo $dePages->count() . "\n";

$i = 0;
try {
    $database->beginTransaction();
    foreach ($dePages as $dePage) {
        echo $i++ . " " . $dePage['title'] . "\n";
        $dePage->delete();
    }
    $database->commit();
} catch (\Exception $e) {
    $database->rollBack();
}

The script took 7 minutes to run; the actual query execution took about 7-10 seconds at the end of the script. I guess transactions are helpful for deleting pages, too. What I did not try was trashing them first.