mrjasongorman

InnoDB Support in ProcessWire

Recommended Posts

I'm sure there may have been a topic about this previously, but ironically I couldn't find it using the search box in the forum.

Recently I've been using ProcessWire as more of a CMF, really, and I've found it very easy to mould and shape to an application's needs.

The one thing that concerns me, though, is the fact that it only uses the MyISAM DB engine.

These days InnoDB seems to be the standard, and it seems to fit better with the way PW stores things. For example, foreign keys help the DB understand links between data across different tables (PW stores each field in a separate table). PW would also benefit greatly from transactions, making sure that every SQL operation needed to store an item and its field data either all succeeds or doesn't happen at all. That guarantees no data goes missing during a save due to DB issues and crashes.
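To make the transaction point concrete, here is a minimal sketch, assuming PW's $database object forwards the standard PDO transaction methods (beginTransaction/commit/rollBack), with made-up table rows; this only has an effect on a transactional engine like InnoDB:

    $database->beginTransaction();
    try {
        // hypothetical save: the page row plus one of its field tables
        $database->exec("UPDATE pages SET modified = NOW() WHERE id = 1234");
        $database->exec("UPDATE field_body SET data = 'New body copy' WHERE pages_id = 1234");
        $database->commit();   // both writes become visible together
    } catch (\Exception $e) {
        $database->rollBack(); // neither write happens, so nothing is half-saved
    }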

Row locking (InnoDB) rather than table locking (MyISAM) is a huge advantage. Take the situation where I want to save an item that has a common field, like field_body: if I understand it correctly, the whole field_body table is locked for every write, as MyISAM only does table-level locking, so other queries that want to read or write it have to wait until the table lock is released. InnoDB, on the other hand, only locks the rows in question, so other operations can happen on the table at the same time.
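A rough sketch of that difference, with a hypothetical field_body table and two separate database connections ($a and $b, plain PDO here); the engine behaviour is as documented by MySQL:

    // Connection A starts a write on one row:
    $a->beginTransaction();
    $a->exec("UPDATE field_body SET data = 'draft' WHERE pages_id = 1234");

    // Connection B updates a *different* row of the same table:
    $b->exec("UPDATE field_body SET data = 'other' WHERE pages_id = 5678");
    // InnoDB: B proceeds immediately, only row 1234 is locked.
    // MyISAM: A's UPDATE takes an exclusive lock on the whole table for
    // the duration of the statement, so B waits for it to finish.

    $a->commit();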

Another feature that goes along with ACID compliance is the transaction (redo) log: InnoDB keeps a log of transactions, so in the event of a crash it can recover to a consistent state. MyISAM does not, so it can be hard to know what state the data should be in when recovering.

I think I read previously that Ryan chose MyISAM at the time for its full-text search capabilities, which InnoDB only gained in MySQL 5.6. But unless full-text search is a key part of the internal PW system, is it really a necessary trade-off?
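For reference, since MySQL 5.6 an InnoDB table can carry a full-text index too; a sketch with a made-up index name, runnable only on 5.6+:

    // requires MySQL 5.6+; ft_data is a hypothetical index name
    $database->exec("CREATE FULLTEXT INDEX ft_data ON field_body (data)");
    $stmt = $database->query(
        "SELECT pages_id FROM field_body WHERE MATCH(data) AGAINST('innodb')"
    );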

I would rather have the reliability of InnoDB storing data than full-text search; for that kind of functionality I would use a separate system built specifically for it, such as Elasticsearch.

So I was wondering: will InnoDB become the default DB engine for PW?


I'm on mobile and not answering your question directly, but I want to mention that instead of the forum's search box, you should use Google with site:processwire.com/talk as an additional search term.


MySQL 5.6 is still far, far away from broad adoption, especially on shared hosting, and full-text search is essential to ProcessWire's selector engine (e.g. every search related to text in any form). Under these circumstances InnoDB will not become the default anytime soon. On the other hand, InnoDB is already fully supported, just without the core using any of its fancier features. You need to keep in mind that not everybody is using ProcessWire as a CMF (I do as well); to many people it's still a simple CMS like drumlapress.


http://processwire.com/blog/posts/happy-new-year-heres-a-roadmap-for-processwire-in-2016/#what-else-is-coming-for-processwire-3.x-in-2016

More MySQL options. ProcessWire 3.x will provide more direct install-time options for selection of database engine and character set. We will likely be defaulting to use InnoDB (rather than MyISAM) when the conditions and environment support it.
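For anyone wanting to opt in ahead of that default, a sketch of the relevant site/config.php settings, assuming the $config->dbEngine option that PW 3.x introduces for this (verify against your core version):

    // site/config.php (ProcessWire 3.x, assumed option names)
    $config->dbEngine  = 'InnoDB';  // instead of the MyISAM default
    $config->dbCharset = 'utf8';    // character set is selectable too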

Thanks for replying, guys. I thought the reason for holding back might be something to do with the core. Looking forward to seeing the 3.x features!



  • Similar Content

    • By Mobiletrooper
      Hey Ryan, hey friends,
      we, Mobile Trooper, a digital agency based in Germany, use ProcessWire for an enterprise-grade intranet publishing portal that has been under heavy development for over 3 years now. Over the years not only the user base has grown, but also the platform in general. We introduced lots and lots of features thanks to ProcessWire's absurd flexibility. We have come across many CMSs (or CMFs, for that matter) that don't even come close to ProcessWire. The closest were Locomotive (Rails-based) and Pimcore (PHP-based).
      So this is not your typical ProcessWire installation in terms of size.
      Currently we count:
      140 Templates (Some have 1 page, some have >6000 pages)
      313 Fields
      ~ 15k Users (For an intranet portal? That's heavy.)
      ~ 195 431 Pages (At least that's the current AUTOINCREMENT)
       
      I think we have come to a point where ProcessWire isn't as scalable as it used to be for us. Our latest research measured over 20 seconds of load time (the time PHP spent scrambling the HTML together). That's unacceptable, unfortunately. We've implemented common performance strategies, like the following:
      We're running on fat machines (the DB server has 32 GB RAM, the prod web server has 32 GB as well; both are quad-core Xeons hosted on Azure).
      We have load balancing in place, but still, a single server needs up to 20 sec to respond to a single request, averaging around 12 sec.
      In our research we came across pages that issue over 1000 SQL queries with lots of JOINs. This is obviously needed because of PW's architecture (a table per field), but does this slow MySQL down much? For the start page we need to fetch somewhere around 60-80 pages, and each page needs to be queried for ~12 fields to display correctly; is this too much? There are many different fields involved, like multiple Page fields which hold tags, categories etc.
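      One knob that can reduce the per-field query count is marking hot fields as autojoin, so they load with the page row instead of via a separate query each. A sketch, assuming ProcessWire's Field::flagAutojoin and field names made up for illustration:

          // run once, e.g. from a bootstrap script
          foreach(['headline', 'summary'] as $name) {
              $field = $fields->get($name);
              if($field) {
                  $field->flags = $field->flags | Field::flagAutojoin;
                  $field->save(); // now loaded in the same query as the page
              }
          }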
      We installed Profiler Pro but it does not seem to show us the real bottleneck, it just says that everything is kinda slow and sums up to the grand total we mentioned above.
      ProCache does not help us because every user sees something different, so we can only cache some fragments, and those usually measure at around 10 ms; we can't spend time optimising where we can't expect a meaningful benefit. Therefore we opted against ProCache and wrote our own module which generates these cache fragments lazily.
      That speeds up the whole page rendering to ~7 sec, which is acceptable compared to 20 sec but still ridiculously long.
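      For reference, ProcessWire's built-in $cache can express the same lazy-fragment pattern; a sketch with a made-up cache key and selector, assuming WireCache::get() accepts a callback as its third argument:

          // rebuild at most every 300s; the callback runs only on a cache miss
          $html = $cache->get("start-products-{$user->id}", 300, function() use($pages) {
              $out = '';
              foreach($pages->find("template=product, limit=80") as $p) {
                  $out .= "<li>{$p->title}</li>";
              }
              return "<ul>$out</ul>";
          });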
      Our page consists of mainly dynamic parts changing every 2-5 minutes. It's different across multiple users based on their location, language and other preferences.
      We also have about 120 people working in the ProcessWire backend all day, concurrently.
       
      What do you guys think?
      Here are my questions; hopefully we can collect these in a wiki or something, because I'm sure more and more people will hit that wall sooner than they hoped they would:
       
      - Should we opt for optimising the database? >2k queries per request is a lot even for a MySQL server, and the webserver CPU is basically idling during that time.
      - Do you think at this point it makes sense to use ProcessWire as a simple REST API?
      - In your experience, which fieldtypes are expensive? Page? RepeaterMatrix?
      - Ryan, what do you consider the primary bottleneck of ProcessWire?
      - Is the number of fields too high? Would it be better if we tried to reuse fields as much as possible?
      - Is there an option to hook into ProcessWire's SQL builder, so we can write custom SQL for some selectors? (A sketch of one possible entry point follows.)
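      On that last question, a hedged sketch: this assumes PageFinder::getQuery() is hookable in your core version (worth checking against the source) and that it returns a DatabaseQuerySelect object that can be inspected or adjusted before it runs:

          wire()->addHookAfter('PageFinder::getQuery', function($event) {
              $query = $event->return; // DatabaseQuerySelect (assumption)
              // e.g. log the generated SQL to find the expensive selectors,
              // or add a condition; the tweak below is purely hypothetical:
              // $query->where("pages.templates_id != 999");
          });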
       
      Thanks and lots of wishes,
      Pascal from Mobile Trooper
       
       
    • By Sergio
      All of a sudden, with nothing changed on the database or server, a website was getting an error when doing a search:
      Error: Exception: SQLSTATE[HY000]: General error: 23 Out of resources when opening file './your-database-name/pages_parents.MYD' (Errcode: 24 - Too many open files) (in /home/forge/example.com/public/wire/core/PageFinder.php line 413)
      #0 /home/forge/example.com/public/wire/core/Wire.php(386): ProcessWire\PageFinder->___find(Object(ProcessWire\Selectors), Array)
      #1 /home/forge/example.com/public/wire/core/WireHooks.php(723): ProcessWire\Wire->_callMethod('___find', Array)
      #2 /home/forge/example.com/public/wire/core/Wire.php(442): ProcessWire\WireHooks->runHooks(Object(ProcessWire\PageFinder), 'find', Array)
      #3 /home/forge/example.com/public/wire/core/PagesLoader.php(248): ProcessWire\Wire->__call('find', Array)
      #4 /home/forge/example.com/public/wire/core/Pages.php(232): ProcessWire\PagesLoader->find('title~=EAP, lim...', Array)
      #5 /home/forge/example.com/public/wire/core/Wire.php(383): ProcessWire\Pages->___find('title~=EAP, lim...')
      #6 /home/forge/example.com/public/wire
      This error message was shown because: you are logged in as a Superuser. Error has been logged.
      I tried several things, listed in this thread: https://serverfault.com/questions/791729/ubuntu-16-04-server-mysql-open-file-limit-wont-go-higher-than-65536
      But for some reason MySQL was not getting its limit increased. In the end, this is what did the trick for me on Ubuntu Xenial 16.04:
      Create the dir /etc/systemd/system/mysql.service.d and put this in /etc/systemd/system/mysql.service.d/override.conf:

          [Service]
          LimitNOFILE=1024000

      Now execute:

          systemctl daemon-reload
          systemctl restart mysql.service

      (Yes indeed, LimitNOFILE=infinity actually seems to set it to 65536.)
      You can validate the above after starting MySQL by doing:

          cat /proc/$(pgrep mysql)/limits | grep files
    • By bot19
      I've never encountered this issue before. My local install is using AMPPS 3.7, set up about 3 months ago.
      Everything was working fine last night, and I think possibly this morning, before it completely stopped with these errors. Please see below.
      I have not changed any config settings or access settings.
      The only two other things I did yesterday were logging into phpMyAdmin to export the DB, and I think an Xcode update was installed.
      Can anyone help?
      I know people have run into these issues when they migrate, but I haven't migrated, set up, or changed anything.
      Some people have said to do something like this in phpMyAdmin's config.inc.php file:

          $cfg['Servers'][$i]['AllowNoPassword'] = true;

      I tried, but that didn't work.
      I even tried changing this in site/config.php:

          $config->dbHost = 'localhost';

      to 127.0.0.1, but no good.
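      For what it's worth, these are the connection settings worth double-checking in site/config.php; the port and socket values below are typical AMPPS guesses, not confirmed:

          $config->dbHost = '127.0.0.1'; // forces TCP instead of the unix socket
          $config->dbPort = 3306;        // AMPPS default port (assumption)
          // $config->dbSocket = '/Applications/AMPPS/mysql/tmp/mysql.sock'; // hypothetical path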


    • By creativejay
      I'm displaying a list of products which are found by their templates, but the pages are taking a very long time to load. At first I blamed my image rendering (using PIM2), but even with all those images now stored in the file tree, the page takes abysmally long to load. ProCache seems to help, but I don't feel that what I'm trying to do should be gnawing the bones of my resources quite so long.
      The variable for the selector is defined in my header include:
      $productCatList = "prod_series|prod_series_ethernet|prod_series_access|prod_series_accessories|prod_series_fiber|prod_series_pwr_supplies|prod_series_pwr_systems|prod_series_wireless";
      $getCurrentProdOptions = "template=$productCatList, prod_status_pages!=1554|1559|1560|4242";

      Then in the template for the page upon which the directory loads:
      $products = $pages->find("$getCurrentProdOptions");
      include_once("./prod-list-row.inc");
      echo $out;

      And the prod-list-row.inc foreach (which is on every page that's exhibiting the slowdown):
      <?php
      $sum = 0;
      $out = "";
      $out .= "<div class='span_12_of_12'>\n";
      foreach($products as $p) {
          $sum += 1;
          // alternate row background colors
          if($sum % 2 == 0) { $bgcolor = '#fff'; } else { $bgcolor = '#e4e4e4'; }
          $par = $p->parent;
          $out .= "<div class='section group' style='background: $bgcolor; min-height: 110px'>\n";
          // PIM2 thumbnail generation for the product image
          $img = $p->prod_image;
          $thumb = $img->pim2Load('squarethumb100')->canvas(100, 100, array(0,0,0,0), 'north', 0)->pimSave()->url;
          $out .= "<div data-match-height='{$p->title}' class='col span_2_of_12 hide'>";
          $out .= "<a href='{$p->url}'><span class='product-image-box'><img src='{$thumb}' alt='{$p->title}' title='{$p->title}'></span></a>";
          $out .= "</div>";
          $out .= "<div data-match-height='{$p->title}' class='col span_6_of_12'>";
          $out .= "<div class='prod-list-name-label'><a href='{$p->url}'>{$p->title}</a></div>";
          if($page != $par) {
              $out .= "<div class='prod-list-category-label' style='font-size: .7em;'>Category: <a href='{$par->url}'>{$par->title}</a></div>";
          }
          $out .= "<div class='list-headline' style='font-size: .8em;'>{$p->headline}</div>";
          $out .= "<div class='learn-more-buttons-sm'>";
          $out .= "<a href='{$p->url}' title='Product Specs and Documentation'><span class='find-out-more-button' style='font-size: .8em;'><i style='font-size: .8em;' class='fa fa-lightbulb-o'></i> &nbsp; Learn More</span></a>";
          $out .= "</div>";
          $out .= "</div>\n";
          $out .= "<div data-match-height='{$p->title}' class='col span_4_of_12'>";
          if(count($p->prod_feat_imgs) > 0) {
              // note: the original post had a broken attribute here ("...icons-list' margin: ..."); style= restored
              $out .= "<div class='featured-icons-list' style='margin: 2em .5em;'>";
              foreach($p->prod_feat_imgs as $feat) {
                  // one page load per repeater row -- see the note below the post
                  $icon = $pages->get("$feat->prod_featicon_pages");
                  if($icon->image) {
                      if($feat->prod_feat_textlang) { $icontitle = $feat->prod_feat_textlang; } else { $icontitle = $icon->title; }
                      $out .= "<img src='" . $icon->image->size(35, 35, $imgOptions)->url . "' alt='" . $icontitle . "' title='" . $icontitle . "' class='listing-feat-icon' style='margin-right: .5em;' />";
                  }
              }
              $out .= "</div>";
              if($p->prod_product_line) {
                  foreach($p->prod_product_line as $pline) {
                      if($pline->image) {
                          $out .= "<div style='height: 35px;'>\n";
                          $out .= "<img src='{$pline->image->size(75, 35, $imgOptions)->url}' alt='{$pline->title}' />";
                          $out .= "</div>";
                      }
                  }
              }
          }
          $out .= "</div>";
          $out .= "</div>";
      }
      $out .= "</div>";
      Is there a clear culprit here of what I'm doing that's so stressing the system?
      I turned off TracyDebugger because I saw another thread about that causing slowdown (even though I'm using the latest), but that had no effect. Every time I thought I found the culprit and commented it out, nothing changed.
      Would appreciate some more eyes on this. Thank you!
      ETA: prod_feat_imgs is a repeater field which contains a Page reference field (from which I pull the image and title) and a multilanguage text field (to override the page reference title if it exists). Could that be the problem?
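      One likely suspect in the loop above: the $pages->get() call inside the repeater foreach is one page load per row, per product. A hedged sketch of batching those lookups before the loop, using the field names from the post:

          // collect all icon page IDs across all products, then load them once
          $iconIds = array();
          foreach($products as $p) {
              foreach($p->prod_feat_imgs as $feat) {
                  $iconIds[] = (int) "$feat->prod_featicon_pages";
              }
          }
          $icons = $pages->find("id=" . implode('|', array_unique($iconIds)));
          // inside the render loop, swap $pages->get(...) for:
          // $icon = $icons->get("id=$feat->prod_featicon_pages");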
    • By Mirza
      I have built a system in ProcessWire which has more than 600K pages.
      A team of 40 people is using the system; the DB is on AWS with 16 GB RAM.
      But still, select queries are getting locked.
      It would be great if someone could suggest how to solve this problem.
      Also note: we have around 48 fields in one template.
      Thanks in advance.
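      Given the topic of this thread, it is worth checking whether those tables are still MyISAM: table-level locks under 40 concurrent writers would explain blocked selects. A hedged sketch using ProcessWire's $database (the ALTER line is commented out on purpose; back up first):

          // list the storage engine of each field table (read-only check)
          $result = $database->query("SHOW TABLE STATUS WHERE Name LIKE 'field_%'");
          foreach($result->fetchAll(\PDO::FETCH_ASSOC) as $row) {
              echo "{$row['Name']}: {$row['Engine']}\n";
              // to convert: $database->exec("ALTER TABLE `{$row['Name']}` ENGINE=InnoDB");
          }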