
Intermittent "MySQL server has gone away" errors on load balanced hits on plain home page



Hi Folks,

I've been trying to figure this one out for quite some time, with the help of my hosting company, but have not yet found a solution.

One of the accounts I work on is running PW v 2.3.0 on a load-balanced, two node Linux system.

The home page is deliberately plain, i.e. completely empty of content (since it's only a backend application with no front-end presence.)

I say that to clarify that the home page is tiny, with no extra database calls.

[edit: by backend app, I mean that it's a front-end app with a login. Only the home page and a login page are available to the guest user.]

The load balancer hits the home page on both nodes about every 30 seconds, as a health monitor check.

Since it hits "/", it's my assumption that it pulls the index.php file and displays the empty home page.

The home page is viewable by guests, which is reflected in the error message below.

For the last few weeks, I've received intermittent "MySQL server has gone away" errors, with the email coming from both of the nodes.

They tend to be clumped together, e.g. 5 or 10 in a few minutes, and then they stop.

The error messages come in from once a day to 4 or 5 times a day, and don't seem to have any correlation to server activity (that I've found so far.)

I naturally thought the error simply meant that the connected MySQL cluster was having trouble, but the host admin checked the logs closely around the times of those errors and found no MySQL problems.

We've logged into the back end many times without any errors. The plain home page also doesn't show errors.

I don't know if it's a 'guest' session problem or not. I turned on debug mode and the message expanded to what I've pasted below.

Any brilliant, genius thoughts would be deeply appreciated.

Peter (see error message below)

=============

Page: /?/
User: ?

Error:
Exception: MySQL server has gone away

SELECT false AS isLoaded, pages.templates_id AS templates_id, pages.*, pages_sortfields.sortfield, (SELECT COUNT(*) FROM pages AS children WHERE children.parent_id=pages.id) AS numChildren,field_email.data AS `email__data`
FROM `pages`
LEFT JOIN pages_sortfields ON pages_sortfields.pages_id=pages.id
LEFT JOIN field_email ON field_email.pages_id=pages.id
WHERE pages.parent_id=29
AND pages.templates_id=3
AND pages.id IN(40)
GROUP BY pages.id  (in /home/user_account/public_html/wire/core/Database.php line 118)

#0 /home/user_account/public_html/wire/core/DatabaseQuery.php(84): Database->query(Object(DatabaseQuerySelect))
#1 /home/user_account/public_html/wire/core/Pages.php(319): DatabaseQuery->execute()
#2 /home/user_account/public_html/wire/core/PagesType.php(109): Pages->getById(Array, Object(Template), 29)
#3 /home/user_account/public_html/wire/core/Users.php(60): PagesType->get(40)
#4 /home/user_account/public_html/wire/core/Session.php(82): Users->getGuestUser()
#5 /home/user_account/public_html/
 

============


Hello Peter,

 

PW 2.3.0 uses PDO, and it seems that with some server configurations this can happen. There is a fix from Hari KT that I have used in place of the existing WireDatabasePDO.php in a PW 2.3.5 installation, and since then I haven't had that (rare) error.

Ryan has already included it in the current dev branch (PW 2.4.1), too.

 

Maybe you'd like to try it.

 

WireDatabasePDO.zip


Dear Horst,

Thanks for this tip. I had read the post by Hari, and Ryan's response.

It seems to me that the reconnect to mysql code would come in handy, but it also seemed that it was meant for situations where there was quite a lag between commands, like with thousands of records being processed.

In my case, it was happening with load balancer hits against the index page.

Also, my version of PW didn't have the file 'WireDatabasePDO.php'. It just had Database.php.

But... your post, and Hari's comment on the GitHub page, https://github.com/ryancramerdesign/ProcessWire/pull/366, stimulated my little grey cells, so I went and looked at the wait_timeout value.

It was set to 30 seconds, and the load balancer checks every 30 seconds, so I thought that might be it.

So I raised the value to 300 seconds.
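
For anyone checking the same setting, here is a minimal sketch of the MySQL statements involved. It assumes an account that is allowed to change global variables, which may not be the case on managed hosting; the 300 is just the value mentioned above, not a recommendation.

```sql
-- Timeout (in seconds) that applies to the current connection.
SHOW VARIABLES LIKE 'wait_timeout';

-- Server-wide default that new non-interactive connections inherit.
SHOW GLOBAL VARIABLES LIKE 'wait_timeout';

-- Raise the server-wide default. This affects only connections opened
-- afterwards and is lost when the server restarts.
SET GLOBAL wait_timeout = 300;

-- Counter that goes up when the server drops a client connection,
-- including connections that sat idle longer than wait_timeout.
SHOW GLOBAL STATUS LIKE 'Aborted_clients';
```

To keep the value across restarts, it would also need to be set in the server's configuration file (wait_timeout = 300 under the [mysqld] section).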

I'm hoping that will fix it. Crossing my fingers.

Thanks for your help!

Peter

