Jump to content

MySQL GROUP BY - error help


SamC
 Share

Recommended Posts

Not processwire specific, this is to do with what I've been learning in MySQL today. I can't get passed this error and I have no idea what it means. I looked up the docs (and googled the error) but tbh, I don't understand any explanations either. Made a table and tried to use GROUP BY like so, I'm probably missing the point of GROUP BY so any advice would be awesome (there's extra steps in here so I can test out how to insert/alter etc. it's the error that's throwing me):

DROP TABLE golfscores;

CREATE TABLE golfscores (
    id INT AUTO_INCREMENT PRIMARY KEY,
    member_name CHAR(255),
    member_score INT
);

INSERT INTO golfscores (member_name, member_score)
VALUES ("Sam", 30);

INSERT INTO golfscores (member_name, member_score)
VALUES ("Ben", 28);

INSERT INTO golfscores (member_name, member_score)
VALUES ("Felicia", 26);

INSERT INTO golfscores (member_name, member_score)
VALUES ("Sam", 38);

ALTER TABLE golfscores
ADD member_id INT
AFTER member_name;

UPDATE golfscores
SET member_id = 222
WHERE member_name = "Sam";

UPDATE golfscores
SET member_id = 333
WHERE member_name = "Ben";

UPDATE golfscores
SET member_id = 444
WHERE member_name = "Felicia";

SELECT *
FROM golfscores;
id   member_name    member_id    member_score
1    Sam            222          30
2    Ben            333          28
3    Felicia        444          26
4    Sam            222          38

So far, so good! Now when I try and run:

SELECT *
FROM golfscores
GROUP BY member_id;

I get the error:

Quote

 

16:52:51    SELECT * FROM golfscores GROUP BY member_id LIMIT 0, 1000

Error Code: 1055. Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'phpdb.golfscores.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by    0.000 sec

 

Can anyone (attempt to) tell me what this means please (in plain English)?

Thanks.

Link to comment
Share on other sites

This means that your MySQL server is in strict mode regarding group by expressions. MySQL used to be rather lenient there, but normally standard SQL requires that you name all non-aggregate (SUM, COUNT, AVG etc.) columns that appear in the SELECT clause also in the GROUP BY expression. Since along version 5.6, the ONLY_FULL_GROUP_BY settings was enabled as a default, and nowadays many hosters and Linux distributions also set this flag in installations of 5.5.

Your "SELECT *" include the columns id, member_name, member_id and member_score, so for MySQL it looks like:

SELECT id, member_name, member_id, member_score
FROM golfscores
GROUP BY member_id
LIMIT 0, 1000;

To understand why MySQL complains, you have to understand what MySQL does under the hood:

  1. It scans through the member_id column (or its index) for all unique values.
  2. For any  of the other columns, since they aren't in the group by, it picks the first value it finds in a row found in the first step.
  3. This means, in your example data, it might find a member_score of either 30 or 38 for member_id 222. Which one it finds first and returns is pure chance.

Since the SQL standard desires predictability and doesn't like chance at all, such a behavior is unacceptable. Very few databases allow such a thing at all.

On the other hand, if you, as the developer, know that mismatched rows like in your example cannot be found in the database (i.e. only one combination for id, member_name and member_score is in the table for each member_id), grouping just by member_name tells MySQL "don't worry about checking uniqueness and order of the other columns, just return the first hit for each member_id" and saves the database server (sometimes quite) a bit of lookup work for those columns.

This is, of course, a bit dangerous.

Thus, the ONLY_FULL_GROUP_BY setting was switched on at some point. You can switch it off per connection, or globally for all new connection through a SET statement, but these won't be persisted (PW does this every time it initializes a database connection). If you want to use that optimized group by syntax, the best way is to do it in my.cnf (the server configuration). Somewhere there, you will find line in the [mysqld] section with the sql_mode. Remove ONLY_FULL_GROUP_BY from that line.

/* MySQL < 5.7.x */
sql_mode = NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES,ONLY_FULL_GROUP_BY

/* newer MySQL */
sql-mode="NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES,ONLY_FULL_GROUP_BY"

Make sure to keep the format (underscore vs. dash, no quotes vs. double quotes) the same and restart the server.

  • Like 4
  • Thanks 1
Link to comment
Share on other sites

From what I understand, you are attempting to group the resulting data by a column that is not unique; There are two entries of 222, which confuses mysql because it doesn't know which one to use for the group.

Why are you wanting to group results of this particular query? I understand that you are learning sql, but this example isn't the best option to learn GROUP BY.

[edit]
@BitPoet replied.
The only concern I have is disabling a warning rather than fixing an issue.

I would rather use any_value(column_name) to tell mysql to pick one.

  • Like 1
Link to comment
Share on other sites

33 minutes ago, rick said:

Why are you wanting to group results of this particular query?

It's in my book, he uses GROUP BY on a similar table (with some output though, not an error). I think it's painfully clear I had no idea what GROUP BY does! Lol.

However, thanks for the replies. I don't want to disable warnings, rather write decent code. @BitPoets explanation made things a lot clearer!

  • Like 1
Link to comment
Share on other sites

1 hour ago, SamC said:

It's in my book, he uses GROUP BY on a similar table (with some output though, not an error)

The author may be using a previous mysql version that did not have this option enabled by default.

I was just curious because I couldn't see a instructional site/book giving you wrong information. And it doesn't matter at all whether you know what group by does. The point is that you are wanting to learn everything you can. We all can learn something new regardless of experience.

  • Like 1
Link to comment
Share on other sites

Just an FYI that PW actually disabled the strict setting to avoid this error. This is something that I think should be changed in the future.

I dealt with upgrading a lot of complex queries on a non-PW project a while ago and it can be a pretty painful migration - in some cases the results can actually be quite different so some caution is definitely needed.

  • Like 2
Link to comment
Share on other sites

14 hours ago, rick said:

I was just curious because I couldn't see a instructional site/book giving you wrong information.

I should have mentioned that he shows this example as a way of showing how the results of this query are undesirable (i.e. only prints first value it finds). Except in my case, it just s***s the bed and complains.

He continues to show that GROUP BY would normally be used with MySQL functions, MIN(), MAX(), COUNT(), SUM(), AVG(), NOW(), CONCAT() and UNIX_TIMESTAMP().

Guess what I'll be doing today :rolleyes:

Link to comment
Share on other sites

That is exactly what @BitPoet said about the aggregate functions.

I'm wondering if you can view the various PW queries via Tracy as further examples while studying. It would help to see actual data being used. Applying your instructions to actual content would help with understanding how PW works internally. For example, Seeing how PW creates queries from selectors.

Link to comment
Share on other sites

2 minutes ago, rick said:

That is exactly what @BitPoet said about the aggregate functions.

I'm wondering if you can view the various PW queries via Tracy as further examples while studying. It would help to see actual data being used. Applying your instructions to actual content would help with understanding how PW works internally. For example, Seeing how PW creates queries from selectors.

Yeah totally! I saw that suggested by @adrian in the other MySQL thread. It's gonna be very interesting to see what my $pages->find() are actually doing.

  • Like 1
Link to comment
Share on other sites

On 1/28/2018 at 2:44 AM, SamC said:

Yeah totally! I saw that suggested by @adrian in the other MySQL thread. It's gonna be very interesting to see what my $pages->find() are actually doing.

I have just been playing around with the "Selector Queries" section of the Tracy Debug Mode panel. Now it adds the SQL Query column to show you the query that was used for each of the selectors called in generating the current page. I am not yet convinced it's perfect for all selectors, so I haven't committed it yet, but if anyone would like to try it out, I can either post here or push to a dev branch.

Also, any thoughts on this? How useful it is, and if you have ideas for making it more useful.

image.thumb.png.4e36a74cb0882b4f70b6474a136c5507.png

  • Like 4
Link to comment
Share on other sites

9 hours ago, adrian said:

Also, any thoughts on this? How useful it is, and if you have ideas for making it more useful.

This does look really useful if it gets "perfect for all selectors", since it is a great way to get a different point of view of the selectors we write. For example I am very slow at writing SQL since I rarely have to do it (thanks to ProcessWire :) ) but I can read it so this means I can more easily identify errors in complex selectors just by taking a look at the the actual queries. Also, spotting query related performance issues gets easier. I'm thinking of seeing lots of JOINs and being able to reduce them should they only be a result of a not well written selector. At least this is my understanding right now. Correct me if I'm mistaken...

  • Like 1
Link to comment
Share on other sites

13 hours ago, adrian said:

I am not yet convinced it's perfect for all selectors, so I haven't committed it yet, but if anyone would like to try it out, I can either post here or push to a dev branch.

I'm writing a tutorial on using tracy soon, started it the other day. It's a chance to finally get my feet wet with it myself, and to share whatever I learn along the way. I'd give the dev branch a go.

  • Like 2
Link to comment
Share on other sites

Just a quick note. Whilst it's ok to do a "Select * from ..." query whilst you're debugging or developing, you shouldn't use the * clause in your final production version.

Why? 
1) Efficiency. You may not need all the fields from the table(s) you're selecting from. Reducing the fields will make your query return quicker.
2) If 1 of the underlying tables change in the future,  "Select *" might be masking some unintended consequences. Listing individual fields would have made it easier for future maintenance and spot any potential bugs.

I've definitely had to fix a few stored procedures in my lifetime where a structural change to a table resulted in tracking down why the stored procedures stopped working suddenly.
 

  • Like 3
Link to comment
Share on other sites

Sorry, I know I am getting way OT with this Tracy stuff, but just wanted to let you know that the latest version adds the SQL Query column to the Selector Queries section of the Debug Mode panel. It seems to be stable enough to release, but let me know if you get any errors.

Hopefully this should be a useful learning tool.

image.thumb.png.1e8f27979ba62203cfcb8833685a6c85.png

  • Like 4
Link to comment
Share on other sites

 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...