gRegor Posted February 12 Posted February 12 I have a PW site search that's used the ~= selector ("Contains all words" fulltext) for quite a while with no problems, but recently started occasionally throwing MySQL exception "3699: Timeout exceeded in regular expression match". My quick fix was to fallback to ~%= ("Contains all words like"), so it's not an active problem, but I'm curious if anyone else has run into this and has ideas for optimization. PW: 3.0.299 MySQL: 8.0.44 Search phrase entered: AMA Update Query from TracyDebugger (tried to format it a bit for better readability): SELECT pages.id ,pages.parent_id ,pages.templates_id ,MATCH(field_body.data) AGAINST('+update' IN BOOLEAN MODE) AS _score_field_body_data0 FROM `pages` JOIN field_body AS field_body ON field_body.pages_id=pages.id AND (((( MATCH(field_body.data) AGAINST('+update' IN BOOLEAN MODE) AND (field_body.data RLIKE '(\\b|[[:blank:]]|[[:punct:]]|[[:space:]]|^|[-]|>|‘|“|„|«|‹|¿|¡)AMA(\\b|[[:blank:]]|[[:punct:]]|[[:space:]]|$|[-]|<|’|”|»|›)') )))) WHERE (pages.status<1024) GROUP BY pages.id ORDER BY _score_field_body_data0 DESC; I'm new to the regex query options in MySQL, but here's some things I've found: Quote It is possible to write regular expressions that, left to their own devices, would consume a lot of CPU time or memory, especially using computation-heavy constructions like back-references. This is a well-known fact. That’s why we offer two ways of controlling execution space/time. Say hello to regexp_stack_limit and regexp_time_limit. These integer-typed global variables make the regular expression engine stop after a limit has been reached, so it won’t hog the CPU indefinitely or exhaust the memory. Here are some examples. https://dev.mysql.com/blog-archive/new-regular-expression-functions-in-mysql-8-0/
BitPoet Posted February 13 Posted February 13 There aren't any backreferences in the regex, so it's probably just a question of table+field size, but you can try upping regexp_time_limit. I wouldn't go into the three digits though if you have significant concurrent requests. Regular expressions are expensive, and tweaking limits only gets you so far before overall DB performance degrades too much. I was scratching my head a bit why a ~= would invoke a an rlike, but it's probably because AMA is only three letters, and minimum word length for MyISAM defaults to 4 (parameter ft_min_word_len). You may want to lower that to two (which means rebuilding the fulltext indexes), then PW will use native fulltext search instead of the regex workaround for short words. 2
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now