MySQL 8.0统计信息不准确的原因

 更新时间:2020年08月27日 11:52:59   作者:Bright@DBA  
这篇文章主要介绍了MySQL 8.0统计信息不准确的原因,帮助大家更好的理解和学习MySQL8.0的相关内容,感兴趣的朋友可以了解下

前言

不管是Oracle还是MySQL,新版本推出的新特性,一方面给产品带来功能、性能、用户体验等方面的提升,另一方面也可能会带来一些问题,如代码bug、客户使用方法不正确引发问题等等。

案例分享

MySQL 5.7下的场景

(1)首先,创建两张表,并插入数据

mysql> select version();
+------------+
| version() |
+------------+
| 5.7.30-log |
+------------+
1 row in set (0.00 sec)

mysql> show create table test\G
*************************** 1. row ***************************
    Table: test
Create Table: CREATE TABLE `test` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `k` int(10) unsigned NOT NULL DEFAULT '0',
 `c` char(120) NOT NULL DEFAULT '',
 `pad` char(60) NOT NULL DEFAULT '',
 PRIMARY KEY (`id`),
 KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=101 DEFAULT CHARSET=utf8mb4 MAX_ROWS=1000000
1 row in set (0.00 sec)

mysql> show create table sbtest1\G
*************************** 1. row ***************************
    Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `k` int(10) unsigned NOT NULL DEFAULT '0',
 `c` char(120) NOT NULL DEFAULT '',
 `pad` char(60) NOT NULL DEFAULT '',
 PRIMARY KEY (`id`),
 KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=utf8mb4 MAX_ROWS=1000000
1 row in set (0.00 sec)

mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
|   100 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.14 sec)

(2)查看两张表的统计信息,均比较准确

mysql> select table_schema,table_name,table_rows from tables where table_name='test';
+--------------+------------+------------+
| table_schema | table_name | table_rows |
+--------------+------------+------------+
| test     | test    |    100 |
+--------------+------------+------------+
1 row in set (0.00 sec)

mysql> select table_schema,table_name,table_rows from tables where table_name='sbtest1';
+--------------+------------+------------+
| table_schema | table_name | table_rows |
+--------------+------------+------------+
| test     | sbtest1  |   947263 |
+--------------+------------+------------+
1 row in set (0.00 sec)

(3)我们持续往test表插入1000w条记录,并再次查看统计信息,还是相对准确的,因为在默认情况下,数据变化量超过10%,就会触发统计信息更新

mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
| 10000100 |
+----------+
1 row in set (1.50 sec)

mysql> select table_schema,table_name,table_rows from tables where table_name='test';
+--------------+------------+------------+
| table_schema | table_name | table_rows |
+--------------+------------+------------+
| test     | test    |  9749036 |
+--------------+------------+------------+
1 row in set (0.00 sec)

MySQL 8.0下的场景

(1)接下来我们看看8.0下的情况吧,同样地,我们创建两张表,并插入相同记录

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.20  |
+-----------+
1 row in set (0.00 sec)

mysql> show create table test\G
*************************** 1. row ***************************
    Table: test
Create Table: CREATE TABLE `test` (
 `id` int unsigned NOT NULL AUTO_INCREMENT,
 `k` int unsigned NOT NULL DEFAULT '0',
 `c` char(120) NOT NULL DEFAULT '',
 `pad` char(60) NOT NULL DEFAULT '',
 PRIMARY KEY (`id`),
 KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=101 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci MAX_ROWS=1000000
1 row in set (0.00 sec)

mysql> show create table sbtest1\G
*************************** 1. row ***************************
    Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
 `id` int unsigned NOT NULL AUTO_INCREMENT,
 `k` int unsigned NOT NULL DEFAULT '0',
 `c` char(120) NOT NULL DEFAULT '',
 `pad` char(60) NOT NULL DEFAULT '',
 PRIMARY KEY (`id`),
 KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci MAX_ROWS=1000000
1 row in set (0.00 sec)

mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
|   100 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.02 sec)

(2)查看两张表的统计信息,均比较准确

mysql> select table_schema,table_name,table_rows from tables where table_name='test';
+--------------+------------+------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS |
+--------------+------------+------------+
| test     | test    |    100 |
+--------------+------------+------------+
1 row in set (0.00 sec)

mysql> select table_schema,table_name,table_rows from tables where table_name='sbtest1';
+--------------+------------+------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS |
+--------------+------------+------------+
| test     | sbtest1  |   947468 |
+--------------+------------+------------+
1 row in set (0.01 sec)

(3)同样地,我们持续往test表插入1000w条记录,并再次查看统计信息,发现table_rows显示还是100条,出现了较大偏差

mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
| 10000100 |
+----------+
1 row in set (0.33 sec)

mysql> select table_schema,table_name,table_rows from tables where table_name='test';
+--------------+------------+------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS |
+--------------+------------+------------+
| test     | test    |    100 |
+--------------+------------+------------+
1 row in set (0.00 sec)

原因剖析

那么导致统计信息不准确的原因是什么呢?其实是MySQL 8.0为了提高information_schema的查询效率,将视图tables和statistics里面的统计信息缓存起来,缓存过期时间由参数information_schema_stats_expiry决定,默认为86400s;如果想获取最新的统计信息,可以通过如下两种方式:

(1)analyze table进行表分析

(2)设置information_schema_stats_expiry=0

继续探索

那么统计信息不准确,会带来哪些影响呢?是否会影响执行计划呢?接下来我们再次进行测试

测试1:表test记录数100,表sbtest1记录数100w

执行如下SQL,查看执行计划,走的是NLJ,小表test作为驱动表(全表扫描),大表sbtest1作为被驱动表(主键关联),执行效率很快

mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
|   100 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.02 sec)

mysql> select table_schema,table_name,table_rows from tables where table_name='test';
+--------------+------------+------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS |
+--------------+------------+------------+
| test     | test    |    100 |
+--------------+------------+------------+
1 row in set (0.00 sec)

mysql> select table_schema,table_name,table_rows from tables where table_name='sbtest1';
+--------------+------------+------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS |
+--------------+------------+------------+
| test     | sbtest1  |   947468 |
+--------------+------------+------------+
1 row in set (0.01 sec)

mysql> select t.* from test t inner join sbtest1 t1 on t.id=t1.id where t.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977' and t1.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977';
+----+--------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k   | c                                                            | pad                             |
+----+--------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 1 | 501885 | 08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977 | 63188288836-92351140030-06390587585-66802097351-49282961843 |
+----+--------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain select t.* from test t inner join sbtest1 t1 on t.id=t1.id where t.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977' and t1.c='08566691963-88624912351-16662227201-4664
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key   | key_len | ref    | rows | filtered | Extra    |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------+------+----------+-------------+
| 1 | SIMPLE   | t   | NULL    | ALL  | PRIMARY    | NULL  | NULL  | NULL   | 100 |  10.00 | Using where |
| 1 | SIMPLE   | t1  | NULL    | eq_ref | PRIMARY    | PRIMARY | 4    | test.t.id |  1 |  10.00 | Using where |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

测试2:表test记录数1000w左右,表sbtest1记录数100w

再次执行SQL,查看执行计划,走的也是NLJ,相对小表sbtest1作为驱动表,大表test作为被驱动表,也是正确的执行计划

mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
| 10000100 |
+----------+
1 row in set (0.33 sec)

mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.02 sec)

mysql> select table_schema,table_name,table_rows from tables where table_name='test';
+--------------+------------+------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS |
+--------------+------------+------------+
| test     | test    |    100 |
+--------------+------------+------------+
1 row in set (0.00 sec)

mysql> select table_schema,table_name,table_rows from tables where table_name='sbtest1';
+--------------+------------+------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS |
+--------------+------------+------------+
| test     | sbtest1  |   947468 |
+--------------+------------+------------+
1 row in set (0.01 sec)

mysql> select t.* from test t inner join sbtest1 t1 on t.id=t1.id where t.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977' and t1.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977';
+----+--------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k   | c                                                            | pad                             |
+----+--------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 1 | 501885 | 08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977 | 63188288836-92351140030-06390587585-66802097351-49282961843 |
+----+--------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
1 row in set (0.37 sec)

mysql> explain select t.* from test t inner join sbtest1 t1 on t.id=t1.id where t.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977' and t1.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977';
+----+-------------+-------+------------+--------+---------------+---------+---------+------------+--------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key   | key_len | ref    | rows  | filtered | Extra    |
+----+-------------+-------+------------+--------+---------------+---------+---------+------------+--------+----------+-------------+
| 1 | SIMPLE   | t1  | NULL    | ALL  | PRIMARY    | NULL  | NULL  | NULL    | 947468 |  10.00 | Using where |
| 1 | SIMPLE   | t   | NULL    | eq_ref | PRIMARY    | PRIMARY | 4    | test.t1.id |   1 |  10.00 | Using where |
+----+-------------+-------+------------+--------+---------------+---------+---------+------------+--------+----------+-------------+
2 rows in set, 1 warning (0.01 sec)

为什么优化器没有选择错误的执行计划呢?之前文章也提过,MySQL 8.0是将元数据信息存放在mysql库下的数据字典表里,information_schema库只是提供相对方便的视图供用户查询,所以优化器在选择执行计划时,会从数据字典表中获取统计信息,生成正确的执行计划。

总结

MySQL 8.0为了提高information_schema的查询效率,会将视图tables和statistics里面的统计信息缓存起来,缓存过期时间由参数information_schema_stats_expiry决定(建议设置该参数值为0);这可能会导致用户查询相应视图时,无法获取最新、准确的统计信息,但并不会影响执行计划的选择。

以上就是MySQL 8.0统计信息不准确的原因的详细内容,更多关于MySQL 8.0统计信息不准确的资料请关注脚本之家其它相关文章!

相关文章

  • MySQL数据库服务器逐渐变慢分析与解决方法分享

    MySQL数据库服务器逐渐变慢分析与解决方法分享

    本文针对MySQL数据库服务器逐渐变慢的问题, 进行分析,并提出相应的解决办法
    2012-01-01
  • Mysql DDL常见操作汇总

    Mysql DDL常见操作汇总

    这篇文章主要介绍了Mysql DDL常见操作汇总,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习价值,需要的朋友们下面随着小编来一起学习学习吧
    2020-09-09
  • MySQL mysqladmin客户端的使用简介

    MySQL mysqladmin客户端的使用简介

    这篇文章主要介绍了MySQL mysqladmin客户端的使用简介,帮助大家更好的理解和学习使用MySQL,感兴趣的朋友可以了解下
    2021-03-03
  • mysql 5.7.20解压版安装方法步骤详解(两种方法)

    mysql 5.7.20解压版安装方法步骤详解(两种方法)

    本文给大家分享mysql 5.7.20解压版安装方法步骤详解,本文给大家分享两种方法,非常不错,具有参考借鉴价值,需要的朋友参考下吧
    2017-11-11
  • MySql分表、分库、分片和分区知识深入详解

    MySql分表、分库、分片和分区知识深入详解

    这篇文章主要介绍了MySql分表、分库、分片和分区知识深入详解,如果有并发场景和数据量较大的场景的可以看一下文章,对你会有或多或少的帮助
    2021-03-03
  • MySQL中查询、删除重复记录的方法大全

    MySQL中查询、删除重复记录的方法大全

    mysql中删除重复记录的方法有很多种,下面这篇文章主要给大家总结了在MySQL中查询、删除重复记录的方法大全,文中给出了详细的示例代码供大家参考学习,需要的朋友下面来一起看看吧。
    2017-06-06
  • mysql单字段多值分割和合并的处理方法

    mysql单字段多值分割和合并的处理方法

    这篇文章主要给大家介绍了关于mysql单字段多值分割和合并的处理方法,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习价值,需要的朋友们下面随着小编来一起学习学习吧
    2021-01-01
  • 详解分析MySQL8.0的内存消耗

    详解分析MySQL8.0的内存消耗

    这篇文章主要介绍了详解分析MySQL8.0的内存消耗,帮助大家更好的理解和学习使用MySQL,感兴趣的朋友可以了解下
    2021-03-03
  • 浅析MySQL 备份与恢复

    浅析MySQL 备份与恢复

    这篇文章主要介绍了MySQL 备份与恢复的相关资料,帮助大家更好的理解和学习MySQL,感兴趣的朋友可以了解下
    2020-08-08
  • MySQL锁阻塞的深入分析

    MySQL锁阻塞的深入分析

    这篇文章主要给大家介绍了关于MySQL锁阻塞的相关资料,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习价值,需要的朋友们下面随着小编来一起学习学习吧
    2020-12-12

最新评论