如何避免多对多查询中的“使用临时”？

施锋

2023-03-14

问题内容：

这个查询非常简单，我要做的就是按last_updated字段对给定类别中的所有文章进行排序：

SELECT
    `articles`.*
FROM
    `articles`,
    `articles_to_categories`
WHERE
        `articles`.`id` = `articles_to_categories`.`article_id`
        AND `articles_to_categories`.`category_id` = 1
ORDER BY `articles`.`last_updated` DESC
LIMIT 0, 20;

但是它运行非常缓慢。这是EXPLAIN所说的：

select_type  table                   type     possible_keys           key         key_len  ref                                rows  Extra
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SIMPLE       articles_to_categories  ref      article_id,category_id  article_id  5        const                              5016  Using where; Using temporary; Using filesort
SIMPLE       articles                eq_ref   PRIMARY                 PRIMARY     4        articles_to_categories.article_id  1

有没有一种方法可以重写此查询或向我的PHP脚本中添加其他逻辑，以避免Using temporary; Using filesort并加快速度？

表结构：

*articles*
id | title | content | last_updated

*articles_to_categories*
article_id | category_id

更新

我已经last_updated索引了。我想我的情况在文档中得到了解释：

在某些情况下，MySQL无法使用索引来解析ORDER BY，尽管它仍然使用索引来查找与WHERE子句匹配的行。这些情况包括：

用于获取行的键与在ORDER BY中使用的键不同：SELECT * FROM t1 WHERE key2 = constant ORDER BY
key1;

您正在联接许多表，并且ORDER BY中的列并非全部来自用于检索行的第一个非恒定表。（这是EXPLAIN输出中的第一个没有const
join类型的表。）

但我仍然不知道如何解决此问题。

问题答案：

这是我之前为类似性能相关问题所做的一个简化示例，该问题利用了innodb集群主键索引（显然仅适用于innodb！）。

http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html
http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/

您有3个表：category，product和product_category，如下所示：

drop table if exists product;
create table product
(
prod_id int unsigned not null auto_increment primary key,
name varchar(255) not null unique
)
engine = innodb;

drop table if exists category;
create table category
(
cat_id mediumint unsigned not null auto_increment primary key,
name varchar(255) not null unique
)
engine = innodb;

drop table if exists product_category;
create table product_category
(
cat_id mediumint unsigned not null,
prod_id int unsigned not null,
primary key (cat_id, prod_id) -- **note the clustered composite index** !!
)
engine = innodb;

最重要的是 product_catgeory集群复合主键 的 顺序， 因为在这种情况下，典型的查询总是以（x，y，z
…）中的cat_id = x或cat_id开头。

我们有 50万个 类别， 100万个 产品和 1.25亿个 产品类别。

select count(*) from category;
+----------+
| count(*) |
+----------+
|   500000 |
+----------+

select count(*) from product;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+

select count(*) from product_category;
+-----------+
| count(*)  |
+-----------+
| 125611877 |
+-----------+

因此，让我们看看该架构如何对与您的查询相似的查询执行。所有查询都在空缓冲区且无查询缓存的情况下冷运行（在mysql重新启动后）。

select
 p.*
from
 product p
inner join product_category pc on 
    pc.cat_id = 4104 and pc.prod_id = p.prod_id
order by
 p.prod_id desc -- sry dont a date field in this sample table - wont make any difference though
limit 20;

+---------+----------------+
| prod_id | name           |
+---------+----------------+
|  993561 | Product 993561 |
|  991215 | Product 991215 |
|  989222 | Product 989222 |
|  986589 | Product 986589 |
|  983593 | Product 983593 |
|  982507 | Product 982507 |
|  981505 | Product 981505 |
|  981320 | Product 981320 |
|  978576 | Product 978576 |
|  973428 | Product 973428 |
|  959384 | Product 959384 |
|  954829 | Product 954829 |
|  953369 | Product 953369 |
|  951891 | Product 951891 |
|  949413 | Product 949413 |
|  947855 | Product 947855 |
|  947080 | Product 947080 |
|  945115 | Product 945115 |
|  943833 | Product 943833 |
|  942309 | Product 942309 |
+---------+----------------+
20 rows in set (0.70 sec)

explain
select
 p.*
from
 product p
inner join product_category pc on 
    pc.cat_id = 4104 and pc.prod_id = p.prod_id
order by
 p.prod_id desc -- sry dont a date field in this sample table - wont make any diference though
limit 20;

+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref           | rows | Extra                                        |
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
|  1 | SIMPLE      | pc    | ref    | PRIMARY       | PRIMARY | 3       | const           |  499 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | p     | eq_ref | PRIMARY       | PRIMARY | 4       | vl_db.pc.prod_id |    1 |                                              |
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
2 rows in set (0.00 sec)

因此，那是0.70秒的寒冷-哎呀。

希望这可以帮助：）

编辑

刚刚阅读完您对以上评论的答复，看来您可以选择以下两种选择之一：

create table articles_to_categories
(
article_id int unsigned not null,
category_id mediumint unsigned not null,
primary key(article_id, category_id), -- good for queries that lead with article_id = x
key (category_id)
)
engine=innodb;

要么。

create table categories_to_articles
(
article_id int unsigned not null,
category_id mediumint unsigned not null,
primary key(category_id, article_id), -- good for queries that lead with category_id = x
key (article_id)
)
engine=innodb;

取决于您有关如何定义群集PK的典型查询。

如何避免多对多查询中的“使用临时”？

相关阅读

相关文章

相关问答

相关工具

相关文档