0. 索引很重要
之前列举记录用了下面的语句。state字段为索引。
SELECT * FROM feed_urls WHERE state='ok' AND feed_url<>'' LIMIT N,10
当记录数量很大时,有几万之后,这句SQL就很慢了。主要是因为feed_url没有建立索引。后来的解决方法是,把feed_url为空的,设为一个ok以外的state值,就行了。
1. 索引不是万能的
为了计算记录总数,下面的语句会很慢。
mysql> SELECT COUNT(*) FROM feed_urls WHERE state='error';
+----------+
| COUNT(*) |
+----------+
| 30715 |
+----------+
1 row in set (0.14 sec)
mysql> EXPLAIN SELECT COUNT(*) FROM feed_urls WHERE state='error'G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: feed_urls
type: ref
possible_keys: state,page_index
key: page_index
key_len: 10
ref: const
rows: 25936
Extra: Using where; Using index
1 row in set (0.00 sec)
state为索引,请求用时140ms。遍历了state='error'索引下的每一条记录。
mysql> SELECT state,COUNT(*) FROM feed_urls GROUP BY state;
+----------+----------+
| state | COUNT(*) |
+----------+----------+
| error | 30717 |
| fetching | 8 |
| nofeed | 76461 |
| ok | 74703 |
| queued | 249681 |
+----------+----------+
5 rows in set (0.55 sec)
mysql> EXPLAIN SELECT state,COUNT(*) FROM feed_urls GROUP BY stateG
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: feed_urls
type: index
possible_keys: NULL
key: state
key_len: 10
ref: NULL
rows: 431618
Extra: Using index
1 row in set (0.00 sec)
请求用时550ms。遍历了每个state下的每一条记录。
改进方法:
独立一个表用来计数,使用MySQL的Trigger同步计数:
CREATE TRIGGER my_trigger AFTER UPDATE ON feed_urls
FOR EACH ROW BEGIN
IF OLD.state <> NEW.state THEN
IF NEW.state='ok' THEN
UPDATE feed_stat SET count_feed = count_feed + 1;
END IF;
IF NEW.state IN ('ok', 'error', 'nofeed') THEN
UPDATE feed_stat SET count_access = count_access + 1;
END IF;
END IF;
END
2. 当分页很大时
mysql> SELECT * FROM feed_urls LIMIT 230000, 1G
*************************** 1. row ***************************
id: 736841f82abb0bc87ccfec7c0fdbd09c30b5a24d
link: http://mappemunde.typepad/
title: Tim Peterson
feed_url: NULL
update_time: 2012-05-12 11:01:56
state: queued
http_server: NULL
abstract: NULL
previous_id: ceea30e0ba609b69198c53ce71c44070d69038c5
ref_count: 1
error: NULL
aid: 230001
1 row in set (0.50 sec)
mysql> EXPLAIN SELECT * FROM feed_urls LIMIT 230000, 1G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: feed_urls
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 431751
Extra:
1 row in set (0.00 sec)
读取一条记录,耗时500ms,因为表记录是变长的,所以MySQL不能算出目标位置,只能每一条记录的数过去。
改进方法:
通过索引定位,数索引比数记录要快,因为索引占用的空间比整条记录小很多。
mysql> SELECT * FROM (SELECT aid FROM feed_urls ORDER BY aid LIMIT 215000, 1) d JOIN feed_urls u ON d.aid=u.aidG