In MySQL 5.1 before 5.1.16, if you compare a
          NULL value to a subquery using
          ALL, ANY, or
          SOME, and the subquery returns an empty
          result, the comparison might evaluate to the nonstandard
          result of NULL rather than to
          TRUE or FALSE. As of
          5.1.16, the comparison evaluates to TRUE or
          FALSE except for subqueries inside
          IS NULL, such as this:
        
SELECT ... WHERE NULL IN (SELECT ...) IS NULL
          As of 5.1.32, the IS NULL limitation is
          removed and the comparison evaluates to
          TRUE or FALSE.
        
          A subquery's outer statement can be any one of:
          SELECT,
          INSERT,
          UPDATE,
          DELETE,
          SET, or
          DO.
        
          Subquery optimization for IN is not as
          effective as for the = operator or for the
          IN(
          operator.
        value_list)
          A typical case for poor IN subquery
          performance is when the subquery returns a small number of
          rows but the outer query returns a large number of rows to be
          compared to the subquery result.
        
          The problem is that, for a statement that uses an
          IN subquery, the optimizer rewrites it as a
          correlated subquery. Consider the following statement that
          uses an uncorrelated subquery:
        
SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);
The optimizer rewrites the statement to a correlated subquery:
SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);
          If the inner and outer queries return
          M and N
          rows, respectively, the execution time becomes on the order of
          O(,
          rather than
          M×N)O(
          as it would be for an uncorrelated subquery.
        M+N)
          An implication is that an IN subquery can
          be much slower than a query written using an
          IN(
          operator that lists the same values that the subquery would
          return.
        value_list)
In general, you cannot modify a table and select from the same table in a subquery. For example, this limitation applies to statements of the following forms:
DELETE FROM t WHERE ... (SELECT ... FROM t ...);
UPDATE t ... WHERE col = (SELECT ... FROM t ...);
{INSERT|REPLACE} INTO t (SELECT ... FROM t ...);
          Exception: The preceding prohibition does not apply if you are
          using a subquery for the modified table in the
          FROM clause. Example:
        
UPDATE t ... WHERE col = (SELECT * FROM (SELECT ... FROM t...) AS _t ...);
          Here the prohibition does not apply because the result from a
          subquery in the FROM clause is stored as a
          temporary table, so the relevant rows in t
          have already been selected by the time the update to
          t takes place.
        
Row comparison operations are only partially supported:
              For expr IN
              (subquery)expr can be an
              n-tuple (specified via row
              constructor syntax) and the subquery can return rows of
              n-tuples.
            
              For expr
              op {ALL|ANY|SOME}
              (subquery)expr must be a scalar value and
              the subquery must be a column subquery; it cannot return
              multiple-column rows.
            
          In other words, for a subquery that returns rows of
          n-tuples, this is supported:
        
(val_1, ...,val_n) IN (subquery)
But this is not supported:
(val_1, ...,val_n)op{ALL|ANY|SOME} (subquery)
          The reason for supporting row comparisons for
          IN but not for the others is that
          IN is implemented by rewriting it as a
          sequence of =
          comparisons and AND operations.
          This approach cannot be used for ALL,
          ANY, or SOME.
        
Row constructors are not well optimized. The following two expressions are equivalent, but only the second can be optimized:
(col1, col2, ...) = (val1, val2, ...) col1 = val1 AND col2 = val2 AND ...
          Subqueries in the FROM clause cannot be
          correlated subqueries. They are materialized (executed to
          produce a result set) before evaluating the outer query, so
          they cannot be evaluated per row of the outer query.
        
The optimizer is more mature for joins than for subqueries, so in many cases a statement that uses a subquery can be executed more efficiently if you rewrite it as a join.
          An exception occurs for the case where an
          IN subquery can be rewritten as a
          SELECT
          DISTINCT join. Example:
        
SELECT col FROM t1 WHERE id_col IN (SELECT id_col2 FROM t2 WHERE condition);
That statement can be rewritten as follows:
SELECT DISTINCT col FROM t1, t2 WHERE t1.id_col = t2.id_col AND condition;
          But in this case, the join requires an extra
          DISTINCT operation and is not more
          efficient than the subquery.
        
Possible future optimization: MySQL does not rewrite the join order for subquery evaluation. In some cases, a subquery could be executed more efficiently if MySQL rewrote it as a join. This would give the optimizer a chance to choose between more execution plans. For example, it could decide whether to read one table or the other first.
Example:
SELECT a FROM outer_table AS ot WHERE a IN (SELECT a FROM inner_table AS it WHERE ot.b = it.b);
          For that query, MySQL always scans
          outer_table first and then executes the
          subquery on inner_table for each row. If
          outer_table has a lot of rows and
          inner_table has few rows, the query
          probably will not be as fast as it could be.
        
The preceding query could be rewritten like this:
SELECT a FROM outer_table AS ot, inner_table AS it WHERE ot.a = it.a AND ot.b = it.b;
          In this case, we can scan the small table
          (inner_table) and look up rows in
          outer_table, which will be fast if there is
          an index on (ot.a,ot.b).
        
Possible future optimization: A correlated subquery is evaluated for each row of the outer query. A better approach is that if the outer row values do not change from the previous row, do not evaluate the subquery again. Instead, use its previous result.
          Possible future optimization: A subquery in the
          FROM clause is evaluated by materializing
          the result into a temporary table, and this table does not use
          indexes. This does not allow the use of indexes in comparison
          with other tables in the query, although that might be useful.
        
          Possible future optimization: If a subquery in the
          FROM clause resembles a view to which the
          merge algorithm can be applied, rewrite the query and apply
          the merge algorithm so that indexes can be used. The following
          statement contains such a subquery:
        
SELECT * FROM (SELECT * FROM t1 WHERE t1.t1_col)
         AS _t1, t2 WHERE t2.t2_col;
The statement can be rewritten as a join like this:
SELECT * FROM t1, t2 WHERE t1.t1_col AND t2.t2_col;
This type of rewriting would provide two benefits:
              It avoids the use of a temporary table for which no
              indexes can be used. In the rewritten query, the optimizer
              can use indexes on t1.
            
              It gives the optimizer more freedom to choose between
              different execution plans. For example, rewriting the
              query as a join allows the optimizer to use
              t1 or t2 first.
            
          Possible future optimization: For IN,
          = ANY, <> ANY,
          = ALL, and <> ALL
          with uncorrelated subqueries, use an in-memory hash for a
          result or a temporary table with an index for larger results.
          Example:
        
SELECT a FROM big_table AS bt WHERE non_key_field IN (SELECT non_key_field FROMtableWHEREcondition)
In this case, we could create a temporary table:
CREATE TABLE t (key (non_key_field)) (SELECT non_key_field FROMtableWHEREcondition)
          Then, for each row in big_table, do a key
          lookup in t based on
          bt.non_key_field.
        


User Comments
It seems like this appendix would serve well with a cross-reference to the "This version of MySQL does not yet support
'LIMIT & IN/ALL/ANY/SOME subquery'" self-explanatory error message, which appears to be missing from this list. Here you go:
http://dev.mysql.com/doc/refman/5.0/en/subquery-errors.html
Add your own comment.