Sometimes a parent table have more than one table with multiple references to a field in the parent table. Then the question is how to count references from more than one table in a single query. I have found 5 different ways to achieve this result, and therefore I have run a few tests, to see which was better.
To test this I made this small PHP script, that creates a parent table and child table. Then a loop populates the tables with 1,000 parent rows, approximately 15 children of type 1 and approximately 75 children of type 2. This is done before each query, to make sure that all queries have the same conditions.
-
Dependent subquery
Execution time: 0.06 sec -
Cross join
Execution time: 1.01 sec -
Derived table / Inline view
Execution time: 0.15 sec -
Merge view
Creating views: 0.005 sec
Execution time: 0.15 sec -
Temptable view
Creating views: 0.005 sec
Execution time: 0.15 sec
So the conclusion is that using dependent subqueries is actually extremely fast. In this case precisely 3 times as fast as using views and 15 times faster than joining the tables and count distinct children. Notice that it takes almost no time to create the views, so it does not do any difference if the view is created inline as a derived table or as an actual view. One should expect selecting from a view to be faster, but this might be slower because the view is not indexed, and therefore joining is slower. In Oracle it is possible to create indexes on a view, even on an inline view, and therefore this method might be preferred.
By the way, a sixth way exist by first selecting from parent, and then for each parent make a new query to count each child table. Never do this, making so many queries takes much more time than using subqueries.
Here is the PHP script I used to make the tests, you can copy it to reproduce the tests. Just remember to adjust the MySQL connection string and the database name. Also make sure that you do not have a table names parents, children, children_view_1 or children_view_2, as these will be dropped on execution. Also you might prefer to execute in a command prompt to avoid timeout, event though it does not take many minutes to execute.
- <?
- mysql_connect("localhost", "root", "test");
- mysql_select_db("test");
- function test_query($q, $alg = false) {
- for ($n = 0; $n < 3; $n++) {
- mysql_query("DROP VIEW children_view_1");
- mysql_query("DROP VIEW children_view_2");
- mysql_query("DROP TABLE parents");
- mysql_query("DROP TABLE children");
- mysql_query("CREATE TABLE parents (
- id int(6) not null primary key auto_increment
- )");
- mysql_query("CREATE TABLE children (
- id int(6) not null primary key auto_increment,
- parent_id int(6) not null,
- type tinyint(1) not null,
- index(parent_id, type),
- index(type, parent_id)
- )");
- for ($i = 1; $i <= 1000; $i++) {
- mysql_query("INSERT INTO parents(id) VALUES('".$i."')");
- mysql_query("INSERT INTO children (parent_id, type)
- VALUES('".$i."', '1')".str_repeat(",('".$i."', '1')", rand(10, 20)));
- mysql_query("INSERT INTO children (parent_id, type)
- VALUES('".$i."', '2')".str_repeat(",('".$i."', '2')", rand(50, 100)));
- //if ($i % 100 == 0) echo ($i/10)."%\n";
- }
- $start = microtime(true);
- if ($alg) {
- mysql_query("CREATE ALGORITHM=".$alg." VIEW children_view_1 AS
- SELECT parent_id, COUNT(*) AS n_1 FROM children
- WHERE type = 1 GROUP BY parent_id");
- mysql_query("CREATE ALGORITHM=".$alg." VIEW children_view_2 AS
- SELECT parent_id, COUNT(*) AS n_2 FROM children
- WHERE type = 2 GROUP BY parent_id");
- echo "View: ".(microtime(true)-$start)." sec\n";
- }
- mysql_query($q);
- echo mysql_error();
- echo "Query: ".(microtime(true)-$start)." sec\n";
- }
- }
- echo "SUBQUERY STYLE\n";
- test_query("SELECT a.id,
- (SELECT COUNT(b.id) FROM children b WHERE b.parent_id = a.id AND b.type = 1) AS n_1,
- (SELECT COUNT(b.id) FROM children b WHERE b.parent_id = a.id AND b.type = 2) AS n_2
- FROM parents a;");
- echo "CROSS JOIN STYLE\n";
- test_query("SELECT a.id,
- COUNT(DISTINCT b.id) AS n_1,
- COUNT(DISTINCT c.id) AS n_2
- FROM parents a
- LEFT JOIN children b ON(b.parent_id = a.id AND b.type = 1)
- LEFT JOIN children c ON(c.parent_id = a.id AND c.type = 2)
- GROUP BY a.id;");
- echo "DERIVED TABLE/INLINE VIEW STYLE\n";
- test_query("SELECT a.id, b.n_1, c.n_2 FROM parents a
- LEFT JOIN (SELECT parent_id, COUNT(*) AS n_1 FROM children
- WHERE type = 1 GROUP BY parent_id) b ON (b.parent_id = a.id)
- LEFT JOIN (SELECT parent_id, COUNT(*) AS n_2 FROM children
- WHERE type = 2 GROUP BY parent_id) c ON (c.parent_id = a.id);");
- echo "MERGE VIEW STYLE\n";
- test_query("SELECT a.id, b.n_1, c.n_2
- FROM parents a
- LEFT OUTER JOIN children_view_1 b ON (b.parent_id = a.id)
- LEFT OUTER JOIN children_view_2 c ON (c.parent_id = a.id);", "MERGE");
- echo "TEMPTABLE VIEW STYLE\n";
- test_query("SELECT a.id, b.n_1, c.n_2
- FROM parents a
- LEFT OUTER JOIN children_view_1 b ON (b.parent_id = a.id)
- LEFT OUTER JOIN children_view_2 c ON (c.parent_id = a.id);", "TEMPTABLE");
- ?>
Genial dispatch and this post helped me alot in my college assignement. Gratefulness you for your information.
it was very interesting to read http://www.5p.dk
I want to quote your post in my blog. It can?
And you et an account on Twitter?