4.1 连接表达式 4.1.1 连接条件

4.1 连接表达式

在3.3.3节我们介绍了自然连接运算。SQL提供了连接运算的其他形式,包括能够指定显式的连接谓词(join predicate),能够在结果中包含被自然连接排除在外的元组。本节我们将讨论这些连接的形式。
本节的例子涉及studenttakes两个关系,如下所示:

student关系内容

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
mysql> select * from student;
+-------+----------+------------+----------+
| ID | name | dept_name | tot_cred |
+-------+----------+------------+----------+
| 00128 | Zhang | Comp. Sci. | 102 |
| 12345 | Shankar | Comp. Sci. | 32 |
| 19991 | Brandt | History | 80 |
| 23121 | Chavez | Finance | 110 |
| 44553 | Peltier | Physics | 56 |
| 45678 | Levy | Physics | 46 |
| 54321 | Williams | Comp. Sci. | 54 |
| 55739 | Sanchez | Music | 38 |
| 70557 | Snow | Physics | 0 |
| 76543 | Brown | Comp. Sci. | 58 |
| 76653 | Aoi | Elec. Eng. | 60 |
| 98765 | Bourikas | Elec. Eng. | 98 |
| 98988 | Tanaka | Biology | 120 |
+-------+----------+------------+----------+
13 rows in set (0.03 sec)

takes关系内容

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
mysql> select * from takes;
+-------+-----------+--------+----------+------+-------+
| ID | course_id | sec_id | semester | year | grade |
+-------+-----------+--------+----------+------+-------+
| 00128 | CS-101 | 1 | Fall | 2009 | A |
| 00128 | CS-347 | 1 | Fall | 2009 | A- |
| 12345 | CS-101 | 1 | Fall | 2009 | C |
| 12345 | CS-190 | 2 | Spring | 2009 | A |
| 12345 | CS-315 | 1 | Spring | 2010 | A |
| 12345 | CS-347 | 1 | Fall | 2009 | A |
| 19991 | HIS-351 | 1 | Spring | 2010 | B |
| 23121 | FIN-201 | 1 | Spring | 2010 | C+ |
| 44553 | PHY-101 | 1 | Fall | 2009 | B- |
| 45678 | CS-101 | 1 | Fall | 2009 | F |
| 45678 | CS-101 | 1 | Spring | 2010 | B+ |
| 45678 | CS-319 | 1 | Spring | 2010 | B |
| 54321 | CS-101 | 1 | Fall | 2009 | A- |
| 54321 | CS-190 | 2 | Spring | 2009 | B+ |
| 55739 | MU-199 | 1 | Spring | 2010 | A- |
| 76543 | CS-101 | 1 | Fall | 2009 | A |
| 76543 | CS-319 | 2 | Spring | 2010 | A |
| 76653 | EE-181 | 1 | Spring | 2009 | C |
| 98765 | CS-101 | 1 | Fall | 2009 | C- |
| 98765 | CS-315 | 1 | Spring | 2010 | B |
| 98988 | BIO-101 | 1 | Summer | 2009 | A |
| 98988 | BIO-301 | 1 | Summer | 2010 | NULL |
+-------+-----------+--------+----------+------+-------+
22 rows in set (0.06 sec)

注意到对于ID为98988的学生,他在2010夏季选修的BIO-301课程的1号课程段的grade属性为空值。该空值表示这门课程的成绩还没有得到。

4.1.1 连接条件

在3.3.3节我们介绍了如何表达自然连接,并且介绍了join…using子句,它是一种自然连接的形式,只需要在指定属性上的取值匹配
SQL支持另外一种形式的连接,其中可以指定任意的连接条件

on条件

on条件允许在参与连接的关系上设置通用的谓词。该谓词的写法与where子句谓词类似,只不过使用的是关键词on而不是where。与using条件样,on条件出现在连接表达式的末尾
考虑下面的查询,它具有包含on条件的连接表达式:

1
2
3
select *
from student
join takes on student.ID=takes.ID;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
mysql> select *
from student
join takes on student.ID=takes.ID;
+-------+----------+------------+----------+-------+-----------+--------+----------+------+-------+
| ID | name | dept_name | tot_cred | ID | course_id | sec_id | semester | year | grade |
+-------+----------+------------+----------+-------+-----------+--------+----------+------+-------+
| 00128 | Zhang | Comp. Sci. | 102 | 00128 | CS-101 | 1 | Fall | 2009 | A |
| 00128 | Zhang | Comp. Sci. | 102 | 00128 | CS-347 | 1 | Fall | 2009 | A- |
| 12345 | Shankar | Comp. Sci. | 32 | 12345 | CS-101 | 1 | Fall | 2009 | C |
| 12345 | Shankar | Comp. Sci. | 32 | 12345 | CS-190 | 2 | Spring | 2009 | A |
| 12345 | Shankar | Comp. Sci. | 32 | 12345 | CS-315 | 1 | Spring | 2010 | A |
| 12345 | Shankar | Comp. Sci. | 32 | 12345 | CS-347 | 1 | Fall | 2009 | A |
| 19991 | Brandt | History | 80 | 19991 | HIS-351 | 1 | Spring | 2010 | B |
| 23121 | Chavez | Finance | 110 | 23121 | FIN-201 | 1 | Spring | 2010 | C+ |
| 44553 | Peltier | Physics | 56 | 44553 | PHY-101 | 1 | Fall | 2009 | B- |
| 45678 | Levy | Physics | 46 | 45678 | CS-101 | 1 | Fall | 2009 | F |
| 45678 | Levy | Physics | 46 | 45678 | CS-101 | 1 | Spring | 2010 | B+ |
| 45678 | Levy | Physics | 46 | 45678 | CS-319 | 1 | Spring | 2010 | B |
| 54321 | Williams | Comp. Sci. | 54 | 54321 | CS-101 | 1 | Fall | 2009 | A- |
| 54321 | Williams | Comp. Sci. | 54 | 54321 | CS-190 | 2 | Spring | 2009 | B+ |
| 55739 | Sanchez | Music | 38 | 55739 | MU-199 | 1 | Spring | 2010 | A- |
| 76543 | Brown | Comp. Sci. | 58 | 76543 | CS-101 | 1 | Fall | 2009 | A |
| 76543 | Brown | Comp. Sci. | 58 | 76543 | CS-319 | 2 | Spring | 2010 | A |
| 76653 | Aoi | Elec. Eng. | 60 | 76653 | EE-181 | 1 | Spring | 2009 | C |
| 98765 | Bourikas | Elec. Eng. | 98 | 98765 | CS-101 | 1 | Fall | 2009 | C- |
| 98765 | Bourikas | Elec. Eng. | 98 | 98765 | CS-315 | 1 | Spring | 2010 | B |
| 98988 | Tanaka | Biology | 120 | 98988 | BIO-101 | 1 | Summer | 2009 | A |
| 98988 | Tanaka | Biology | 120 | 98988 | BIO-301 | 1 | Summer | 2010 | NULL |
+-------+----------+------------+----------+-------+-----------+--------+----------+------+-------+
22 rows in set (0.08 sec)

上述on条件表明:如果一个来自student的元组和一个来自takes的元组在ID上的取值相同,那么它们是匹配的。

on条件和自然连接的区别

在上例中的连接表达式与连接表达式student natural join takes几乎是一样的,因为自然连接运算也需要studen元组和takes元组是匹配的。
这两者之间的一个区别在于:在上述连接查询结果中,ID属性出现两次,一次是student中的,另一次是takes中的,即便它们的ID属性值是相同的。

1
2
select *
from student natural join takes;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
mysql> select *
from student natural join takes;
+-------+----------+------------+----------+-----------+--------+----------+------+-------+
| ID | name | dept_name | tot_cred | course_id | sec_id | semester | year | grade |
+-------+----------+------------+----------+-----------+--------+----------+------+-------+
| 00128 | Zhang | Comp. Sci. | 102 | CS-101 | 1 | Fall | 2009 | A |
| 00128 | Zhang | Comp. Sci. | 102 | CS-347 | 1 | Fall | 2009 | A- |
| 12345 | Shankar | Comp. Sci. | 32 | CS-101 | 1 | Fall | 2009 | C |
| 12345 | Shankar | Comp. Sci. | 32 | CS-190 | 2 | Spring | 2009 | A |
| 12345 | Shankar | Comp. Sci. | 32 | CS-315 | 1 | Spring | 2010 | A |
| 12345 | Shankar | Comp. Sci. | 32 | CS-347 | 1 | Fall | 2009 | A |
| 19991 | Brandt | History | 80 | HIS-351 | 1 | Spring | 2010 | B |
| 23121 | Chavez | Finance | 110 | FIN-201 | 1 | Spring | 2010 | C+ |
| 44553 | Peltier | Physics | 56 | PHY-101 | 1 | Fall | 2009 | B- |
| 45678 | Levy | Physics | 46 | CS-101 | 1 | Fall | 2009 | F |
| 45678 | Levy | Physics | 46 | CS-101 | 1 | Spring | 2010 | B+ |
| 45678 | Levy | Physics | 46 | CS-319 | 1 | Spring | 2010 | B |
| 54321 | Williams | Comp. Sci. | 54 | CS-101 | 1 | Fall | 2009 | A- |
| 54321 | Williams | Comp. Sci. | 54 | CS-190 | 2 | Spring | 2009 | B+ |
| 55739 | Sanchez | Music | 38 | MU-199 | 1 | Spring | 2010 | A- |
| 76543 | Brown | Comp. Sci. | 58 | CS-101 | 1 | Fall | 2009 | A |
| 76543 | Brown | Comp. Sci. | 58 | CS-319 | 2 | Spring | 2010 | A |
| 76653 | Aoi | Elec. Eng. | 60 | EE-181 | 1 | Spring | 2009 | C |
| 98765 | Bourikas | Elec. Eng. | 98 | CS-101 | 1 | Fall | 2009 | C- |
| 98765 | Bourikas | Elec. Eng. | 98 | CS-315 | 1 | Spring | 2010 | B |
| 98988 | Tanaka | Biology | 120 | BIO-101 | 1 | Summer | 2009 | A |
| 98988 | Tanaka | Biology | 120 | BIO-301 | 1 | Summer | 2010 | NULL |
+-------+----------+------------+----------+-----------+--------+----------+------+-------+
22 rows in set (0.07 sec)

on条件改成where

实际上,上述查询与以下查询产生的结果是完全相同的。

1
2
3
select *
from student,takes
where student.ID=takes.ID;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
mysql> select *
from student,takes
where student.ID=takes.ID;
+-------+----------+------------+----------+-------+-----------+--------+----------+------+-------+
| ID | name | dept_name | tot_cred | ID | course_id | sec_id | semester | year | grade |
+-------+----------+------------+----------+-------+-----------+--------+----------+------+-------+
| 00128 | Zhang | Comp. Sci. | 102 | 00128 | CS-101 | 1 | Fall | 2009 | A |
| 00128 | Zhang | Comp. Sci. | 102 | 00128 | CS-347 | 1 | Fall | 2009 | A- |
| 12345 | Shankar | Comp. Sci. | 32 | 12345 | CS-101 | 1 | Fall | 2009 | C |
| 12345 | Shankar | Comp. Sci. | 32 | 12345 | CS-190 | 2 | Spring | 2009 | A |
| 12345 | Shankar | Comp. Sci. | 32 | 12345 | CS-315 | 1 | Spring | 2010 | A |
| 12345 | Shankar | Comp. Sci. | 32 | 12345 | CS-347 | 1 | Fall | 2009 | A |
| 19991 | Brandt | History | 80 | 19991 | HIS-351 | 1 | Spring | 2010 | B |
| 23121 | Chavez | Finance | 110 | 23121 | FIN-201 | 1 | Spring | 2010 | C+ |
| 44553 | Peltier | Physics | 56 | 44553 | PHY-101 | 1 | Fall | 2009 | B- |
| 45678 | Levy | Physics | 46 | 45678 | CS-101 | 1 | Fall | 2009 | F |
| 45678 | Levy | Physics | 46 | 45678 | CS-101 | 1 | Spring | 2010 | B+ |
| 45678 | Levy | Physics | 46 | 45678 | CS-319 | 1 | Spring | 2010 | B |
| 54321 | Williams | Comp. Sci. | 54 | 54321 | CS-101 | 1 | Fall | 2009 | A- |
| 54321 | Williams | Comp. Sci. | 54 | 54321 | CS-190 | 2 | Spring | 2009 | B+ |
| 55739 | Sanchez | Music | 38 | 55739 | MU-199 | 1 | Spring | 2010 | A- |
| 76543 | Brown | Comp. Sci. | 58 | 76543 | CS-101 | 1 | Fall | 2009 | A |
| 76543 | Brown | Comp. Sci. | 58 | 76543 | CS-319 | 2 | Spring | 2010 | A |
| 76653 | Aoi | Elec. Eng. | 60 | 76653 | EE-181 | 1 | Spring | 2009 | C |
| 98765 | Bourikas | Elec. Eng. | 98 | 98765 | CS-101 | 1 | Fall | 2009 | C- |
| 98765 | Bourikas | Elec. Eng. | 98 | 98765 | CS-315 | 1 | Spring | 2010 | B |
| 98988 | Tanaka | Biology | 120 | 98988 | BIO-101 | 1 | Summer | 2009 | A |
| 98988 | Tanaka | Biology | 120 | 98988 | BIO-301 | 1 | Summer | 2010 | NULL |
+-------+----------+------------+----------+-------+-----------+--------+----------+------+-------+
22 rows in set (0.07 sec)

on条件只显示一次重复的属性 不使用select *

正如我们此前所见,关系名用来区分属性名D,这样ID的两次出现被分别表示为student.IDtakes.ID
只显示一次ID值的查询版本如下:

1
2
3
select student.ID as ID, name, dept_name, tot_cred,course_id, sec_id, semester, year, grade
from student
join takes on student.ID= takes.ID;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
mysql> select student.ID as ID, name, dept_name, tot_cred,course_id, sec_id, semester, year, grade
from student
join takes on student.ID= takes.ID;
+-------+----------+------------+----------+-----------+--------+----------+------+-------+
| ID | name | dept_name | tot_cred | course_id | sec_id | semester | year | grade |
+-------+----------+------------+----------+-----------+--------+----------+------+-------+
| 00128 | Zhang | Comp. Sci. | 102 | CS-101 | 1 | Fall | 2009 | A |
| 00128 | Zhang | Comp. Sci. | 102 | CS-347 | 1 | Fall | 2009 | A- |
| 12345 | Shankar | Comp. Sci. | 32 | CS-101 | 1 | Fall | 2009 | C |
| 12345 | Shankar | Comp. Sci. | 32 | CS-190 | 2 | Spring | 2009 | A |
| 12345 | Shankar | Comp. Sci. | 32 | CS-315 | 1 | Spring | 2010 | A |
| 12345 | Shankar | Comp. Sci. | 32 | CS-347 | 1 | Fall | 2009 | A |
| 19991 | Brandt | History | 80 | HIS-351 | 1 | Spring | 2010 | B |
| 23121 | Chavez | Finance | 110 | FIN-201 | 1 | Spring | 2010 | C+ |
| 44553 | Peltier | Physics | 56 | PHY-101 | 1 | Fall | 2009 | B- |
| 45678 | Levy | Physics | 46 | CS-101 | 1 | Fall | 2009 | F |
| 45678 | Levy | Physics | 46 | CS-101 | 1 | Spring | 2010 | B+ |
| 45678 | Levy | Physics | 46 | CS-319 | 1 | Spring | 2010 | B |
| 54321 | Williams | Comp. Sci. | 54 | CS-101 | 1 | Fall | 2009 | A- |
| 54321 | Williams | Comp. Sci. | 54 | CS-190 | 2 | Spring | 2009 | B+ |
| 55739 | Sanchez | Music | 38 | MU-199 | 1 | Spring | 2010 | A- |
| 76543 | Brown | Comp. Sci. | 58 | CS-101 | 1 | Fall | 2009 | A |
| 76543 | Brown | Comp. Sci. | 58 | CS-319 | 2 | Spring | 2010 | A |
| 76653 | Aoi | Elec. Eng. | 60 | EE-181 | 1 | Spring | 2009 | C |
| 98765 | Bourikas | Elec. Eng. | 98 | CS-101 | 1 | Fall | 2009 | C- |
| 98765 | Bourikas | Elec. Eng. | 98 | CS-315 | 1 | Spring | 2010 | B |
| 98988 | Tanaka | Biology | 120 | BIO-101 | 1 | Summer | 2009 | A |
| 98988 | Tanaka | Biology | 120 | BIO-301 | 1 | Summer | 2010 | NULL |
+-------+----------+------------+----------+-----------+--------+----------+------+-------+
22 rows in set (0.07 sec)

on条件比自然连接功能更强

on条件可以表示任何SQL谓词,从而使用on条件的连接表达式就可以表示比自然连接更为丰富的连接条件

on条件可以使用where替代

然而,正如上例所示,使用带on条件的连接表达式的查询可以用不带on条件的等价表达式来替换,只要把on子句中的谓词移到where子句中即可。这样看来,on条件似乎是一个冗余的SQL特征

on条件的优点

但是,引入on条件有两个优点。

  • 首先,对于我们马上要介绍的,被称作外连接的这类连接来说,on条件的表现与where条件是不同的。
  • 其次,如果在on子句中指定连接条件,并在where子句中出现其余的条件,这样的SQL查询通常更容易让人读懂。