| [Relational Theory] | [Carol's HomePage] |
![]()
![]()
A candidate key uniquely identifies each row in a relational table.
To find a candidate key for a relation, we can use FDs to find a
minimal set of attributes in the relation which
functionally determines (uniquely identifies) all the attributes in the relation.
An attribute always
functionally determines itself - trivial but true.
![]()
Elmasri and Navathe, in Fundamentals of Database Systems,
give an algorithm for
Determining X+, the Closure of X under F
where X is a subset of the attibutes in a given relation,
and F is a set of functional dependencies specified for the given relation.
This algorithm allows us to find candidate keys using FDs.
If the closure of a set of attributes, X, in a relation is equal to the set of all the attributes in the relation, we know that X is a superkey of the relation.
If we cannot find a proper subset of X such that the closure of the proper subset is equal to the set of all the attributes in the relation, we know that X is a candidate key of the relation.
This algorithm is given as Algorithm 13.1 on page 370 of the First Edition; as Algorithm 12.1 on page 405 of the Second Edition; as Algorithm 14.1 on page 481 of the Third Edition; as Algorithm 10.1 on page 310 of the Fourth Edition; and as Algorithm 10.1 on page 353 of the Fifth Edition.
Algorithm <n>.1: Determining X+, the Closure of X under F
X+ := X;
repeat
oldX+ := X+;
for each functional dependency Y → Z in F do
if X+ ⊇ Y then X+ := X+ ∪ Z;
until (X+ = oldX+);
![]()
We are given a relation
R(A,B,C,D,E)
and the functional dependencies (FDs) amongst its attributes
F = {A -> B
AC -> D
B -> E}
and asked to find the candidate key(s) of R.
Working:I could apply the above algorithm to each and every subset of R,
butI know that a *good* place to start looking for candidate keys is amongst the determinants in F.
A determinate is the LHS (Left Hand
Side) of an FD.
If the closure of a determinant is equivalent to R (if it is not a proper subset of R) we know that the determinant is a superkey of R and may therefore be a candidate key of R.
Using Algorithm <n>.1 to see if determinate A is a superkey
X+ := {A}
oldX+:= {A}
for A -> B,
A is a subset of X+
so B is added to X+
for AC -> D,
AC is not a subset of X+
for B -> E,
B is a subset of X+
so E is added to X+
oldX+ != X+ repeat
oldX+:= {A,B,E}
for A -> B,
A is a subset of X+, B is already in X+
for AC -> D,
AC is not a subset of X+
for B -> E,
B is a subset of X+, E is already in X+
X+ = oldX+ stop
|
X+ is {A}
X+ is now {A,B}
X+ is still {A,B}
X+ is now {A,B,E}
X+ is still {A,B,E}
X+ is still {A,B,E}
X+ is still {A,B,E}
|
The closure of {A} over F is {A,B,E} which is a proper subset of R.
Therefore, {A} is not a superkey of R and cannot be a candidate key for R.
Using Algorithm <n>.1 to see if determinate AC is a superkey
X+ := AC
oldX+:= {A,C}
for A -> B,
A is a subset of X+
so B is added to X+
for AC -> D,
AC is a subset of X+
so D is added to X+
for B -> E,
B is a subset of X+
so E is added to X+
oldX+ != X+ repeat
oldX+:= {A,C,B,D,E}
for A -> B,
A is a subset of X+, B is already in X+
for AC -> D,
AC is not a subset of X+
for B -> E,
B is a subset of X+, E is already in X+
X+ = oldX+ stop
|
X+ is {A,C}
X+ is now {A,C,B}
X+ is now {A,C,B,D}
X+ is now {A,C,B,D,E}
X+ is still {A,C,B,D,E}
X+ is still {A,C,B,D,E}
X+ is still {A,C,B,D,E}
|
The closure of {A,C} over F is {A,B,C,D,E} which is not a proper subset of R.
{A,C} is a superkey for R and may be a candidate key for R.
Using Algorithm <n>.1 to see if determinate B is a superkey
X+ := B
oldX+:= {B}
for A -> B,
A is a not a subset of X+
for AC -> D,
AC is not a subset of X+
for B -> E,
B is a subset of X+
so E is added to X+
oldX+ != X+ repeat
oldX+:= {B,E}
for A -> B,
A is a subset of X+
for AC -> D,
AC is not a subset of X+
for B -> E,
B is a subset of X+,
E is already in X+
X+ = oldX+ stop
|
X+ is {B}
X+ is still {B}
X+ is still {B}
X+ is now {B,E}
X+ is still {B,E}
X+ is still {B,E}
X+ is still {B,E}
|
The closure of {B} over F is {B,E} which is a proper subset of R.
Therefore, {B} is not a superkey of R and cannot be a candidate key for R.
Using Algorithm <n>.1 , I have found one superkey {A,C} for R.
Connolly & Begg define a candidate key as
A superkey such that no proper subset is a superkey within the relation.
So, I need to check the proper subsets of the superkey I have found to see if any of these subsets are superkeys.
The proper subsets of {A,C} are
A candidate key, which may be chosen to be a primary key, cannot be null.
I have already discovered above that {A} is not a superkey for R.
So I need to consider only {C} as a possible superkey.
Using Algorithm <n>.1 to see if C is a superkey
X+ := C
oldX+:= {C}
for A -> B,
A is a not a subset of X+
for AC -> D,
AC is not a subset of X+
for B -> E,
B is not a subset of X+
oldX+ = X+ stop
|
X+ is {C}
X+ is still {C}
X+ is still {C}
X+ is still {C}
|
The closure of {C} over F is {C} which is a proper subset of R.
Therefore, {C} is not a superkey of R and cannot be
a candidate key for R.
I have discovered that no proper subsets of {A,C} are superkeys of R, so {A,C} is a candidate key for R.
Are there any other candidate keys for R?
I could use Algorithm 12.1/14.1 for
D and E, but I am exhausted.
I know that any attribute which is in none of the
determinants in F cannot be a candidate key.
If you would like to see an exhaustive application of Algorithm <n>.1 , you can find all the subsets of R and apply the algorithm to each subset.
I have found only one candidate key {A,C} which I must select as the primary key for R.
Therefore, the relational schema for R is
R (A, C, [pk] B, D, E)
![]()
|
URL: http://computer/people/Carol.Edmondson/theory/FindingCandidateKeys.shtml
Last modified: 18 September 2006 18:07:21 EST |