Composite Key in SQL: Your Ultimate Guide to Mastery
What Is a Composite Key in SQL?
A composite key in SQL is a combination of two or more columns that together uniquely identify each row in a table. No single column in the key has to be unique on its own; only the combination must be. In other words, a composite key is a primary key built from multiple columns. The columns can have different data types. A composite key can also be built from foreign key columns.
Syntax to declare a composite key in SQL:
-- Syntax to create a composite key
-- by combining 3 columns: COL1, COL2, COL3
CONSTRAINT COMPOSITE_KEY_NAME PRIMARY KEY (COL1, COL2, COL3)
- COMPOSITE_KEY_NAME: The name of the composite key constraint created by combining two or more columns.
- Columns: The maximum number of columns that can be combined depends on the database system (PostgreSQL, for example, allows up to 32 columns in a key by default). The columns combined to make a composite key can have different data types.
When Does the Composite Key Come Into the Picture?
As you saw, composite keys uniquely identify the rows of a table. They are useful when you need a key that uniquely identifies records, for example to make searches unambiguous, but no single column is unique. In such cases, you combine multiple columns to create a unique key.
Consider an example. Suppose you manage the employee data of a company and want to look up an employee named Rahul by name. When searching by name, there is a high chance that more than one employee shares it, and here the search does return several employees named Rahul. Because employee numbers are always unique, you can resolve the ambiguity by treating the name column together with the employee number column as a single key. This is one use case for composite keys.
Syntax to create a composite key for a table in SQL:
-- Syntax to create a composite key for a table
-- by combining some columns
CREATE TABLE table_name (
    COL1 data_type_1 NOT NULL,
    COL2 data_type_2 NOT NULL,
    COL3 data_type_3 NOT NULL,
    COL4 data_type_4 NOT NULL,
    -- Declare the composite key:
    -- here COL1, COL3, and COL4
    -- form the composite key
    CONSTRAINT COMP_NAME PRIMARY KEY (COL1, COL3, COL4)
);
Consider an example that illustrates composite keys in SQL using a STUDENT table.
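As a minimal sketch, assume a hypothetical STUDENT table in which neither the roll number nor the course code is unique on its own, but each student enrolls in a given course only once. The column names and the constraint name below are illustrative assumptions.

```sql
-- Hypothetical STUDENT table: (roll_no, course_code) together identify a row
CREATE TABLE student (
    roll_no      INT         NOT NULL,
    course_code  VARCHAR(10) NOT NULL,
    student_name VARCHAR(50) NOT NULL,
    CONSTRAINT pk_student PRIMARY KEY (roll_no, course_code)
);

-- The same roll_no may repeat as long as the course differs
INSERT INTO student VALUES (1, 'CS101', 'Rahul');
INSERT INTO student VALUES (1, 'MA201', 'Rahul');

-- Rejected: this (roll_no, course_code) pair already exists
INSERT INTO student VALUES (1, 'CS101', 'Rahul');
```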
PostgreSQL Foreign Key Constraint
In PostgreSQL, a foreign key is a column or group of columns in a table that references the primary key or a unique key in the same or another table.
A foreign key establishes referential integrity between the parent and child tables. The table that has the foreign key is called the child table, and the table whose primary key or unique key is referenced by the foreign key is called the parent table.
For example, an employee table can have a foreign key column dept_id that links to the primary key column dept_id in the department table. This forms a one-to-many relationship between the department and employee tables: one department can have multiple employees. In other words, multiple records in the employee table can contain the same dept_id, which points to one dept_id value in the department table.
Define Foreign Key while Creating a Table
You can define a foreign key when you create a table, using the CREATE TABLE statement.
In a foreign key definition:
- Use the CONSTRAINT keyword followed by the name of the foreign key constraint. The constraint name is optional; if you do not specify it, PostgreSQL generates a name according to its default naming convention.
- After the FOREIGN KEY keyword, specify one or more columns of the table on which you want to define the foreign key constraint.
- Use the REFERENCES keyword to specify the parent table and the parent table columns that the foreign key references.
- The ON DELETE and ON UPDATE clauses are optional. They determine what happens when a referenced row in the parent table is deleted or its key is updated.
PostgreSQL supports the following referential actions:
- NO ACTION
- RESTRICT
- SET NULL
- SET DEFAULT
- CASCADE
The following example demonstrates creating a foreign key in the employee table that points to the department table.
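A minimal sketch of such a definition follows. The table names, the dept_id column, and the constraint name FK_employee_department come from the text; the remaining columns are illustrative assumptions.

```sql
DROP TABLE IF EXISTS employee, department;

-- Parent table
CREATE TABLE department (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(50) NOT NULL
);

-- Child table: dept_id references department(dept_id)
CREATE TABLE employee (
    emp_id     INT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name  VARCHAR(50) NOT NULL,
    dept_id    INT,
    CONSTRAINT FK_employee_department
        FOREIGN KEY (dept_id) REFERENCES department (dept_id)
);
```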
In the above example, the dept_id column in the employee table is defined as a foreign key column that references the primary key column dept_id of the department table. CONSTRAINT FK_employee_department specifies the foreign key name FK_employee_department, FOREIGN KEY(dept_id) specifies the foreign key column in the employee table, and REFERENCES department(dept_id) specifies that the foreign key column refers to the dept_id column of the department table.
This foreign key establishes a one-to-many relationship between the department and employee tables: a department can have zero or more employees, and an employee cannot belong to more than one department.
Notice that no ON DELETE or ON UPDATE clause is defined, so the default action, NO ACTION, applies.
Note: The foreign key column does not need to have the same name as the primary key column it references, but using the same name is advisable for readability.
NO ACTION – Raise an Error on Delete or Update
NO ACTION is the default referential action when no ON DELETE or ON UPDATE clause is specified. It produces an error indicating that the deletion or update would create a foreign key constraint violation.
The following example demonstrates the NO ACTION referential action.
Now let’s delete the department with dept_id = 1.
We are trying to delete the row in the department table where dept_id = 1, but two employees in the employee table belong to that department. PostgreSQL therefore raises a foreign key constraint violation error and does not allow the department to be deleted.
To delete a row in the department table, you first need to delete all employees who belong to that department and then delete the department.
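The scenario described above can be sketched as follows. The employee names, department names, and extra columns are illustrative assumptions; only the table names, dept_id, and the two employees in department 1 come from the text.

```sql
DROP TABLE IF EXISTS employee, department;

CREATE TABLE department (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(50) NOT NULL
);

CREATE TABLE employee (
    emp_id   INT PRIMARY KEY,
    emp_name VARCHAR(50) NOT NULL,
    dept_id  INT,
    CONSTRAINT FK_employee_department
        FOREIGN KEY (dept_id) REFERENCES department (dept_id)
        ON DELETE NO ACTION  -- same effect as omitting the clause
);

INSERT INTO department VALUES (1, 'HR'), (2, 'IT');
INSERT INTO employee VALUES (1, 'Annie', 1), (2, 'Bob', 1);

-- Fails with a foreign key violation: two employees still reference dept_id = 1
DELETE FROM department WHERE dept_id = 1;

-- Works: remove the referencing rows first, then the department
DELETE FROM employee WHERE dept_id = 1;
DELETE FROM department WHERE dept_id = 1;
```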
RESTRICT
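The RESTRICT action behaves like NO ACTION, with one difference: its check cannot be deferred. When the foreign key constraint is declared DEFERRABLE (with INITIALLY DEFERRED or INITIALLY IMMEDIATE), NO ACTION allows the check to be postponed until later in the transaction, whereas RESTRICT is always checked immediately.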
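A minimal sketch of where the two actions diverge, assuming simplified department and employee tables and a constraint declared DEFERRABLE INITIALLY DEFERRED (the table contents are illustrative):

```sql
DROP TABLE IF EXISTS employee, department;

CREATE TABLE department (dept_id INT PRIMARY KEY);

CREATE TABLE employee (
    emp_id  INT PRIMARY KEY,
    dept_id INT,
    CONSTRAINT FK_employee_department
        FOREIGN KEY (dept_id) REFERENCES department (dept_id)
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED
);

INSERT INTO department VALUES (1);
INSERT INTO employee VALUES (1, 1);

BEGIN;
-- Allowed for now: the NO ACTION check is postponed until COMMIT
DELETE FROM department WHERE dept_id = 1;
-- Remove the referencing row before the check runs
DELETE FROM employee WHERE dept_id = 1;
COMMIT;  -- the deferred check finds no dangling reference
```

With ON DELETE RESTRICT in place of NO ACTION, the first DELETE inside the transaction would fail immediately.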
SET NULL – Set Referencing Column to NULL
When a foreign key is created with ON DELETE SET NULL or ON UPDATE SET NULL, deleting or updating a referenced row in the parent table automatically sets the foreign key column of each referencing row in the child table to NULL.
The following example demonstrates the SET NULL action.
Now, let’s insert data into the above tables.
Now try to delete the department where dept_id = 1.
Because the foreign key constraint was defined with the ON DELETE SET NULL clause, the dept_id of the two referencing rows in the employee table, which was 1, is now set to NULL. Let’s check the data in the employee table.
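The SET NULL walkthrough can be sketched end to end as follows. The names and the third employee are illustrative assumptions added to show that unrelated rows are left alone.

```sql
DROP TABLE IF EXISTS employee, department;

CREATE TABLE department (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(50) NOT NULL
);

CREATE TABLE employee (
    emp_id   INT PRIMARY KEY,
    emp_name VARCHAR(50) NOT NULL,
    dept_id  INT,
    CONSTRAINT FK_employee_department
        FOREIGN KEY (dept_id) REFERENCES department (dept_id)
        ON DELETE SET NULL
);

INSERT INTO department VALUES (1, 'HR'), (2, 'IT');
INSERT INTO employee VALUES (1, 'Annie', 1), (2, 'Bob', 1), (3, 'Carol', 2);

-- Succeeds; referencing rows are kept but lose their department
DELETE FROM department WHERE dept_id = 1;

-- Employees 1 and 2 now have dept_id NULL; employee 3 still has dept_id 2
SELECT emp_id, emp_name, dept_id FROM employee ORDER BY emp_id;
```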
SET DEFAULT
When a foreign key is created with ON DELETE SET DEFAULT or ON UPDATE SET DEFAULT, deleting or updating a referenced row in the parent table automatically sets the foreign key column of each referencing row in the child table to its default value, if one is specified. If the default value is not null, a row matching it must exist in the referenced table, or the operation will fail.
The following example demonstrates the SET DEFAULT action:
Insert data into the above tables:
Now try to delete the department with dept_id = 1.
The deletion of the department is allowed. Because the foreign key constraint was defined with ON DELETE SET DEFAULT, the referencing row with emp_id = 1 in the employee table, whose dept_id was 1, now has dept_id set to the default value, 3. Let’s check the data in the employee table.
Note that the default value 3 is specified for the dept_id column in the employee table. If no default value is specified for dept_id, the above deletion sets the value to NULL.
The dept_id 3 must exist in the department table; otherwise, an error is raised. For example, if you specify 4 as the default value of dept_id in the employee table and no department with dept_id = 4 exists, trying to delete a referenced row in the department table raises an error.
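A minimal sketch of the SET DEFAULT scenario, assuming the default value 3 from the text; the department and employee names are illustrative.

```sql
DROP TABLE IF EXISTS employee, department;

CREATE TABLE department (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(50) NOT NULL
);

CREATE TABLE employee (
    emp_id   INT PRIMARY KEY,
    emp_name VARCHAR(50) NOT NULL,
    dept_id  INT DEFAULT 3,  -- fallback department used by SET DEFAULT
    CONSTRAINT FK_employee_department
        FOREIGN KEY (dept_id) REFERENCES department (dept_id)
        ON DELETE SET DEFAULT
);

-- dept_id 3 must exist, because it is the default the child rows fall back to
INSERT INTO department VALUES (1, 'HR'), (2, 'IT'), (3, 'Unassigned');
INSERT INTO employee VALUES (1, 'Annie', 1);

DELETE FROM department WHERE dept_id = 1;

-- Employee 1 now has dept_id 3
SELECT emp_id, emp_name, dept_id FROM employee;
```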
CASCADE
When a foreign key is created with ON DELETE CASCADE, deleting a referenced row in the parent table automatically deletes the referencing rows in the child table. With ON UPDATE CASCADE, updating a referenced key value in the parent table automatically updates the foreign key column of the referencing rows to match.
The following example demonstrates the CASCADE action:
Now, insert data into tables.
Now try to delete the department where dept_id = 1.
The DELETE statement executes successfully and removes the row from the department table. Because of the ON DELETE CASCADE option, all the referencing rows in the employee table are deleted as well. Let’s check the data in the employee table.
The employee with emp_id = 1 belonged to the ‘HR’ department. When the ‘HR’ department was deleted, that employee was deleted from the employee table too.
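The CASCADE behavior can be sketched as follows, assuming one employee in the ‘HR’ department and, for contrast, a second illustrative employee in another department.

```sql
DROP TABLE IF EXISTS employee, department;

CREATE TABLE department (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(50) NOT NULL
);

CREATE TABLE employee (
    emp_id   INT PRIMARY KEY,
    emp_name VARCHAR(50) NOT NULL,
    dept_id  INT,
    CONSTRAINT FK_employee_department
        FOREIGN KEY (dept_id) REFERENCES department (dept_id)
        ON DELETE CASCADE
);

INSERT INTO department VALUES (1, 'HR'), (2, 'IT');
INSERT INTO employee VALUES (1, 'Annie', 1), (2, 'Bob', 2);

-- Deletes the HR department and, through the cascade, employee 1
DELETE FROM department WHERE dept_id = 1;

-- Only employee 2 remains
SELECT emp_id, emp_name, dept_id FROM employee;
```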
Adding Foreign Key to an Existing Table
A foreign key constraint can be added to one or more columns of an existing table. If the table already contains data, every non-null value in that column or set of columns must match a value in the referenced column(s) of the parent table; otherwise, the constraint cannot be added.
Assume we have department and employee tables without any parent-child relationship defined between them.
Note that one employee does not belong to any department; its dept_id is NULL. Now we will add a foreign key constraint on the dept_id column of the employee table.
This creates a foreign key in the existing employee table.
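A minimal sketch of this step using ALTER TABLE ... ADD CONSTRAINT. The sample rows and names are illustrative assumptions, including one employee with a NULL dept_id as described in the text.

```sql
DROP TABLE IF EXISTS employee, department;

-- Two unrelated tables to start with
CREATE TABLE department (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(50) NOT NULL
);

CREATE TABLE employee (
    emp_id   INT PRIMARY KEY,
    emp_name VARCHAR(50) NOT NULL,
    dept_id  INT
);

INSERT INTO department VALUES (1, 'HR'), (2, 'IT');
INSERT INTO employee VALUES (1, 'Annie', 1), (2, 'Bob', 2), (3, 'Carol', NULL);

-- Succeeds: every non-null dept_id in employee matches a department row
ALTER TABLE employee
    ADD CONSTRAINT FK_employee_department
    FOREIGN KEY (dept_id) REFERENCES department (dept_id);
```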
PostgreSQL Composite Foreign Key
Data types are a way to limit the kind of data that can be stored in a table. For many applications, however, the constraint they provide is too coarse. For example, a column containing a product price should probably only accept positive values. But there is no standard data type that accepts only positive numbers. Another issue is that you might want to constrain column data with respect to other columns or rows. For example, in a table containing product information, there should be only one row for each product number.
To that end, SQL allows you to define constraints on columns and tables. Constraints give you as much control over the data in your tables as you wish. If a user attempts to store data in a column that would violate a constraint, an error is raised. This applies even if the value came from the default value definition.
5.3.1. Check Constraints
A check constraint is the most generic constraint type. It allows you to specify that the value in a certain column must satisfy a Boolean (truth-value) expression. For instance, to require positive product prices, you could use:
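A minimal sketch of such a definition, assuming an illustrative products table with product_no, name, and price columns:

```sql
DROP TABLE IF EXISTS products;

CREATE TABLE products (
    product_no integer,
    name text,
    price numeric CHECK (price > 0)  -- rejects zero and negative prices
);
```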
As you see, the constraint definition comes after the data type, just like default value definitions. Default values and constraints can be listed in any order. A check constraint consists of the key word CHECK followed by an expression in parentheses. The check constraint expression should involve the column thus constrained, otherwise the constraint would not make too much sense.
You can also give the constraint a separate name. This clarifies error messages and allows you to refer to the constraint when you need to change it. The syntax is:
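For instance, the positive-price check could be named as follows; the constraint name positive_price and the table layout are illustrative assumptions.

```sql
DROP TABLE IF EXISTS products;

CREATE TABLE products (
    product_no integer,
    name text,
    price numeric CONSTRAINT positive_price CHECK (price > 0)
);
```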
So, to specify a named constraint, use the key word CONSTRAINT followed by an identifier followed by the constraint definition. (If you don’t specify a constraint name in this way, the system chooses a name for you.)
A check constraint can also refer to several columns. Say you store a regular price and a discounted price, and you want to ensure that the discounted price is lower than the regular price:
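A sketch of such a table, assuming illustrative price and discounted_price columns:

```sql
DROP TABLE IF EXISTS products;

CREATE TABLE products (
    product_no integer,
    name text,
    price numeric CHECK (price > 0),
    discounted_price numeric CHECK (discounted_price > 0),
    CHECK (price > discounted_price)  -- refers to two columns
);
```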
The first two constraints should look familiar. The third one uses a new syntax. It is not attached to a particular column, instead it appears as a separate item in the comma-separated column list. Column definitions and these constraint definitions can be listed in mixed order.
We say that the first two constraints are column constraints, whereas the third one is a table constraint because it is written separately from any one column definition. Column constraints can also be written as table constraints, while the reverse is not necessarily possible, since a column constraint is supposed to refer to only the column it is attached to. (PostgreSQL doesn’t enforce that rule, but you should follow it if you want your table definitions to work with other database systems.) The above example could also be written as:
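One possible rewrite, with every check expressed as a table constraint (same illustrative products table as assumed before):

```sql
DROP TABLE IF EXISTS products;

CREATE TABLE products (
    product_no integer,
    name text,
    price numeric,
    CHECK (price > 0),
    discounted_price numeric,
    CHECK (discounted_price > 0),
    CHECK (price > discounted_price)
);
```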
It’s a matter of taste.
Names can be assigned to table constraints in the same way as column constraints:
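A sketch with a named table constraint; the name valid_discount is an illustrative assumption.

```sql
DROP TABLE IF EXISTS products;

CREATE TABLE products (
    product_no integer,
    name text,
    price numeric,
    CHECK (price > 0),
    discounted_price numeric,
    CHECK (discounted_price > 0),
    CONSTRAINT valid_discount CHECK (price > discounted_price)
);
```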
It should be noted that a check constraint is satisfied if the check expression evaluates to true or the null value. Since most expressions will evaluate to the null value if any operand is null, they will not prevent null values in the constrained columns. To ensure that a column does not contain null values, the not-null constraint described in the next section can be used.
PostgreSQL does not support CHECK constraints that reference table data other than the new or updated row being checked. While a CHECK constraint that violates this rule may appear to work in simple tests, it cannot guarantee that the database will not reach a state in which the constraint condition is false (due to subsequent changes of the other row(s) involved). This would cause a database dump and restore to fail. The restore could fail even when the complete database state is consistent with the constraint, due to rows not being loaded in an order that will satisfy the constraint. If possible, use UNIQUE, EXCLUDE, or FOREIGN KEY constraints to express cross-row and cross-table restrictions.
If what you desire is a one-time check against other rows at row insertion, rather than a continuously-maintained consistency guarantee, a custom trigger can be used to implement that. (This approach avoids the dump/restore problem because pg_dump does not reinstall triggers until after restoring data, so that the check will not be enforced during a dump/restore.)
PostgreSQL assumes that CHECK constraints’ conditions are immutable, that is, they will always give the same result for the same input row. This assumption is what justifies examining CHECK constraints only when rows are inserted or updated, and not at other times. (The warning above about not referencing other table data is really a special case of this restriction.)
An example of a common way to break this assumption is to reference a user-defined function in a CHECK expression, and then change the behavior of that function. PostgreSQL does not disallow that, but it will not notice if there are rows in the table that now violate the CHECK constraint. That would cause a subsequent database dump and restore to fail. The recommended way to handle such a change is to drop the constraint (using ALTER TABLE), adjust the function definition, and re-add the constraint, thereby rechecking it against all table rows.
5.3.2. Not-Null Constraints
A not-null constraint simply specifies that a column must not assume the null value. A syntax example:
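For example, assuming the same illustrative products table:

```sql
DROP TABLE IF EXISTS products;

CREATE TABLE products (
    product_no integer NOT NULL,
    name text NOT NULL,
    price numeric
);
```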
A not-null constraint is always written as a column constraint. A not-null constraint is functionally equivalent to creating a check constraint CHECK (column_name IS NOT NULL), but in PostgreSQL creating an explicit not-null constraint is more efficient. The drawback is that you cannot give explicit names to not-null constraints created this way.
Of course, a column can have more than one constraint. Just write the constraints one after another:
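A sketch combining a not-null constraint and a check constraint on the same column (illustrative products table):

```sql
DROP TABLE IF EXISTS products;

CREATE TABLE products (
    product_no integer NOT NULL,
    name text NOT NULL,
    price numeric NOT NULL CHECK (price > 0)
);
```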
The order doesn’t matter. It does not necessarily determine in which order the constraints are checked.
The NOT NULL constraint has an inverse: the NULL constraint. This does not mean that the column must be null, which would surely be useless. Instead, this simply selects the default behavior that the column might be null. The NULL constraint is not present in the SQL standard and should not be used in portable applications. (It was only added to PostgreSQL to be compatible with some other database systems.) Some users, however, like it because it makes it easy to toggle the constraint in a script file. For example, you could start with:
and then insert the NOT key word where desired.
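Such a starting point might look like this, with each column explicitly marked NULL (illustrative products table):

```sql
DROP TABLE IF EXISTS products;

CREATE TABLE products (
    product_no integer NULL,
    name text NULL,
    price numeric NULL
);
```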
In most database designs the majority of columns should be marked not null.
5.3.3. Unique Constraints
Unique constraints ensure that the data contained in a column, or a group of columns, is unique among all the rows in the table. The syntax is:
when written as a column constraint, and:
when written as a table constraint.
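The two forms can be sketched side by side, assuming an illustrative products table whose product_no must be unique:

```sql
DROP TABLE IF EXISTS products;

-- Column constraint form
CREATE TABLE products (
    product_no integer UNIQUE,
    name text,
    price numeric
);

DROP TABLE products;

-- Table constraint form
CREATE TABLE products (
    product_no integer,
    name text,
    price numeric,
    UNIQUE (product_no)
);
```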
To define a unique constraint for a group of columns, write it as a table constraint with the column names separated by commas:
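A minimal sketch, using a hypothetical table named example with generic columns a, b, and c:

```sql
DROP TABLE IF EXISTS example;

CREATE TABLE example (
    a integer,
    b integer,
    c integer,
    UNIQUE (a, c)  -- the pair (a, c) must be unique; a or c alone may repeat
);
```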
This specifies that the combination of values in the indicated columns is unique across the whole table, though any one of the columns need not be (and ordinarily isn’t) unique.
You can assign your own name for a unique constraint, in the usual way:
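For example, with must_be_different as an illustrative constraint name:

```sql
DROP TABLE IF EXISTS products;

CREATE TABLE products (
    product_no integer CONSTRAINT must_be_different UNIQUE,
    name text,
    price numeric
);
```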
Adding a unique constraint will automatically create a unique B-tree index on the column or group of columns listed in the constraint. A uniqueness restriction covering only some rows cannot be written as a unique constraint, but it is possible to enforce such a restriction by creating a unique partial index.
In general, a unique constraint is violated if there is more than one row in the table where the values of all of the columns included in the constraint are equal. However, two null values are never considered equal in this comparison. That means even in the presence of a unique constraint it is possible to store duplicate rows that contain a null value in at least one of the constrained columns. This behavior conforms to the SQL standard, but we have heard that other SQL databases might not follow this rule. So be careful when developing applications that are intended to be portable.
5.3.4. Primary Keys
A primary key constraint indicates that a column, or group of columns, can be used as a unique identifier for rows in the table. This requires that the values be both unique and not null. So, the following two table definitions accept the same data:
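A sketch of two such definitions, assuming the illustrative products table:

```sql
DROP TABLE IF EXISTS products;

-- Unique and not null, but not declared as the primary key
CREATE TABLE products (
    product_no integer UNIQUE NOT NULL,
    name text,
    price numeric
);

DROP TABLE products;

-- Accepts the same data, with product_no declared as the primary key
CREATE TABLE products (
    product_no integer PRIMARY KEY,
    name text,
    price numeric
);
```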
Primary keys can span more than one column; the syntax is similar to unique constraints:
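A minimal sketch of a composite primary key, using a hypothetical table named example with generic columns:

```sql
DROP TABLE IF EXISTS example;

CREATE TABLE example (
    a integer,
    b integer,
    c integer,
    PRIMARY KEY (a, c)  -- composite key: a and c together identify a row
);
```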
Adding a primary key will automatically create a unique B-tree index on the column or group of columns listed in the primary key, and will force the column(s) to be marked NOT NULL.
A table can have at most one primary key. (There can be any number of unique and not-null constraints, which are functionally almost the same thing, but only one can be identified as the primary key.) Relational database theory dictates that every table must have a primary key. This rule is not enforced by PostgreSQL, but it is usually best to follow it.
Primary keys are useful both for documentation purposes and for client applications. For example, a GUI application that allows modifying row values probably needs to know the primary key of a table to be able to identify rows uniquely. There are also various ways in which the database system makes use of a primary key if one has been declared; for example, the primary key defines the default target column(s) for foreign keys referencing its table.
5.3.5. Foreign Keys
A foreign key constraint specifies that the values in a column (or a group of columns) must match the values appearing in some row of another table. We say this maintains the referential integrity between two related tables.
Say you have the product table that we have used several times already:
Let’s also assume you have a table storing orders of those products. We want to ensure that the orders table only contains orders of products that actually exist. So we define a foreign key constraint in the orders table that references the products table:
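The two tables could be sketched as follows. The table names and product_no come from the text; the order_id and quantity columns are illustrative assumptions.

```sql
DROP TABLE IF EXISTS order_items, orders, products;

CREATE TABLE products (
    product_no integer PRIMARY KEY,
    name text,
    price numeric
);

CREATE TABLE orders (
    order_id integer PRIMARY KEY,
    product_no integer REFERENCES products (product_no),
    quantity integer
);
```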
Now it is impossible to create orders with non-NULL product_no entries that do not appear in the products table.
We say that in this situation the orders table is the referencing table and the products table is the referenced table. Similarly, there are referencing and referenced columns.
You can also shorten the above command to:
because in absence of a column list the primary key of the referenced table is used as the referenced column(s).
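A sketch of the shortened form, under the same assumptions about the orders and products tables:

```sql
DROP TABLE IF EXISTS order_items, orders, products;

CREATE TABLE products (
    product_no integer PRIMARY KEY,
    name text,
    price numeric
);

CREATE TABLE orders (
    order_id integer PRIMARY KEY,
    product_no integer REFERENCES products,  -- implicitly products (product_no)
    quantity integer
);
```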
You can assign your own name for a foreign key constraint, in the usual way.
A foreign key can also constrain and reference a group of columns. As usual, it then needs to be written in table constraint form. Here is a contrived syntax example:
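A minimal sketch of a composite foreign key, using hypothetical tables t1 and other_table with generic column names. The referenced pair (c1, c2) is given a primary key so that it is a valid target.

```sql
DROP TABLE IF EXISTS t1, other_table;

CREATE TABLE other_table (
    c1 integer,
    c2 integer,
    PRIMARY KEY (c1, c2)
);

CREATE TABLE t1 (
    a integer PRIMARY KEY,
    b integer,
    c integer,
    FOREIGN KEY (b, c) REFERENCES other_table (c1, c2)
);
```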
Of course, the number and type of the constrained columns need to match the number and type of the referenced columns.
Sometimes it is useful for the “other table” of a foreign key constraint to be the same table; this is called a self-referential foreign key. For example, if you want rows of a table to represent nodes of a tree structure, you could write
A top-level node would have NULL parent_id, but non-NULL parent_id entries would be constrained to reference valid rows of the table.
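A sketch of such a self-referential table; the table name tree and the node_id and name columns are illustrative assumptions.

```sql
DROP TABLE IF EXISTS tree;

CREATE TABLE tree (
    node_id integer PRIMARY KEY,
    parent_id integer REFERENCES tree,  -- points at another row of this table
    name text
);
```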
A table can have more than one foreign key constraint. This is used to implement many-to-many relationships between tables. Say you have tables about products and orders, but now you want to allow one order to contain possibly many products (which the structure above did not allow). You could use this table structure:
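One way to sketch this structure, with order_items as the linking table; the shipping_address and quantity columns are illustrative assumptions.

```sql
DROP TABLE IF EXISTS order_items, orders, products;

CREATE TABLE products (
    product_no integer PRIMARY KEY,
    name text,
    price numeric
);

CREATE TABLE orders (
    order_id integer PRIMARY KEY,
    shipping_address text
);

CREATE TABLE order_items (
    product_no integer REFERENCES products,
    order_id integer REFERENCES orders,
    quantity integer,
    PRIMARY KEY (product_no, order_id)
);
```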
Notice that the primary key overlaps with the foreign keys in the last table.
We know that the foreign keys disallow creation of orders that do not relate to any products. But what if a product is removed after an order is created that references it? SQL allows you to handle that as well. Intuitively, we have a few options:
- Disallow deleting a referenced product
- Delete the orders as well
To illustrate this, let’s implement the following policy on the many-to-many relationship example above: when someone wants to remove a product that is still referenced by an order (via order_items), we disallow it. If someone removes an order, the order items are removed as well:
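This policy can be sketched with ON DELETE RESTRICT on the product reference and ON DELETE CASCADE on the order reference (same illustrative columns as assumed for the many-to-many structure):

```sql
DROP TABLE IF EXISTS order_items, orders, products;

CREATE TABLE products (
    product_no integer PRIMARY KEY,
    name text,
    price numeric
);

CREATE TABLE orders (
    order_id integer PRIMARY KEY,
    shipping_address text
);

CREATE TABLE order_items (
    product_no integer REFERENCES products ON DELETE RESTRICT,  -- block product removal
    order_id integer REFERENCES orders ON DELETE CASCADE,       -- items go with the order
    quantity integer,
    PRIMARY KEY (product_no, order_id)
);
```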
Restricting and cascading deletes are the two most common options. RESTRICT prevents deletion of a referenced row. NO ACTION means that if any referencing rows still exist when the constraint is checked, an error is raised; this is the default behavior if you do not specify anything. (The essential difference between these two choices is that NO ACTION allows the check to be deferred until later in the transaction, whereas RESTRICT does not.) CASCADE specifies that when a referenced row is deleted, row(s) referencing it should be automatically deleted as well. There are two other options: SET NULL and SET DEFAULT. These cause the referencing column(s) in the referencing row(s) to be set to nulls or their default values, respectively, when the referenced row is deleted. Note that these do not excuse you from observing any constraints. For example, if an action specifies SET DEFAULT but the default value would not satisfy the foreign key constraint, the operation will fail.
Analogous to ON DELETE there is also ON UPDATE which is invoked when a referenced column is changed (updated). The possible actions are the same. In this case, CASCADE means that the updated values of the referenced column(s) should be copied into the referencing row(s).
Normally, a referencing row need not satisfy the foreign key constraint if any of its referencing columns are null. If MATCH FULL is added to the foreign key declaration, a referencing row escapes satisfying the constraint only if all its referencing columns are null (so a mix of null and non-null values is guaranteed to fail a MATCH FULL constraint). If you don’t want referencing rows to be able to avoid satisfying the foreign key constraint, declare the referencing column(s) as NOT NULL.
A foreign key must reference columns that either are a primary key or form a unique constraint. This means that the referenced columns always have an index (the one underlying the primary key or unique constraint); so checks on whether a referencing row has a match will be efficient. Since a DELETE of a row from the referenced table or an UPDATE of a referenced column will require a scan of the referencing table for rows matching the old value, it is often a good idea to index the referencing columns too. Because this is not always needed, and there are many choices available on how to index, declaration of a foreign key constraint does not automatically create an index on the referencing columns.
More information about updating and deleting data is in Chapter 6. Also see the description of foreign key constraint syntax in the reference documentation for CREATE TABLE.
5.3.6. Exclusion Constraints
Exclusion constraints ensure that if any two rows are compared on the specified columns or expressions using the specified operators, at least one of these operator comparisons will return false or null. The syntax is:
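As a minimal sketch, assume a hypothetical circles table in which no two stored circles may overlap. The && operator tests for overlap, and the constraint is backed by a GiST index.

```sql
DROP TABLE IF EXISTS circles;

CREATE TABLE circles (
    c circle,
    EXCLUDE USING gist (c WITH &&)  -- no two rows may have overlapping circles
);

-- Accepted: the table is empty
INSERT INTO circles VALUES ('<(0,0),2>');

-- Rejected: this circle overlaps the one already stored
INSERT INTO circles VALUES ('<(1,0),2>');
```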
Adding an exclusion constraint will automatically create an index of the type specified in the constraint declaration.