|
|
SQL / SQL Table Joins and Sub-Queries
Joining Multiple SQL Tables
|
A single table in most cases returns incomplete information. Thus, we need to combine data from two or more tables in order to see a more accurate result.
|
So far we have been selecting information from only one table using the SELECT statement.
That
is where joins come in - they allow us to combine two or more tables in
order to retrieve the data from those tables, in a variety of different
ways. Usually, a query has to access two or more tables in order to
find all information it requires. SQL gives us this possibility to join
tables to select mixed information from different tables.
Two
tables are joined in a SELECT query in order to produce a single result
set. The tables are not merged and remain independent after the
execution of the SELECT query.
All or some of their columns can appear together in one single SELECT statement by a single query. When we are observing in a SQL SELECT statement more than one tables after the FROM keyword then we have a "Join" SQL query.
The
join is successful, when data in the second table is combined with the
first table to produce a set of rows (records) containing data from both
of them. This is the way of accessing relational databases.
Data
which is stored into tables as entities or objects we need very often
to split into multiple physical entities and combine them to compose
more specific information. Thus, the term "Join" makes the database management systems "relational". The basic syntax format is:
SELECT "Column1,Column2,...,ColumnN" FROM table1,table2 WHERE "search-condition(s)"
Cross Join or Cartesian Product
A CROSS JOIN is the size of a Cartesian product which is the number of rows in one table multiplied by those in the other.
Thus,
for two tables with four rows each, their CROSS JOIN or Cartesian
product would consist of sixteen rows.
By definition, CROSS JOINs do not
need or support the use of the ON clause that other joins require.
Here is a CROSS JOIN of the customers and orders tables:
As we have mentioned previously the general format for a SELECT statement using the WHERE clause is:
SELECT FROM WHERE ;
Consider the following tables:
"Customers" table:
"Purchases" table:
Thus, in order to get all records of one table like the table "Customers" we use the following SQL query.
SELECT * FROM Customers;
Using the WHERE clause we can filter out rows or columns of this table as follows:
SELECT * FROM Customers WHERE Credit >= 18;
This will return the following subset of records from the Customers table.
And if we want to specify a list of columns to be shown instead of the all-purpose asterisk we can write:
SELECT LastName,Credit FROM Customers WHERE Credit >= 18;
Which will give us:
 Very often we need to combine data from two or more tables in order to manage their data, in a variety of different ways.
The CROSS JOIN is performed when we use the SELECT statement for two or more tables as follows:
SELECT * FROM Customers, Purchases;
This code will return a CROSS JOINED table whose number of rows will be equal to the product of the number of rows in each of the tables used in the join.
Thus, in this case the new joined table will have 28 rows, as the Customers table has 7 rows and Purchases table has 4 rows (7x4=28).
This join procedure is called CROSS JOIN and produces a result that contains all possible combinations of all joined columns from both tables.
As you can imagine, through CROSS JOIN process huge implications are born for the performance of database server and the problem gets bigger when we add a third table.
This might not seem like a big problem when all we are dealing with are three tables containing a total of 18 records (2x3x3=18), but think what would happen if we had three tables, each containing 300 records, and we decided to cross join them.
Types of Joins
As
we have mentioned earlier a join between two tables is established by
linking a column or columns in one table with those in another.
The condition that must be satisfied to join the two tables is an expression which is called the join condition.
There are three basic types of joins,
• Inner joins • Outer joins (left and right) • Self joins
All of them are based on linear algebra (matrix algebra). The Outer joins include rows in the result set even when the join condition does not return true. An Inner join does not return any set of rows if the join condition is not satisfied.
The Inner join combines every record in the first table with its corresponding records in the second table using the common column (key).
This is the most common type of join, and we explained it in our previous JOIN example.
The main Inner join rule is that if a record in one table does not have a match in the other table, this record is ignored.
Here is the ANSI version of the Inner join query:
SELECT TableA.Column1,TableA.Column2,TableB.Column1,TableB.Column3 FROM TableA JOIN TableB ON (TableA.ColumnKEY=TableB.ColumnKEY);
The Outer join is similar to an Inner join except that if there are no corresponding records to join, a blank record is returned.
These joins come in both left and right "flavors", which basically indicate which table is to be returned in its entirety. Thus, the order of the terms in the WHERE clause is significant when constructing multilevel joins.
Here is the LEFT OUTER JOIN query written using valid ANSI join syntax:
SELECT TableA.Column1, TableB.Column3, TableC.Column7 FROM TableA LEFT OUTER JOIN TableB ON (TableA.ColumnKEY=TableB.ColumnKEY) LEFT OUTER JOIN TableC ON (TableB.KEYnum=TableC.KEYnum);
Here is the RIGHT OUTER JOIN query written using valid ANSI join syntax:
SELECT TableA.ColumnKEY+2, TableB.Column3 FROM TableA RIGHT OUTER JOIN TableA ON (TableA.ColumnKEY+2=TableB.ColumnKEY);
The code above is the earlier LEFT OUTER JOIN query restated.
The Self join is the type through which a table is joined to itself.
This type of join has a more advanced structure. It is usually used when we have data about one record in the table that requires data about another record in the same table.
For
these complicated situations you have to create links (keys) as many as
the separate data (fields) you want your queries to access.
|
|