Sunday, November 28, 2010

Significant Uses of Joins And its Types in SQL Server 2005:


SQL Join is a clause that allows a select statement to access data from one or more table. It controls how the tables are linked one another and it is a qualifier of the From clause.
A primary key is a column with a unique value for each row. Each primary key value must be unique within the table. The purpose is to bind data together, across tables, without repeating all of the data in every table. You can join the tables only by matching the primary key columns.

There are three types of Joins:

Ø      Inner Join
Ø      Outer Join
Ø      Cross Join  

The key difference between them is that outer joins include rows in the result set even when the join condition is not met, while an inner join doesn't. When the join condition in an outer join is not met, columns in the first table are returned normally, but columns from the second table are returned with no value as NULLs. This is helpful for finding missing values and broken links between tables.

SQL Join syntax:

SELECT <column_name1>, <column_name2> FROM <table_name1>
JOIN <table_name2> ON <join_conditions>

Inner Join:

A JOIN that displays only rows that have a match in both the tables is known as inner JOIN.  It is a default type of joins in SQL Server

Syntax:

Select * from table1 t1 inner join table2 t2 on t1.col1= t2.col2

Example:

SELECT customers.customer_no, orders.Amount
FROM customers         
Inner JOIN orders
ON (customers.customer_no=orders.customer_no)

Customer_no       Amount
---------------------------------
1                              35641
2                              34565
3                              56987
4                              56834


Outer JOIN 

A JOIN that includes rows even if they do not have related rows in the joined table is an Outer JOIN.  You can create three different outer JOINs to specify the unmatched rows to be included:
Left Outer JOIN: In Left Outer JOIN, all rows in the first-named table, i.e. “left” table, which appears leftmost in the JOIN clause, are included. Unmatched rows in the right table do not appear.

Syntax: 

Select * from table1 t1 Left outer join table2 t2 on t1.col1= t2.col2

Example: Left outer join with three tables,

SELECT customers.customer_no, orders.Amount, items.description
FROM customers
LEFT OUTER JOIN orders
ON (customers.customer_no=orders.customer_no)
LEFT OUTER JOIN items
ON (orders.item_no=items.item_no)

Customer_no      Amount     description
-------------------------------------------------
1                              35641       Item 1
2                              24565       Item 2
3                              56987       Item 3
4                              56834       Item 4

Right Outer JOIN: In Right Outer JOIN, all rows in the second-named table, i.e. “right” table, which appears rightmost in the JOIN clause, are included. Unmatched rows in the left table are not included. The most advisable point is to avoid using right outer join to make the query perform better.

Syntax:

Select * from table1 t1 Right outer join table2 t2 on t1.col1= t2.col2

Example:

SELECT customers.customer_no+2
As customer_no, orders.Amount
FROM orders
RIGHT OUTER JOIN customers
ON(customers.customer_no+2=orders.Customer_no)

Customer_no        Amount
----------------------------------
3                               56987
4                               56834
5                               Null
6                                Null

Full Outer JOIN: In Full Outer JOIN, all rows in all the joined tables are included, whether they are matched or not. You can think of a FULL OUTER JOIN as the combination of a LEFT JOIN and a RIGHT JOIN.

Syntax:

Select * from table1 t1 Full outer join table2 t2 on t1.col1= t2.col2

Example:

SELECT customers.Customer_no+2, orders.Amount
FROM customers
FULL OUTER JOIN orders
ON(customers.customer_no+2=orders.customer_no)

Customer_no        Amount
----------------------------------
NULL                      35641
NULL                      24565
3                              56987
4                              56834
5                              NULL
6                              NULL

A CROSS JOIN, by contrast, is an intentional Cartesian product. The size of a Cartesian product is the number of rows in one table multiplied by those in the other. So for two tables with three rows each, their CROSS JOIN or Cartesian product would consist of nine rows. By definition, CROSS JOINs don't need or support the use of the ON clause that other joins require. Here's a CROSS JOIN of the customers and orders tables:

Syntax:


Select * from table1 t1 Cross join table2 t2

Example: 

SELECT customers.customer_no, orders.Amount
FROM orders
CROSS JOIN customers

Customer_no        Amount
----------------------------------
1                              35641
1                              24565
1                              56987
1                              56834
2                              35641
2                              24565
2                              56987
2                              56834
3                              35641
3                              24565
3                              56987
3                              56834
4                              35641
4                              24565
4                              56987
4                              56834


To make the query perform better, follow the points below,
  • One of the best ways to boost JOIN performance is to limit how many rows need to be JOINed.
  • If the columns used for the joins are not logically compact, then consider adding surrogate keys to the tables that are compact in order to reduce the size of the keys, thus decreasing read I/O during the join process, increasing overall performance.
  • JOIN performance has depends on how many rows you can store in a data page.
  • The performance of the JOIN will be faster only if there are no unnecessary spaces in a table.
  • This can be done through the use of a high fillfactor, rebuilding indexes often to get rid of empty space, and to optimize data types and widths when creating columns in tables.
  • An index on a foreign key column can extremely increase the performance of many joins.
  • Joins should be done on columns that have unique indexes.
  • The indexes on the columns to be joined must be numeric data types, to make the join perform faster.