Saturday, 12 October 2013

MSBI Best Sites:

MSBI:

1. http://msbiacademy.com

SQL Server:

SQL DBA:

Wednesday, 18 September 2013

SQL Server COMPLEX QUERIES (Not Yet Written)





1. How can you change a column's values from "male" to "female" with a SQL query?
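A minimal sketch of one answer, assuming a hypothetical dbo.Employees table with a Gender column; the CASE form handles the common interview variant of swapping both values in a single statement:

UPDATE dbo.Employees
SET Gender = 'female'
WHERE Gender = 'male';

-- Swap both values in one pass (interview variant):
UPDATE dbo.Employees
SET Gender = CASE Gender
                 WHEN 'male'   THEN 'female'
                 WHEN 'female' THEN 'male'
                 ELSE Gender
             END;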






T-SQL Important Syntaxes (Not Yet Written)


Stored Procedure Performance Tuning / Stored Procedure Optimization Techniques


Questions will be like this:
Q) How do you tune a stored procedure, and why does a stored procedure need tuning?
OR: Last week my stored procedure was running perfectly; why is it now running so slowly?

Ans:

To tune a stored procedure, follow the steps given below (a short procedure sketch illustrating several of them appears after the list).

1. Use the schema name for objects referenced in stored procedures.

2. Stored procedure names should not start with the "sp_" prefix (it is reserved for system procedures and forces an extra lookup in the master database).

3. Use the "sp_executesql" stored procedure instead of the "EXECUTE" statement for dynamic SQL.

4. Try to avoid SQL Server cursors whenever possible.

5. Use the "SET NOCOUNT ON" statement in stored procedures.

6. Drop temp tables when you are done with them so tempdb resources are released promptly.

7. Do error handling with TRY...CATCH blocks wherever required.

8. Create an index on any table (permanent or temporary) if it is used by more than one statement.

9. Use IF EXISTS (SELECT 1 ...) instead of IF EXISTS (SELECT * ...).

10. Before timing a stored procedure, clear the caches and turn on the statistics with the following statements:
                       DBCC FREEPROCCACHE;
                       DBCC DROPCLEANBUFFERS;
                       GO
                       SET STATISTICS TIME ON;
                       SET STATISTICS IO ON;
                       GO
                       EXEC <Your SP> [Args]

11. Drop/release objects created and used within the procedure block.

12. Avoid using many subqueries; use INNER JOINs instead.

13. Use the query execution plan to check where the time goes when the query executes.

14. Instead of a cursor, use a temp table to drive the looping.

15. Avoid temporary tables where possible; use a CTE or a table variable when the number of rows is small (say, below 1,000 rows).
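A minimal procedure sketch tying several of the tips together (schema-qualified names, no "sp_" prefix, SET NOCOUNT ON, sp_executesql, TRY...CATCH, and EXISTS (SELECT 1)); the dbo.Orders table and the procedure name are hypothetical:

CREATE PROCEDURE dbo.usp_GetOrdersByStatus   -- tips 1 and 2: schema name, no "sp_" prefix
    @Status INT
AS
BEGIN
    SET NOCOUNT ON;                          -- tip 5
    BEGIN TRY                                -- tip 7: error handling
        IF EXISTS (SELECT 1 FROM dbo.Orders WHERE Status = @Status)   -- tip 9
        BEGIN
            -- tip 3: parameterized dynamic SQL via sp_executesql, not EXECUTE(...)
            DECLARE @sql NVARCHAR(200) =
                N'SELECT OrderID, OrderDate FROM dbo.Orders WHERE Status = @st';
            EXEC sp_executesql @sql, N'@st INT', @st = @Status;
        END
    END TRY
    BEGIN CATCH
        THROW;   -- re-raise the original error (SQL Server 2012+)
    END CATCH
END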







------------------------------------------------------------------------------------------------------------


Advanced Stored Procedure Tuning
The main advantage of a stored procedure is that it can execute a set of T-SQL statements in less time than the same statements executed individually. The reason it takes less time is that the query execution plan for a stored procedure is compiled once, cached in memory (the plan cache), and reused on subsequent calls rather than recompiled each time.
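You can verify that plans are being cached and reused by querying the plan-cache DMVs (SQL Server 2005 and later); a hedged example:

SELECT cp.usecounts, cp.objtype, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype = 'Proc'          -- cached plans for stored procedures
ORDER BY cp.usecounts DESC;        -- most-reused plans first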

  1. Find the most costly statements
  2. Determine why the statement is costly
  3. Get an accurate baseline for the procedure
  4. Optimize

Find the Most Costly Statements

There are two main ways to find costly queries. One is to execute the procedure with the Actual Execution Plan included. The other is to take a time difference before and after each statement. There are also other ways (including looking at IO usage and CPU usage), one of which is sketched below.
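For completeness, here is a hedged example of one of those other ways: pulling the most expensive statements from the sys.dm_exec_query_stats DMV (SQL Server 2005 and later), ordered by total CPU time.

SELECT TOP (10)
    qs.total_worker_time AS total_cpu,
    qs.execution_count,
    SUBSTRING(st.text, qs.statement_start_offset / 2 + 1,
              (CASE qs.statement_end_offset
                   WHEN -1 THEN DATALENGTH(st.text)
                   ELSE qs.statement_end_offset
               END - qs.statement_start_offset) / 2 + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;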
I like to start with the execution plan method. Though it is not the most precise method, it is the easiest, so let's start there. Here is how to find the slowest query in a procedure using the execution plan.

Determine why the statement is costly

This can be the most difficult task. With the help of the execution plan, we can deduce the most common issues.
Now that we've found the statement, we need to drill down even further to find out what is causing it to be slow. This is done by looking at the cost of each operation in the execution plan. Under each operation you will see a "cost: x%". Find the operation with the highest percentage and focus on that.
When you do find it, it will often be one of the following operations:
Interpreting the Execution Plan

Symptom: table scan, index scan, clustered index scan
Cause(s): missing or improper indexes; cardinality < 5%

Symptom: thick lines (arrows) from one operation to another
Cause(s): bad joins; missing filter operation

Symptom: RID lookups, key lookups
Cause(s): data is not at the leaf level; can use the INCLUDE clause for indexes

Symptom: parallelism
Cause(s): bad join; improper order of operations; MAXDOP or parallelism threshold set too low on the server

Symptom: Compute Scalar
Cause(s): joining on differing data types; implicit conversion required

Each symptom is discussed, with examples, below.
Scans
Scan operations are not bad when the cardinality of the column being searched is less than approximately 6%, or when the number of records being scanned is below around 500. Otherwise, if a scan has a high cost associated with it, you will want to find the appropriate indexes to fix the issue.
Large Work Tables / Results
This is the issue depicted above by thick arrows. When this happens, a lot of the time it is a bad order of operations. That means the optimizer did not choose to limit the overall result set by joining the smallest tables first. Instead, it is trying to join and filter from the largest tables first, and only then join the smaller tables. While this may not be the only scenario in which it happens, it does happen often.
To fix this, consider breaking the query up and dumping intermediate results into a temp table prior to joining the rest of the tables, as sketched below. Make sure to index your temp table!
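A minimal sketch of the break-it-up technique with hypothetical tables: filter the small, selective set into an indexed temp table first, then join it to the large table.

SELECT c.CustomerID, c.Region
INTO #ActiveCustomers
FROM dbo.Customers AS c
WHERE c.IsActive = 1;                     -- apply the selective filter early

CREATE CLUSTERED INDEX IX_Active ON #ActiveCustomers (CustomerID);  -- index the temp table!

SELECT o.OrderID, ac.Region
FROM #ActiveCustomers AS ac
JOIN dbo.Orders AS o
    ON o.CustomerID = ac.CustomerID;      -- the big table is joined last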
RID Lookups / Key Lookups
These are also not always bad. In fact, you cannot get around them all the time. Why? Because you can only have one clustered index, and you don't want to include every column of the table in all your nonclustered indexes. However, if you run into this issue with a high cost, consider changing your clustered index, or adding the columns fetched by the Key Lookup to the index using the INCLUDE clause, as in the hypothetical example below.
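A hypothetical example of removing a costly Key Lookup: the columns the lookup was fetching are added to the nonclustered index leaf level with INCLUDE.

CREATE NONCLUSTERED INDEX IX_Orders_Status
ON dbo.Orders (Status)                    -- the column being seeked on
INCLUDE (OrderDate, TotalDue);            -- columns the Key Lookup was fetching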
Parallelism
Parallelism can have a significant impact on queries and on your server.  What it means is that the amount of data being consumed by your query is so large that, to speed things up, SQL Server splits the work across multiple parallel threads.  When this happens, look for your page life expectancy to dip low and your disk utilization to go high.   Consider employing the same technique as above of breaking up your query.  Also look at the order of operations and rearrange the query.  If that doesn't help, add the query hint OPTION (MAXDOP 1).  Like this:
SELECT *
FROM dbo.mytable mt
JOIN dbo.yada y
ON mt.ID = y.ID
OPTION (MAXDOP 1)
Compute Scalar
This guy may fool you when it comes to the cost factor, but it can be a big hindrance to performance when the operation sits between table joins. The "scalar" part should worry you, because it performs something of a row-by-row operation. When it appears between table joins, the operation is there to do an implicit conversion of a data type in order to join the columns correctly. Since the two tables cannot be joined directly, it will often force far too many records to be touched just to be converted. The conversion may then be dumped into a worktable (in the background), and that worktable is used for the joins after the conversion. The moral: always join on like data types. A hedged sketch follows.
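A minimal sketch with hypothetical temp tables showing the implicit conversion described above, and the fix of converting once so the join compares like data types:

CREATE TABLE #A (ID INT NOT NULL, Val VARCHAR(20));
CREATE TABLE #B (ID VARCHAR(10) NOT NULL, Qty INT);   -- key stored as text: type mismatch

-- Joining INT to VARCHAR forces a CONVERT_IMPLICIT (Compute Scalar) per row:
SELECT a.Val, b.Qty
FROM #A AS a
JOIN #B AS b ON a.ID = b.ID;

-- Fix: convert once into a like-typed table, then join on matching types:
SELECT CAST(ID AS INT) AS ID, Qty
INTO #B2
FROM #B;

SELECT a.Val, b2.Qty
FROM #A AS a
JOIN #B2 AS b2 ON a.ID = b2.ID;           -- like data types: no Compute Scalar between joins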

Get an Accurate Baseline

Next before you make changes, get an accurate baseline prior to making changes so when you do make changes, you can be sure that they sped things up.
When you execute a procedure, the time it takes to execute is shown in the lower right-hand corner of the query window, as below.
[Screenshot: run duration in the status bar]
From the above we can see the problem with using only the client-side timer: it does not show milliseconds, and milliseconds do matter. The other problem we cannot see above is that the client-side timer also includes the round-trip time to the server and back to your computer. While that may seem like a more accurate depiction, it makes optimizing harder because you get less consistent results. In my opinion, it's better to get the server-side timings.
To get the server-side timings, we turn on time statistics using the SET STATISTICS TIME command.
We enable it using this:
SET STATISTICS TIME ON
Once executed, it stays on for the entire session (or window) until you close the window or turn it off (by replacing ON with OFF). So you only need to execute it once; after that you can remove the statement.
The next step needed to get consistent results is to clear the caches. One cache holds the compiled version of the SQL Procedure (the execution plan), the other cache holds the data pages that are retrieved from disk.
To clear the compiled execution plan, we use:
DBCC FREEPROCCACHE
To clear the data pages, we use:
DBCC DROPCLEANBUFFERS

Optimize

Now optimize and apply the changes discovered above!  I know, easier said than done sometimes.  Even though we did not cover all situations for query slowness, hopefully this will provide a good launching pad.  We will delve further into fixes in a later article.





================================================================================================================================================
EXPLANATION:



DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;

GO

SET STATISTICS TIME ON;
SET STATISTICS IO ON;

GO

EXEC <my sproc> [args]
DBCC FREEPROCCACHE
Use DBCC FREEPROCCACHE to clear the procedure cache. Freeing the procedure cache would cause, for example, an ad-hoc SQL statement to be recompiled rather than reused from the cache. If observing through SQL Profiler, one can watch the Cache Remove events occur as DBCC FREEPROCCACHE goes to work. DBCC FREEPROCCACHE will invalidate all stored procedure plans that the optimizer has cached in memory and force SQL Server to compile new plans the next time those procedures are run.
DBCC DROPCLEANBUFFERS
Use DBCC DROPCLEANBUFFERS to test queries with a cold buffer cache without shutting down and restarting the server. DBCC DROPCLEANBUFFERS serves to empty the data cache. Any data loaded into the buffer cache due to the prior execution of a query is removed.
The above was shamelessly stolen from: this blog post
SET STATISTICS TIME ON
Displays the number of milliseconds required to parse, compile, and execute each statement.
SET STATISTICS IO ON
Causes SQL Server to display information regarding the amount of disk activity generated by Transact-SQL statements.
As for the accuracy of the query, that is something that you as the developer have to take a look at. I don't think there is an automated way to test the accuracy.
Hopefully that will get you started.


============================================================================================ Useful Links=================




Sql Server Query Performance Tuning / Sql Query Optimization Tips in SQL Server





Tip 1: Always use a WHERE clause in SELECT queries when you don't need all the rows returned. This narrows the rows returned; otherwise the query performs a whole-table scan, wasting SQL Server resources and increasing network traffic. While scanning the whole table, it may also lock the table, which can prevent other users from accessing it.
Tip 2: It is often seen that developers write code like

SELECT * FROM Table WHERE LOWER(Name)='india'
Instead of writing it like this:
SELECT * FROM Table WHERE Name='India'
Both queries do the same work, but the second one is better and retrieves rows faster: wrapping the column in LOWER() prevents an index on Name from being used, and SQL Server is not case sensitive under its default collation anyway.
Tip 3: The operators used with the WHERE clause directly affect query performance. The operators shown below are listed in decreasing order of performance.
  1. =
  2. >, >=, <, <=
  3. LIKE
  4. <>
Tip 4: Queries containing NOT IN tend to perform poorly, as the optimizer needs to use a nested table scan to perform the activity. This can be avoided by using EXISTS or NOT EXISTS, as sketched below.
When there is a choice between IN and EXISTS, go with the EXISTS clause for better performance.
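A hedged sketch with hypothetical Customers/Orders tables, rewriting NOT IN as NOT EXISTS (which also behaves more predictably when the subquery can return NULLs):

-- Instead of:
SELECT c.CustomerID
FROM dbo.Customers AS c
WHERE c.CustomerID NOT IN (SELECT o.CustomerID FROM dbo.Orders AS o);

-- Prefer:
SELECT c.CustomerID
FROM dbo.Customers AS c
WHERE NOT EXISTS (SELECT 1 FROM dbo.Orders AS o
                  WHERE o.CustomerID = c.CustomerID);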
Tip 5: When the columns used in the WHERE clause are covered by an index, you can force the Query Optimizer to perform an index seek by adding a table hint; this can help when IN or OR clauses form part of the WHERE clause.
SELECT * FROM Table 
WHERE Status = 1 AND ID IN (406,530,956)
can take more time than
SELECT * FROM Table WITH (INDEX(IX_ID))
WHERE Status = 1 AND ID IN (406,530,956)
Tip 6: When using LIKE in a SQL query, it is better to use one or more leading characters in the pattern instead of starting with a wildcard character.
SELECT * FROM Table WHERE Name LIKE 'm%'
SELECT * FROM Table WHERE Name LIKE '%m'
In the first query the Query Optimizer is able to use an index on Name, thereby reducing the load on SQL Server. In the second query, the leading wildcard means no index can be used and the table must be scanned.
Tip 7: Where there is a choice between IN and BETWEEN in a query, it is advisable to use BETWEEN for better results.
SELECT * FROM Table 
WHERE ID BETWEEN 5000 AND 5005
performs better than
SELECT * FROM Table 
WHERE ID IN (5000,5001,5002,5003,5004,5005)
Tip 8: Avoid using the SUBSTRING function on a column in the WHERE clause.
SELECT * FROM Table WHERE Name LIKE 'n%'
Is much better than writing
SELECT * FROM Table WHERE SUBSTRING(Name,1,1)='n'
Tip 9: Conditions in a WHERE clause connected by AND operators are evaluated from left to right in the order they are written, so a couple of things should be taken care of:
  • Put the expression least likely to be true first in the AND chain. That way, if the expression is false at the first test, the whole clause fails immediately, saving execution time.
  • If all parts of the AND expression are equally likely to be false, put the more complex expression first, so that if it evaluates to false the cheaper tests never have to run.
Tip 10: It is sometimes better to combine queries using UNION ALL instead of using many OR clauses.
SELECT ID, FirstName, LastName FROM Table
WHERE City = 'Wichita' or ZIP = '67201' or State= 'Kansas'
For the above query to use an index, indexes are required on all three columns.
The same query can be written as
SELECT ID, FirstName, LastName FROM Table WHERE City = 'Wichita'
UNION ALL
SELECT ID, FirstName, LastName FROM Table WHERE ZIP = '67201'
UNION ALL
SELECT ID, FirstName, LastName FROM Table WHERE State= 'Kansas'
Both queries return the same results, but if there is an index only on City and none on ZIP or State, the first query will not use the index and a table scan is performed. The second query, however, can use the City index for its part of the work.
Tip 11: When a SELECT statement contains a HAVING clause, it's better to let the WHERE clause do most of the work (removing the undesired rows) instead of leaving that work to the HAVING clause.
E.g. in a SELECT statement with GROUP BY and HAVING, the WHERE clause first selects the appropriate rows, then GROUP BY divides them into groups, and finally the HAVING clause has less work to perform, which boosts performance. A small sketch follows.
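A small sketch, assuming a hypothetical dbo.Sales table: the row filter goes in the WHERE clause so HAVING is left with only the aggregate condition.

SELECT Region, SUM(Amount) AS Total
FROM dbo.Sales
WHERE SaleDate >= '20130101'      -- undesired rows removed before grouping
GROUP BY Region
HAVING SUM(Amount) > 100000;      -- only the aggregate test remains for HAVING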
Tip 12: Consider two situations:
  • A query that takes 30 seconds to run, and then displays all of the required results.
  • A query that takes 60 seconds to run, but displays the first screenful of records in less than 1 second.
Looking at the two situations, a developer might choose the first option, as it uses fewer resources and finishes sooner. But the second is often more acceptable to a DBA: the application can give the user immediate feedback while the rest of the work carries on in the background.

We can use a hint like
SELECT * FROM Table WHERE City = 'Wichita' OPTION (FAST n)
where n is the number of rows we want displayed as fast as possible. This hint returns the specified number of rows as quickly as possible without regard to the total time taken by the overall query.




--------------------------------------------------------------------------------------------------------------------------

SQL Tuning/SQL Optimization Techniques:
(Note: this list is database-generic; a few examples use Oracle syntax such as NVL, DECODE, SUBSTR and ||, but the ideas carry over to SQL Server.)

1) A SQL query becomes faster if you use the actual column names in the SELECT statement instead of '*'.
For Example: Write the query as
SELECT id, first_name, last_name, age, subject FROM student_details;
Instead of:
SELECT * FROM student_details;

2) The HAVING clause filters rows only after all the rows have been selected and grouped; it is just a final filter. Do not use the HAVING clause for work that belongs in WHERE.
For Example: Write the query as
SELECT subject, count(subject) 
FROM student_details 
WHERE subject != 'Science' 
AND subject != 'Maths' 
GROUP BY subject;
Instead of:
SELECT subject, count(subject) 
FROM student_details 
GROUP BY subject 
HAVING subject != 'Science' AND subject != 'Maths';

3) Sometimes you may have more than one subquery in your main query. Try to minimize the number of subquery blocks in your query.
For Example: Write the query as
SELECT name 
FROM employee 
WHERE (salary, age ) = (SELECT MAX (salary), MAX (age) 
FROM employee_details) 
AND dept = 'Electronics'; 
Instead of:
SELECT name 
FROM employee
WHERE salary = (SELECT MAX(salary) FROM employee_details) 
AND age = (SELECT MAX(age) FROM employee_details) 
AND emp_dept = 'Electronics';

4) Use the EXISTS and IN operators and table joins appropriately in your query.
a) Usually IN has the slowest performance.
b) IN is efficient when most of the filter criteria are in the subquery.
c) EXISTS is efficient when most of the filter criteria are in the main query.
For Example: Write the query as
Select * from product p 
where EXISTS (select * from order_items o 
where o.product_id = p.product_id)
Instead of:
Select * from product p 
where product_id IN 
(select product_id from order_items)

5) Use EXISTS instead of DISTINCT when using joins which involves tables having one-to-many relationship.
For Example: Write the query as
SELECT d.dept_id, d.dept 
FROM dept d 
WHERE EXISTS ( SELECT 'X' FROM employee e WHERE e.dept = d.dept);
Instead of:
SELECT DISTINCT d.dept_id, d.dept 
FROM dept d,employee e 
WHERE e.dept = d.dept;

6) Try to use UNION ALL in place of UNION.
For Example: Write the query as
SELECT id, first_name 
FROM student_details_class10 
UNION ALL 
SELECT id, first_name 
FROM sports_team;
Instead of:
SELECT id, first_name 
FROM student_details_class10 
UNION 
SELECT id, first_name 
FROM sports_team;

7) Be careful while using conditions in WHERE clause.
For Example: Write the query as
SELECT id, first_name, age FROM student_details WHERE age > 10;
Instead of:
SELECT id, first_name, age FROM student_details WHERE age != 10;
Write the query as
SELECT id, first_name, age 
FROM student_details 
WHERE first_name LIKE 'Chan%';
Instead of:
SELECT id, first_name, age 
FROM student_details 
WHERE SUBSTR(first_name,1,3) = 'Cha';
Write the query as
SELECT id, first_name, age 
FROM student_details 
WHERE first_name LIKE NVL ( :name, '%');
Instead of:
SELECT id, first_name, age 
FROM student_details 
WHERE first_name = NVL ( :name, first_name);
Write the query as
SELECT product_id, product_name 
FROM product 
WHERE unit_price BETWEEN :min_price AND :max_price
Instead of:
SELECT product_id, product_name 
FROM product 
WHERE unit_price >= :min_price 
AND unit_price <= :max_price
Write the query as
SELECT id, name, salary 
FROM employee 
WHERE dept = 'Electronics' 
AND location = 'Bangalore';
Instead of:
SELECT id, name, salary 
FROM employee 
WHERE dept || location= 'ElectronicsBangalore';
Keep the column bare on one side of a comparison and put the arithmetic on the other (non-column) side; the non-column expression is evaluated up front and the column can still use an index.
Write the query as
SELECT id, name, salary 
FROM employee 
WHERE salary < 25000;
Instead of:
SELECT id, name, salary 
FROM employee 
WHERE salary + 10000 < 35000;
8) Use DECODE to avoid scanning the same rows or joining the same table repetitively. DECODE can also be used in place of a GROUP BY or ORDER BY clause. (DECODE is Oracle syntax; the SQL Server equivalent is a CASE expression, as sketched after this example.)
For Example: Write the query as
SELECT id FROM employee 
WHERE name LIKE 'Ramesh%' 
and location = 'Bangalore';
Instead of:
SELECT DECODE(location,'Bangalore',id,NULL) id FROM employee 
WHERE name LIKE 'Ramesh%';
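For reference, a hedged SQL Server rendering of the DECODE expression above, using CASE:

SELECT CASE location WHEN 'Bangalore' THEN id ELSE NULL END AS id
FROM employee 
WHERE name LIKE 'Ramesh%';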
9) To store large binary objects, first place them in the file system and store only the file path in the database (see the sketch below).
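A minimal sketch of point 9 with a hypothetical table: the binary file lives on disk and only its path is stored.

CREATE TABLE dbo.Documents (
    DocID    INT IDENTITY(1,1) PRIMARY KEY,
    FilePath NVARCHAR(260) NOT NULL   -- path to the file on disk or a share
);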
10) To write queries that perform efficiently, follow the general SQL coding standards:
a) Use a single case for all SQL verbs
b) Begin all SQL verbs on a new line
c) Separate all words with a single space
d) Right- or left-align verbs under the initial SQL verb

-----------------------------------------------------------------------------------------------------------------------------

Here are the different performance optimization tips and topics for this article:
  • Basic query optimization fundamentals
  • An example of Hash Match Aggregate versus Stream Aggregate
  • The Baker’s Dozen Spotlight: how the SQL Server 2012 Columnstore index can help
  • More information on the Columnstore index
  • Queries using dates - what works well, what doesn’t
  • Trying to outsmart the optimizer with NULL checks
  • Queries that are search-argument optimized and queries that aren’t
  • Are correlated subqueries bad?
  • New capabilities in SQL Server 2012 - are they faster?
  • Recursive queries versus loops
  • APPLY and Table-valued functions versus other approaches
  • Inserting and updating multiple rows in one procedure, using INSERT and UPDATE
  • Inserting and updating multiple rows in one procedure, using MERGE