A Series of Short Blog Posts on Db2 LUW Performance – Part 1
WRITTEN BY RAGHAVENDRA CHIYODU
1. Why is Runstats so important for the Db2 optimizer? (Db2 LUW optimizer)
Runstats is a vital utility for tables and indexes in the database. It is always good practice to keep statistics current so that the Db2 optimizer works from the latest information, which helps it choose better access plans for queries against those tables and indexes.
So whenever the table changes significantly, whether through inserts, deletes, updates, or other transactions, run the Runstats utility to update the statistics, as in the sketch below.
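A minimal sketch of a typical Runstats invocation (MYSCHEMA.T1 is a placeholder table; the options shown collect distribution statistics and detailed index statistics, a common starting point):
db2 "runstats on table MYSCHEMA.T1 with distribution and detailed indexes all"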
Scenario 1:
For example: suppose the Reorg utility is run for table T1 but Runstats is not run afterwards. Because the statistics are not updated, even with a rebind the Db2 optimizer may continue to use the old access plan rather than a new one. Static SQL requires a rebind to possibly get a new access path, but dynamic SQL can take advantage of new statistics immediately. So the Reorg may not give its full benefit unless Runstats is run.
There are two primary ways in which a Reorg can help performance even without Runstats and a rebind: if the reorg improves clustering by the clustering index, performance can improve; and if there were many overflow rows, the reorg cleans them up, which also improves performance.
Result: no opportunity to get a new access path, and the queries miss out on further performance improvement!
Scenario 2:
If the Reorg utility is run for table T1 and Runstats is run after it, followed by a rebind, then the statistics are updated in the catalog tables and the Db2 optimizer can pick an updated access plan.
This will only be the case if a new (and better) access path is chosen. If the old access path was good, the reorg and runstats may not change the access path and therefore the rebind will have no effect on performance.
Result: query performance has the best chance to improve, and execution times can come down!
How to check the latest stats time for the table?
Db2 "select stats_time from syscat. tables where tabname='TABNAME'"
2. How do you map critical Routines to Critical statements? (Db2 LUW performance)
Routines play a critical role in the performance of the database.
When they execute well within their expected bounds, the database as a whole hums along nicely.
But if routines start taking much longer to execute, it is a point of concern, and the customer may raise an issue saying executions in the database are taking too much time.
When this scenario happens, the things to look for are:
- Routine name and the text inside it
- The critical SQL statements executing inside the routine
If a CPU hog is caused by routine execution, the query below can be used to identify the culprit SQL statements ('routinename' is a placeholder for the routine you are investigating):
select substr(routine_name, 1, 10) as routine_name,
       num_coord_exec, stmtno, executable_id, section_number,
       total_cpu_time,
       -- nullif avoids a divide-by-zero for statements with no coordinator executions
       total_cpu_time / nullif(num_coord_exec, 0) as avg_cpu
from table(mon_get_routine_exec_list(null, null, null, 'routinename', -2));
Once the critical statements are found, dig deeper into the cost of those queries and why they are hogging CPU!
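To see what one of those statements actually is, the executable_id returned above can be fed into mon_get_pkg_cache_stmt. A minimal sketch (the short x'...' literal is only a placeholder; substitute the executable_id value from the previous result):
select substr(stmt_text, 1, 200) as stmt_text,
       total_cpu_time,
       num_executions
from table(mon_get_pkg_cache_stmt(null, x'0100000000000000', null, -2));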
3. How to get the most expensive Routines in the database? (Db2 LUW performance)
For good database performance, it is important that routines run at their best. A routine may be a stored procedure or a function, and routines often hold the core logic embedded in the database.
Critical routines can at times consume a lot of CPU, because the logic inside them drives many transactions.
Say a customer reports that the system is consuming CPU: when we dig in, it often boils down to routines that are burning the CPU!
Identifying the most expensive routines by CPU consumption
The SQL one-liner below can be used for the task:
SELECT ROUTINESCHEMA, ROUTINEMODULENAME, ROUTINENAME, SPECIFICNAME,
       SUM(TOTAL_CPU_TIME) AS TOTAL_CPU
FROM TABLE(MON_GET_ROUTINE(NULL, NULL, NULL, NULL, -2)) AS T
GROUP BY ROUTINESCHEMA, ROUTINEMODULENAME, ROUTINENAME, SPECIFICNAME
ORDER BY TOTAL_CPU DESC;
Based on the output of this query we can infer which routines are consuming the most CPU, then go deeper into the SQL statements embedded in each routine and get the access plans for those statements using db2expln or db2exfmt.
Once the expensive SQL statements inside the routine are fixed, the routine naturally consumes less CPU, which improves the performance of the database.
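As a sketch of that drill-down, db2expln can format the access plan for a statement directly, without needing explain tables (MYDB, the schema, and the statement are placeholders):
db2expln -d MYDB -statement "select c1 from myschema.t1 where c2 = 5" -terminal -graph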
4. How to drill down to the exact execution time of the query? (Db2 LUW performance)
The typical place to find statement execution time is the stmt_exec_time column of the mon_get_pkg_cache_stmt() table function.
The catch is that stmt_exec_time is the sum of the execution times of all agents/subagents running the query on that member. The more complex the statement, the more subagents tend to be involved, and the higher stmt_exec_time can get.
So stmt_exec_time may report, say, 120s for a query whose actual elapsed execution time is only around 30s.
So how do we get the actual query execution time?
We can get much more meaningful data from coord_stmt_exec_time instead of stmt_exec_time. First, it is only reported on the member where the coordinator agent ran, and second, it gives us the elapsed execution time, start to finish, with no inclusion of subagent time.
SQL one liner to get this job done:
select member, coord_stmt_exec_time, stmt_exec_time
from table(mon_get_pkg_cache_stmt(null, null, null, -2));
We also have to weigh this field against total_act_time. stmt_exec_time includes the time spent in the statement plus in all of the statements or functions invoked by that statement.
An example would be a UDF used in the statement (e.g. SELECT MY_UDF(xxx) FROM T1......): that UDF's time would be included in the stmt_exec_time of your SELECT, but not in its total_act_time.
If the UDF takes a long time, you would need both metrics to determine how much of the time was in your Select vs in the UDF.
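Pulling these metrics together, a minimal sketch that ranks statements by coordinator execution time and shows total_act_time alongside (all columns are standard mon_get_pkg_cache_stmt output):
select member,
       coord_stmt_exec_time,
       total_act_time,
       stmt_exec_time,
       substr(stmt_text, 1, 60) as stmt_text
from table(mon_get_pkg_cache_stmt(null, null, null, -2))
order by coord_stmt_exec_time desc
fetch first 10 rows only;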
5. Why do queries take a long time to execute? (Db2 LUW performance)
SQL queries can take a long time to execute for multiple reasons.
When a table is newly created and populated with only a small volume of data, queries will mostly work perfectly fine.
But the real trouble creeps in when the volume of data in the table keeps increasing while the database maintenance utilities are not run on a periodic basis.
That is when queries start taking a long time to execute.
How to reduce the query execution time?
- Check whether proper indexes are created on the table, on the columns the query predicates use (using syscat.indexes or the describe indexes for table command)
- Check whether regular Reorg and Runstats are run on the tables (stats_time in syscat.tables and the Reorgchk command)
- Check the parameter markers used in the queries (db2expln or db2exfmt)
- Check whether the cost of the queries is high or low (db2expln or db2exfmt)
- Check the statement execution time from the package cache (monreport.pkgcache())
If a query executes for a long time there are many contributing factors, and we have to dig into them one by one; the sketch below covers the first two checks.
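A minimal sketch of the first two checks (MYSCHEMA and T1 are placeholder names):
-- which indexes exist, and on which columns?
select indname, colnames, uniquerule
from syscat.indexes
where tabschema = 'MYSCHEMA' and tabname = 'T1';
-- when were statistics last collected?
select stats_time
from syscat.tables
where tabschema = 'MYSCHEMA' and tabname = 'T1';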
6. How do you find the SQL statements with the most reads on a table? (Db2 LUW performance)
Rows read and rows written are critical metrics for query performance. Depending on the transactions, there will be many queries doing huge numbers of reads.
How can we find the culprit SQL statements making the most reads?
SELECT MEMBER,
EXECUTABLE_ID,
NUM_REFERENCES,
NUM_REF_WITH_METRICS,
ROWS_READ,
ROWS_INSERTED,
ROWS_UPDATED,
ROWS_DELETED
FROM TABLE(MON_GET_TABLE_USAGE_LIST(' ', ' ', -1))
ORDER BY ROWS_READ DESC
FETCH FIRST 10 ROWS ONLY;
For a batch job processing a large percentage of the table, reading a high number of rows is expected, so high rows read is not always a problem.
How to get the text of that SQL?
Get the executable ID from the above SQL and pass it into the MON_GET_PKG_CACHE_STMT table function (the hex literal below is just an example value):
SELECT STMT_TEXT
FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL,
     x'01000000000000007C0000000000000000000000020020081126171720728997', NULL, -1));
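A related sketch at the statement level: rank the package cache entries directly by rows read (rows_read and num_executions are standard MON_GET_PKG_CACHE_STMT columns):
SELECT ROWS_READ,
       NUM_EXECUTIONS,
       SUBSTR(STMT_TEXT, 1, 80) AS STMT_TEXT
FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL, NULL, NULL, -2))
ORDER BY ROWS_READ DESC
FETCH FIRST 10 ROWS ONLY;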
7. How do you drill down which connection is holding the log? (Db2 LUW performance)
The active log space filling up is one of the critical warning signs for database performance. The more log space transactions hold, the more performance degrades.
Usually there are excessively long-running transactions in the system causing these issues. The consequences of long-running transactions are:
- Contention due to locks held too long
- Active log space filling up
So if the problem is the active log space filling up: before Db2 V10.5, the only place to find information about the transaction holding the log space was the db2diag.log.
What's the new way of finding it from Db2 V10.5 onwards?
Db2 provides a very useful monitoring element in the table function mon_get_transaction_log() called applid_holding_oldest_xact.
So frame a query to select it:
db2 "select applid_holding_oldest_xact from table(mon_get_transaction_log(null))"
Say you get a value of 1568: that is the application handle of the culprit holding up the active log space.
Once it is identified, you can decide whether or not to force off that application, and execute accordingly, as in the sketch below.
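A minimal sketch of the drill-down, where 1568 is the hypothetical application handle from above:
db2 "select application_handle, application_name, session_auth_id from table(mon_get_connection(1568, -2))"
db2 "force application (1568)"
Forcing the application rolls back its open unit of work, which releases its locks and the log space it holds.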
8. What is this cost information of SQL query? (Db2 LUW optimizer)
Whenever there is slowness in query execution, the first thing that comes to mind is to check the cost of the SQL using tools like db2expln or db2exfmt.
What is this cost estimation of the query?
Everything that the access plan does with the data along the way has a certain cost, in terms of CPU cycles and input/output performed against memory or disk, or both. This cost is expressed in timerons.
The rule of thumb to follow is:
The higher the cost of a query, the slower it is likely to run, and the more it contributes to the slowness.
For example, suppose the access plan shows a high cost together with full table scans. If, after investigation, you create a suitable index to avoid those scans, the cost of the query automatically comes down.
The cost comes down because an index scan replaces the full table scan!
The cost of a query is a very important metric in assessing its performance! A sketch of how to obtain it follows.
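A minimal sketch of obtaining the cost (it assumes the explain tables already exist, e.g. created from the EXPLAIN.DDL script shipped in sqllib/misc; MYDB and the statement are placeholders):
db2 "explain plan for select c1 from myschema.t1 where c2 = 5"
db2exfmt -d MYDB -1 -o plan.txt
The resulting plan.txt shows the access plan graph with the estimated total cost in timerons at the top of the plan.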
9. How can we avoid Db2 Lock escalation from row to table level? (Db2 LUW performance)
When the RR isolation level is used we get a lot of row-level locks, and when their number crosses a threshold, determined by the MAXLOCKS and LOCKLIST parameters, Db2 escalates them to a table-level lock. This process is called lock escalation. Even a transaction that does not reach the escalation threshold will still hold more locks for a longer period of time: with RR, every row that Db2 looks at gets a read lock. In most cases you should be using the CS isolation level, where, as the program moves through a cursor, Db2 releases the lock on the previous row.
Can we avoid this scenario?
Yes, we can avoid this scenario. Db2 provides a registry variable:
DB2_AVOID_LOCK_ESCALATION.
When the DB2_AVOID_LOCK_ESCALATION registry variable is ON, lock escalation will not be performed.
Instead, SQL0912N is returned to the application that requested the lock that would normally have resulted in lock escalation. The application can then either COMMIT or ROLLBACK, which frees the locks held by this application.
db2set DB2_AVOID_LOCK_ESCALATION=ON (the default is OFF)
Changes to this variable do not require the database instance to be restarted, and it is applicable on all operating systems! A short sketch follows.
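A minimal sketch of setting and verifying the variable; the db cfg update is a related knob that lets Db2 self-tune the lock list and escalation threshold under STMM (mydb is a placeholder):
db2set DB2_AVOID_LOCK_ESCALATION=ON
db2set -all
db2 "update db cfg for mydb using LOCKLIST AUTOMATIC MAXLOCKS AUTOMATIC"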
10. What is the Influence of DBM configuration parameters on Db2 Optimizer? (Db2 LUW performance)
Database manager configuration parameters are those which influence the behaviour of the instance and also the databases that are running under it.
There are some critical database manager configuration parameters that influence the Db2 Optimizer!
- Parallelism (INTRA_PARALLEL):
Indicates whether intra-partition parallelism is enabled. When YES, some parts of query execution (e.g., index sorts) can run in parallel within a single database partition.
- CPU speed (CPUSPEED):
The optimizer uses the CPU speed (milliseconds per instruction) to estimate the cost of performing certain operations. Recommendation: do not adjust this parameter unless you are modelling a production environment on a test system or assessing the impact of a hardware change.
- Communications speed (COMM_BANDWIDTH):
The optimizer uses the value (megabytes per second) in this parameter to estimate the cost of performing certain operations between the database partition servers in a partitioned database environment.
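A minimal sketch of inspecting and changing these parameters (shown from a Unix shell; the grep is only a readability filter):
db2 get dbm cfg | grep -i -E "intra_parallel|cpuspeed|comm_bandwidth"
db2 "update dbm cfg using INTRA_PARALLEL YES"
CPUSPEED is normally left at -1, which tells Db2 to compute the value itself.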
11. How is the Bufferpool Accessed by the Application Request? (Db2 LUW internals)
For better application performance, it is always preferable that processing uses the buffer pools rather than disk: working on pages already in the buffer pool is much quicker than fetching them from disk.
So when an application accesses the buffer pools, what exactly happens internally at the Db2 LUW architecture level?
Here is the flow of steps:
- The application opens a connection, and the statement it submits is initially handled by a coordinator agent.
- The coordinator agent subdivides the work across multiple subagents, and each subagent performs its tasks.
- Depending on the application request, logical I/O may or may not be executed to check the buffer pools for the pages.
- If the pages needed for the request are found in the buffer pool, logical I/O is performed and the processing happens in the buffer pool itself.
- If the pages are not found in the buffer pool, physical I/O is performed to fetch the pages from disk.
- Each fetched page then resides in the buffer pool until it is later victimized to make room for another page.
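To watch this logical-versus-physical split in practice, a minimal sketch per buffer pool (a large gap between logical and physical reads means most requests were satisfied in memory):
SELECT VARCHAR(BP_NAME, 20) AS BP_NAME,
       POOL_DATA_L_READS,
       POOL_DATA_P_READS
FROM TABLE(MON_GET_BUFFERPOOL(NULL, -2));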
12. What is the Impact of Reorg and Runstats on the table? (Db2 LUW performance)
- It is always good practice to run Runstats before and after a Reorg, followed by rebinds.
- We need a Reorg when fragmentation is high on a transaction table and query performance has slowed down.
- When we perform a Reorg on a table, it closes those gaps and creates a contiguous set of data pages.
What if Runstats is not performed after Reorg on table?
In this scenario the Reorg has done its job, but to capture statistics reflecting the rearranged data, Runstats has to be run; otherwise the Db2 optimizer never receives the latest statistics!
So even after the Reorg operation on the table, the Db2 optimizer keeps pointing at the old statistics, which is not good for query performance!
What if Runstats is Run after Reorg on the table?
In this scenario, both Reorg and Runstats have done their jobs successfully. The Db2 optimizer gets the latest statistics, collected after the Reorg, which is very good for query performance!
Hence it is always good practice to run Runstats after a Reorg, so that the effect of the Reorg on the tables shows up in the statistics and in query performance. A sketch of the full cycle follows.
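A minimal sketch of the cycle on a hypothetical table MYSCHEMA.T1 (mydb and all names are placeholders; db2rbind rebinds all packages in the database):
db2 "runstats on table MYSCHEMA.T1 with distribution and detailed indexes all"
db2 "reorg table MYSCHEMA.T1"
db2 "runstats on table MYSCHEMA.T1 with distribution and detailed indexes all"
db2rbind mydb -l rbind.log all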
13. Why is Sort Heap size vital for sort operations? (Db2 LUW performance)
Sorting is one of the critical operations the database performs for the set of queries that need it.
There is a database configuration parameter for this vital purpose called SORTHEAP (sort heap size).
The sort heap is allocated from database memory and defines the maximum number of memory pages available to each sort operation that occurs in the database.
Why is this important for the Db2 optimizer?
Because it helps the optimizer determine whether a sort will be performed in memory or be “spilled” to disk (i.e., to a system temporary table) due to running out of sort heap space.
A sort that occurs in memory always performs better than one that spills to a system temporary table, and is used whenever possible. The sketch below shows how to watch for spills.
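A minimal sketch for watching spills (mydb is a placeholder; SORTHEAP AUTOMATIC assumes the self-tuning memory manager is active):
db2 "update db cfg for mydb using SORTHEAP AUTOMATIC"
db2 "select varchar(workload_name, 24) as workload, total_sorts, sort_overflows from table(mon_get_workload(null, -2))"
If sort_overflows grows relative to total_sorts, sorts are spilling to system temporary tables.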
14. What are the Tablespace parameters that influence the Db2 optimizer? (Db2 LUW performance)
Prefetching pages means that one or more pages are retrieved from disk in the expectation that they will be required by an application.
Prefetching index and data pages into the buffer pool can help to improve performance by reducing I/O wait times.
In addition, parallel I/O enhances prefetching efficiency.
There are several parameters at the tablespace level that can influence the Db2 optimizer, and their settings matter a lot for the optimizer to work at its best.
These parameters also come into play during the prefetching and page-cleaning phases of the Db2 LUW architecture, since that is when the tablespaces are involved in the processing.
Tablespace parameters that influence Db2 Optimizer are:
- NUM_IOSERVERS:
Database configuration parameter that defines the number of prefetchers active in the database. The recommendation is to set it to AUTOMATIC (which is also the default):
db2 update db cfg for dbname using NUM_IOSERVERS AUTOMATIC
- NUM_IOCLEANERS:
Database configuration parameter that defines the number of page cleaners active in the database. The recommendation is to set it to AUTOMATIC (which is also the default):
db2 update db cfg for dbname using NUM_IOCLEANERS AUTOMATIC
- PREFETCHSIZE:
Tablespace parameter that defines the number of pages Db2 will try to read at a time when prefetching. It can be set to AUTOMATIC when the tablespace is created, and AUTOMATIC is the recommendation (see the sketch below).
- EXTENTSIZE:
Tablespace parameter that defines the number of pages in each unit of tablespace storage allocation. It is fixed when the tablespace is created (the default comes from the DFT_EXTENT_SZ database configuration parameter) and cannot be changed afterwards.
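A minimal sketch of setting these at tablespace creation time (TS_DATA and BP32K are hypothetical names; the 32K buffer pool must already exist):
CREATE TABLESPACE TS_DATA
  PAGESIZE 32K
  EXTENTSIZE 32
  PREFETCHSIZE AUTOMATIC
  BUFFERPOOL BP32K;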
15. Why do you need a Table Reorg? (Db2 LUW performance improvement)
As data gets populated into a table after its initial design, the volume keeps on increasing.
Lots of transactions run on the table (select/insert/update/delete and so on) and the table becomes fragmented at the disk level. Because of this, many gaps get created, which hinders query performance and in turn degrades database performance.
Table Reorg plays a vital role in this situation.
What does a Table Reorg do?
- Reclaim space:
Compact data onto fewer data pages and return extents to the tablespace. This gives faster table scans, and the freed space can be used by other tables.
- Recluster:
Reorder rows into the same sequence as an index. This gives faster index scans.
- Remove pointer/overflow records:
Convert pointer-overflow pairs back into single normal records. This gives faster row access.
- Recompress:
(Re-)builds the compression dictionary with the latest data and (re-)compresses the data. This reduces both storage consumption and buffer pool consumption.
Note: it is good to avoid running Reorg on tables that do not need it! To identify candidates, run the Reorgchk command first and see which tables are flagged as eligible for Reorg. Also, always follow the cycle of Runstats + Reorg + Runstats so that the Db2 optimizer is fed the latest statistics.
Also, because Reorgchk reads its data from the catalog statistics, it is always good practice to run Runstats on the tables first; otherwise Reorgchk will report on old statistics. A sketch of the Reorgchk step follows.
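A minimal sketch of the Reorgchk step (UPDATE STATISTICS makes Reorgchk run Runstats itself before evaluating; MYSCHEMA.T1 is a placeholder):
db2 "reorgchk update statistics on table MYSCHEMA.T1"
Tables with asterisks in the F1-F8 formula columns of the output are candidates for Reorg.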
16. What is this dirty-steal scenario and how can we limit it? (Db2 LUW performance)
The individual units of the database where the data resides are pages! So page cleaning and its tuning are vital for database performance to stay good and consistent.
The pages can be broadly classified into three states:
- Page is clean
- Page is in use
- Page is dirty (Dirty pages contain data that has been changed, but not written to the disk)
Page cleaning is an asynchronous process that is a big part of the huge power of Db2 buffer pools. When Db2 changes a page, the log file record is externalized to disk. This provides the “Durability” in ACID. However, the dirty page is still in the buffer pool.
If the percentage of dirty pages in the buffer pool is high, it hinders buffer pool performance; in this scenario the page cleaners get triggered and pages are cleaned so that they can be reused by transactions.
When does this dirty-steal scenario arise?
- A Db2 agent executing a transaction needs to access a page
- The page is not in the buffer pool (a buffer pool miss), so it needs to be read in
- The agent cannot find a vacant slot in the buffer pool to read the page into!
Consequence: the transaction has to wait while dirty pages are cleaned out synchronously, which takes a significant toll on performance! The sketch below shows one way to spot it.
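A minimal sketch for spotting it, assuming the buffer pool snapshot output includes the "Dirty page steal cleaner triggers" counter (mydb is a placeholder):
db2 "get snapshot for bufferpools on mydb" | grep -i "steal"
A steadily rising count suggests agents are being forced to clean pages synchronously.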
17. How can you limit the percentage of dirty pages? (Db2 LUW performance)
It is always good practice to keep the percentage of dirty pages in the buffer pools low, so that transactions find pages available for their operations instead of waiting for page cleaning to happen!
Db2 provides a db cfg parameter called chngpgs_thresh ("changed pages threshold") which helps limit the percentage of dirty pages in the buffer pool.
Page cleaners are triggered when the percentage of dirty pages exceeds this value; the default value of chngpgs_thresh is 60.
- Lower value: more aggressive page cleaning (min: 5)
- Higher value: more delayed page cleaning (max: 99).
Hence use this parameter wisely to avoid a huge percentage of dirty pages, and thereby avoid degrading database performance. A sketch follows.
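A minimal sketch of checking and lowering the threshold (mydb is a placeholder; 40 is only an illustrative, more aggressive value):
db2 "select name, value from sysibmadm.dbcfg where name = 'chngpgs_thresh'"
db2 "update db cfg for mydb using CHNGPGS_THRESH 40"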
18. Why is it critical to estimate buffer pool performance? (Db2 LUW performance)
The buffer pool is the part of database shared memory where pages get placed after being read from disk, before the application processes them!
Buffer pool hit ratios are one of the most fundamental metrics, and give an important overall measure of how effectively the system is exploiting memory to avoid disk I/O.
Hit ratios of 80-85% or better for data and 90-95% or better for indexes are generally considered good for an OLTP environment, and of course these ratios can be calculated for individual buffer pools using data from the buffer pool snapshot.
- Data pages: ((pool_data_lbp_pages_found - pool_async_data_lbp_pages_found) / (pool_data_l_reads + pool_temp_data_l_reads)) × 100
- Index pages: ((pool_index_lbp_pages_found - pool_async_index_lbp_pages_found) / (pool_index_l_reads + pool_temp_index_l_reads)) × 100
- XML storage object (XDA) pages: ((pool_xda_lbp_pages_found - pool_async_xda_lbp_pages_found) / (pool_xda_l_reads + pool_temp_xda_l_reads)) × 100
If the buffer pool is healthy and well utilized, there is less need for physical I/O: hitting the disk to get data is reduced, which is a very good sign for application performance.
On the flip side, if the buffer pool is unhealthy and the hit ratios are poor, applications end up reading from disk instead of memory, which deters performance and is not a good sign for the database! The sketch below computes the data-page ratio directly.
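A minimal sketch that computes the data-page hit ratio per buffer pool from MON_GET_BUFFERPOOL (the CASE guards against division by zero for unused pools):
SELECT VARCHAR(BP_NAME, 20) AS BP_NAME,
       CASE WHEN POOL_DATA_L_READS + POOL_TEMP_DATA_L_READS > 0
            THEN DEC((POOL_DATA_LBP_PAGES_FOUND - POOL_ASYNC_DATA_LBP_PAGES_FOUND) * 100.0
                     / (POOL_DATA_L_READS + POOL_TEMP_DATA_L_READS), 5, 2)
       END AS DATA_HIT_RATIO
FROM TABLE(MON_GET_BUFFERPOOL(NULL, -2));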
19. How does prefetching data using I/O servers happen? (Db2 LUW performance)
As covered in question 14 above, prefetching retrieves pages from disk in the expectation that an application will need them, which reduces I/O wait times, and parallel I/O makes prefetching more efficient. I/O servers are the workers that prefetch data into a buffer pool.
Flow of the steps involved:
- The user application passes the request to the database agent that has been assigned to the user.
- The database agent determines that prefetching should be used to obtain the data required to satisfy the request, and writes a prefetch request to the I/O server queue.
- The first available I/O server reads the prefetch request from the queue and then reads the data from the table space into the buffer pool.
- The number of I/O servers that can simultaneously fetch data from a table space depends on the number of prefetch requests in the queue and the number of I/O servers specified by the num_ioservers database configuration parameter.
- The database agent performs the necessary operations on the data pages in the buffer pool and returns the result to the user application.
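A minimal sketch of checking how much time requests spend waiting on the prefetchers (prefetch_wait_time is a standard wait-time element reported per workload):
db2 "select varchar(workload_name, 24) as workload, prefetch_wait_time from table(mon_get_workload(null, -2))"
If prefetch_wait_time is high, agents are waiting on the prefetchers, and raising num_ioservers (or leaving it AUTOMATIC) may help.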