Not surprisingly, one of the questions I get asked most often asked by customers is how much CPU could they be saving–and how do they pinpoint the not-so-obvious sources of inefficiencies that might be using excess CPU.
I tell them for DB2 applications, executing RUNSTATS is a good place to start.
RUNSTATS is a DB2 utility that captures detailed information about the DB2 objects, tables, indexes, etc. and stores that information in the DB2 catalog. DB2 uses this data when trying to determine the best access path for getting the data requested by the SQL. But if RUNSTATS has not been executed or not executed in a long time, as maybe the case, the Catalog data is probably out of date and the access path chosen will not be efficient. The end result is that DB2 spends more CPU to get the data then it would have otherwise needed to. Thus, a source of excess CPU. Using iStrobe, another DB2 tool or querying the DB2 catalog can tell you when the last time RUNSTATS was executed.
But in all my years of using Strobe and iStrobe, I was never able to actually quantify the effect of updating the Catalog Statistics with RUNSTATS. We have had customers in the past do it, letting me know the job now runs much quicker and save CPU in the process, but we’ve never been able to see a before-and-after Strobe profile to quantify the savings.
As part of an internal project, I was developing a DB2 Stored Procedure for a customer demo application to showcase our Application Performance Management (APM) tools. I created the tables, indexes, SQL statements and uploaded the data. I never thought about RUNSTATS. In fact, I wanted really poorly performing SQL statements. Strobe pointed this out to me when I first measured the application.
From the Strobe data I noticed (see A, above) DB2 was performing a tablespace scan to get the data, which was not my intention. Why was it sequentially going after the data? Was there no index available? As you see above, there is an INDEX available DTDBA.HISTORY_IDX (see B) and the SQL statement contains a predicate using the first column of the index CUSTOMERID (see B). The Strobe Advisor is suggesting that a RUNSTATS and a REORG may be beneficial in this case (see C). What is the result of not doing the RUNSTATS and REORG? The SQL statement executed 6,096 times (see D) with a combined elapsed time of 325 seconds and CPU time of 81 seconds. How much would a REORG and RUNTSTATS save the application?
If there is an INDEX and DB2 knows about it, it is in the catalog. Why doesn’t it use it to acquire the data it needs more efficiently? Scanning through 90,000 to 160,000 records is much more resource-intensive then using the INDEX to get the few records it needs. Unfortunately DB2 does not know how many rows are in the table. The DB2 optimizer, the magic black box that determines or tries to determine the most efficient access path, may be thinking that a tablespace scan will get the data quicker than using an available INDEX. DB2 will do this sometimes and for small tables it can actually be more efficient to do a tablespace scan. For large tables, however, the impact to performance can be crippling, not to mention the excessive CPU used.
After running the JCL that performed a REORG and RUNSTATS on the tablespace (C), I re-bound the affected PACKAGE. This is the most important step; not re-binding the affected package will not force DB2 to re-evaluate the access path.
I re-measured the application after the changes and now my results are much more in line with what I would hope for this application.
You can see above that the SQL Statement executed 16,344 times, had a total elapsed time of 1.08 seconds, a reduction of 324 seconds (for 10,000 more executions!) and used .5 CPU second, a savings of over 80 CPU seconds. Both Strobe measurements were for 10 min and each collected 10,000 samples. Because the SQL executes so much faster, the throughput is much greater in the second profile.
Will you have the same results if you find SQL with no or out-of-date Catalog Statistics? Maybe yes, maybe no. Your “mileage will vary” as they say, but this gave me an idea of the power of telling DB2 what the data looks like. And for the first time I was able to quantify RUNSTATS value. The more DB2 knows about your data, the better choices it can make when picking an access path. To help you identify these situations, the Strobe Advisor and the iStrobe SQL Analysis Feature data will point out pretty clearly when RUNSTATS should be executed.