The paper is an overview of the DTA (Database Tuning Advisor) which was release in 2005. The tool is able to recommend indexes, materialized views and partitioning (usually ranged and aligned - the same partitions are present on tables and indexes). There were several automated physical design tools developed by commercial database vendors, however, the DTA tool seemed to be the most advanced one. For example, it offered fully integrated recommendation, where, for instance, indexes and materialized views were considered simultaneously and not in a one-by-one fashion or so called in the staging manner. These tools consider not only the single objective of improving a general performance of a given workload, but also take into account scalability (the tuning time should be reasonable and do not impose too much overhead on a production server), manageability (the usage of horizontal range partitioning to ensure easy backup/restore, to add new data or remove old data), and functionality (the features provided by the DTA - e.g., scriptability - an XML schema for input/output that enhances scriptability and cusomizability of the tool; and efficient tuning in production/test server scenarios).
Back then, one of the main concern was the scalability. This was addressed in DTA as well by enabling it to scale to large databases and workloads via:
The key word is simulation - do not copy any data for tuning or do not create new physical designs for testing the performance but imitate the structures, data, use sampling effectively and leverage the “what-if” type of analysis as well as query optimizer plans and costing.
The workload is a set of SQL statements that execute against the database server. A workload can be obtained by using SQL Server Profiler, a tool for logging events that execute on a server. In the current version of DTA, the Query Store can be used to collect the information about the workload for the tuning purposes (and it retains the information between server restarts): How To Tune a Workload from Query Store in dta.exe command line Utility?
If we are limited to only k indexes, but our tuning recommends many more, than we have to narrow down the selection. It can be done by first choosing a small number \(m\) of indexes, this is the best possible configuration of m indexes. Then, we add a single index at a time, choosing the one that decrease the overall workload cost the most, in a greedy fashion, until we reach the configuration with the required \(k\) indexes.
Copy only the metadata about the databases from the production to the test server. If some statistics for an index are missing, then create them on the production server (this is the biggest overhead). The main idea is to simulate as much as possible and reduce the load on the production server by running the tuning on the test server. Moreover, the production environment can be simulate on the test server, by specifying a specific hardware (number of CPUs, memory available) for the tuning on the test server.
Overall, the experiments are comprehensive. It is worth mentioning that the DTA does a better job than some DBAs. For example, for workload from one of the customers, the hand-tuned design was worse than a baseline (only indexes to enforce primary/uniquer keys) due to presence of updates. For this workload, DTA correctly recommended no additional physical design structures. For the TPC-H benchmark with 10GB of the input data, the size of the raw data in database was 12.8 GB. After the tuning, the expected performance improvement based on the costing from the query optimizer should increase by 88% but the decrease in the total workload execution time was 83%. The total storage space alloted was three times the raw data size. The compression of the workload (based on the templates and the signatures) provided even 40X improvement in tuning time for workloads with many query templates with a small degradation in the quality (0.5% or 1%). However, for the workload without any templates, such as TPC-H, the reduction in the tuning time was 1% (with, of course, no decrease in the quality - perhaps the number of extracted templates was equal to the total number of queries in the TPC-H benchmark, which is 22).