First, even today the tools for analyzing query performance in databases are not perfect. Consider this comment from https://goo.gl/6xxnS6:
A wholesale recommendation is not very helpful. It is often suggested to create N indexes to achieve an X% improvement. Of these N indexes, almost always only 1 or 2 contribute the most; the others burden the database without much gain. I suggest adding a column showing the contribution of each index to the X% improvement. Doing this today takes a long time, because you need to re-evaluate each suggested index one by one.
There was an effort in 1998 to provide comprehensive visualisations of the workload, configurations, costs, and index usage statistics.
Users are provided with a specialized interface to query the hypothetical designs; in principle, however, they can also write their own SQL queries against the analysis data tables.
An interesting side note is a short description of a sampling strategy (described in detail in another paper), explained well enough here to be worth mentioning.
An adaptive page-level sampling algorithm is used to efficiently gather statistical measures relevant to query optimization. Samples of size \(m = \sqrt{n}\) are drawn incrementally, where \(n\) is the number of pages in the table. The samples are stored in a sample-table, in sorted order. The statistical measures include, for example, equi-depth histograms. After the first sample is drawn and the statistics gathered, we draw another sample and check whether the new sample has “similar” statistical characteristics to the previously sampled data stored in the sample-table (at this point, the table contains just a single sample of size \(m\)). To be precise, we check whether the values in the new sample fall in approximately equal numbers into each bin of the equi-depth histogram. If the statistics are similar within a certain tolerance, we finish the algorithm. Otherwise, the last sample is merged into the sample-table and the statistics are updated. We repeat the process until agreement in statistics between the sample-table and a newly drawn sample is reached, or until we have drawn some specified maximum number of samples.
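The loop above can be sketched in a few lines of Python. This is a minimal sketch under my own assumptions: the function names, the 50% per-bin tolerance, the round limit, and the use of plain values in place of real page-level samples are all illustrative choices, not details from the paper.

```python
import math
import random

def equi_depth_histogram(values, bins):
    """Equi-depth histogram: boundaries chosen so each bin holds
    roughly the same number of the observed values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[min(n - 1, (i * n) // bins)] for i in range(1, bins)]

def bin_counts(values, boundaries):
    """Count how many values fall into each histogram bin."""
    counts = [0] * (len(boundaries) + 1)
    for v in values:
        i = 0
        while i < len(boundaries) and v > boundaries[i]:
            i += 1
        counts[i] += 1
    return counts

def adaptive_sample(pages, bins=4, tolerance=0.5, max_rounds=10, seed=0):
    """Draw sqrt(n)-sized samples until a fresh sample spreads
    (approximately) evenly over the equi-depth histogram of the data
    sampled so far, or until max_rounds samples have been drawn."""
    rng = random.Random(seed)
    m = max(1, math.isqrt(len(pages)))
    sample_table = sorted(rng.sample(pages, m))   # first sample
    for _ in range(max_rounds):
        boundaries = equi_depth_histogram(sample_table, bins)
        fresh = rng.sample(pages, m)              # draw another sample
        expected = m / (len(boundaries) + 1)      # equal split per bin
        if all(abs(c - expected) <= tolerance * expected
               for c in bin_counts(fresh, boundaries)):
            return sample_table                   # statistics agree; stop
        sample_table = sorted(sample_table + fresh)  # merge and retry
    return sample_table
```

On uniform data the agreement check tends to pass after the first round or two; on skewed data the sample-table keeps growing until the histogram stabilizes or the round limit is hit.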
The index analysis utility is indispensable, and this is what this paper addresses: analyzing the impact of current or hypothetical indexes on the performance of your workload (a set of queries). The approach is based on relative estimation of cost, which enables a large class of analyses at low cost. The authors implemented a tool that uses hypothetical indexes to predict workload cost; this was achieved with sampling-based techniques.
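One way to read “relative estimation” is that the cost of a whole configuration is derived from the costs of a few small evaluations rather than re-optimizing every query against every configuration. The sketch below is my own simplified interpretation, not the paper’s exact method: per query, we take the cheapest of the base cost and the costs observed with each single hypothetical index; all names and numbers are illustrative.

```python
def workload_cost(queries, config, atomic_costs, base_costs):
    """Approximate workload cost under a hypothetical index
    configuration from per-(query, index) 'atomic' evaluations:
    each query is charged its cheapest available plan."""
    total = 0.0
    for q in queries:
        candidates = [base_costs[q]]  # cost with no extra index
        candidates += [atomic_costs[(q, idx)]
                       for idx in config if (q, idx) in atomic_costs]
        total += min(candidates)
    return total
```

This keeps the number of expensive optimizer calls linear in the number of (query, index) pairs, while still letting us score exponentially many index configurations.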
Thus, the most useful command provided is: Estimate Configuration (set of indexes) of