Defragmentation Can Degrade Query Performance

Datetime:2016-08-23 02:15:29          Topic: SQL  Cluster Analysis           Share

As data in tables is updated and / or deleted, pockets of empty space can be created, leading to fragmented free space. This free space is wasted space as it is insufficient to hold new data. Moreover, it can impact performance of the database as a higher number of sparsely-populated blocks need to be visited. Oracle 11g introduced online segment shrink functionality to defragment and reclaim this space. During this operation, the database:

  • Compacts the segment: Fragmented free space is consolidated by moving the rows to new locations to create empty blocks near the High Water Mark (HWM – a measurement of the maximum number of database blocks a segment has used so far).
  • Adjusts the high water mark so that new free space is available above the HWM. That free space is then deallocated and is made available for use by other segments.
  • Maintains the indexes so that they remain usable after the operation is complete.

Shrinking a sparsely-populated segment improves the performance of Full Table Scans because there are fewer blocks below the HWM. But performance of queries doing Index scans might degrade since the clustering factor of the index can increase as a result of row movement in the table. It is a misconception that rebuilding of index can improve its clustering factor. To improve the index clustering factor, data in the table needs to be reorganized so that rows in the table are in the same order on disk as the index keys.

In this article, I will demonstrate:

  • Compacting of a fragmented table / index results in:
    • Movement rows across data blocks
    • Increased clustering factor of the index
  • Shrinking of a table causes:
    • Adjustment of HWM of table
    • Release of free space above the HWM
  • Shrinking of the index coalesces the space freed on deleting the rows
  • Increased clustering factor of the index causes degradation of queries performing Index Full Scan
  • Rebuilding of the index:
    • Does not affect its clustering factor
    • Improves performance of index access
  • Reorganizing the data in table improves:
    • The clustering factor of the index
    • Performance of table access via the index

Demonstration

  • Create and populate a table called organized, with 400 distinct ID s and 7 rows per ID .
SQL> drop table hr.organized purge; 
     create table hr.organized (id number, txt char(900));

           begin
           for i in 1..400 loop
               insert into hr.organized select i, lpad('x', 900, 'x')
               from    dba_objects where rownum < 8;
           end loop;
          end;
          /
  • Verify that all seven rows for each ID are located in the same block and that each block contains records belonging to one ID only.
SQL>select distinct id, dbms_rowid.ROWID_BLOCK_NUMBER(rowid), count(*)
     from hr.organized
     group by id, dbms_rowid.ROWID_BLOCK_NUMBER(rowid)
     order by id;

        ID DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)   COUNT(*)
---------- ------------------------------------ ----------
         1                                 5262          7
         2                                 5263          7
         3                                 5259          7
         4                                 5260          7
         5                                 5261          7
         .
         .
       395                                10365          7
       396                                10306          7
       397                                10310          7
       398                                10314          7
       399                                10318          7
       400                                10322          7

400 rows selected.
  • Create an index on the ID column and gather statistics for the table and index.
SQL> exec dbms_stats.gather_table_stats('HR', 'organized', estimate_percent => 100, method_opt=> 'for all indexed columns size 254', cascade => true);
  • Find out the HWM of the table:
SQL> select TABLE_NAME, blocks "Ever Used", empty_blocks "Never Used",
            num_rows "Total rows"
      from dba_tables 
      where table_name='ORGANIZED';

TABLE_NAME                      Ever Used Never Used Total rows
------------------------------ ---------- ---------- ----------
ORGANIZED                             496          0       2800
  • From index statistics verify that:
    • Clustering factor = Number of distinct IDs (400), as all the records for an ID are placed in the same block. Hence while accessing all the records of the table via index, blocks have to be switched 400 times.
    • Information about 2800 rows having 400 distinct keys is spread across 6 leaf blocks.
SQL>SELECT Blevel, LEAF_BLOCKS, DISTINCT_KEYS, NUM_ROWS, CLUSTERING_FACTOR
FROM DBA_INDEXES
WHERE index_name = 'ORGANIZED_IDX';

     BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS CLUSTERING_FACTOR
---------- ----------- ------------- ---------- -----------------
         1           6           400       2800               400
  • Verify that there are no deleted rows in leaf blocks:
SQL> Analyze index hr.organized_idx validate structure;

     select name, blocks, lf_blks, LF_ROWS, del_lf_rows, 
            (del_lf_rows_len/lf_rows_len)*100 as wastage
     from index_stats
     WHERE name = 'ORGANIZED_IDX';

NAME                BLOCKS    LF_BLKS    LF_ROWS DEL_LF_ROWS    WASTAGE
--------------- ---------- ---------- ---------- ----------- ----------
ORGANIZED_IDX           16          6       2800           0          0
  • Delete one row for every ID to introduce fragmentation:
SQL>    begin
           for i in 1..400 loop
               delete from hr.organized where rowid = 
                    (select min(rowid) from hr.organized where id = i);
           end loop;
          end;
          /
  • Verify that after one record for every ID has been deleted, each block now contains 6 records for each key, thereby leaving free space for one record.
SQL> select count(*) from hr.organized;

  COUNT(*)
----------
      2400

SQL> select distinct id, dbms_rowid.ROWID_BLOCK_NUMBER(rowid), count(*)
     from hr.organized
     group by id, dbms_rowid.ROWID_BLOCK_NUMBER(rowid)
     order by id;
 
        ID DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)   COUNT(*)
---------- ------------------------------------ ----------
         1                                 5262          6
         2                                 5263          6
         3                                 5259          6
         4                                 5260          6
         5                                 5261          6
         .
         .
       395                                10365          6
       396                                10306          6
       397                                10310          6
       398                                10314          6
       399                                10318          6
       400                                10322          6

400 rows selected.
  • Gather statistics for the table and index:
SQL> exec dbms_stats.gather_table_stats('HR', 'organized', estimate_percent => 100, method_opt=> 'for all indexed columns size 254', cascade => true);
  • Verify that the HWM for the table remains unchanged:
SQL> select TABLE_NAME, blocks "Ever Used", empty_blocks "Never Used",
            num_rows "Total rows"
      from dba_tables 
      where table_name='ORGANIZED';

TABLE_NAME                      Ever Used Never Used Total rows
------------------------------ ---------- ---------- ----------
ORGANIZED                             496          0       2400
  • From index statistics verify that:
    • Clustering factor (= number of distinct keys (400)) remains same as earlier as all 6 rows for an ID are still placed in the same block. Hence while accessing all the records of the table via the index, blocks have to switch 400 times.
    • Information about 2400 (instead of 2800 earlier) rows having 400 distinct keys is spread across 6 leaf blocks.
SQL>SELECT index_name, status, Blevel, LEAF_BLOCKS, DISTINCT_KEYS, NUM_ROWS, 
           CLUSTERING_FACTOR
    FROM DBA_INDEXES WHERE index_name = 'ORGANIZED_IDX';

INDEX_NAME     STATUS       BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS CLUSTERING_FACTOR
------------- -------- ---------- ----------- ------------- ---------- -----------------
ORGANIZED_IDX   VALID             1           6           400       2400              400
  • Analyze the index and verify that:
    • There are 400 deleted rows in leaf blocks leading to around 14% space wastage.
    • There are 2800 leaf rows as earlier (400 deleted rows are not reflected as they are still occupying space).
SQL> Analyze index hr.organized_idx validate structure;

    select name, blocks, lf_blks, LF_ROWS, del_lf_rows, 
           (del_lf_rows_len/lf_rows_len)*100 as wastage
     from index_stats
    WHERE name = 'ORGANIZED_IDX';

NAME                BLOCKS    LF_BLKS    LF_ROWS DEL_LF_ROWS    WASTAGE
--------------- ---------- ---------- ---------- ----------- ----------
ORGANIZED_IDX           16          6       2800         400 14.2857143
  • Execute a query which performs Index Full Scan. Note that:
    • There is a cost of 7 for using the index for the ORGANIZED table, i.e. the query will hit one root block (1) and the 6 leaf blocks (6).
    • The query will be doing 400 more IOs against the table (equal to the clustering factor), because the rows needed are all next to each other on 400 database blocks.
    • Total cost of query = Cost of Index access (7) + Cost of Table access (400) = 407.
SQL>set autotrace traceonly explain
          select /*+ index(o organized_idx) */ count(txt)
          from  hr.organized o where id=id;
          set autotrace off
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2296712572

----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |     1 |   905 |   407   (0)| 00:00:05 |
|   1 |  SORT AGGREGATE              |               |     1 |   905 |           |           |
|   2 |   TABLE ACCESS BY INDEX ROWID| ORGANIZED     |  2400 |  2121K|   407   (0)| 00:00:05 |
|*  3 |    INDEX FULL SCAN           | ORGANIZED_IDX |  2400 |       |     7   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("ID" IS NOT NULL)
  • Let’s compact only the table and index by specifying the SHRINK SPACE COMPACT clause.
SQL> alter table hr.organized enable row movement
     alter table hr.organized shrink space compact cascade;
  • Let us see how the data has been reorganized in the table after compaction. It can be seen that the rows have been moved in order to fill up the vacant space for one record in each block, so that each block now contains 6 records for one ID and one record of another ID.
SQL>select distinct dbms_rowid.ROWID_BLOCK_NUMBER(rowid) Block_no, id,count(*)
     from hr.organized
     group by id, dbms_rowid.ROWID_BLOCK_NUMBER(rowid)
     order by dbms_rowid.ROWID_BLOCK_NUMBER(rowid);
 
  BLOCK_NO         ID   COUNT(*)
---------- ---------- ----------
      5259          3          6
      5259        395          1
      5260          4          6
      5260        395          1
      5261          5          6
      5261        395          1
            .
            .
     10209        263          1
     10210        263          1
     10210        278          6
     10211        263          1
     10211        294          6
     10212        247          6

685 rows selected.
  • As a result, records for 57 IDs have been scattered over 6 blocks, with each block having one row of that ID whereas all the 6 rows of the remaining 343 IDs are contained within the same block.
SQL> select count(*) Num_Ids,  org.cnt spread_across_blocks 
         from
            (select id, count(distinct(dbms_rowid.ROWID_BLOCK_NUMBER(rowid))) cnt
              from hr.organized
              group by id)org 
        group by org.cnt;   

   NUM_IDS SPREAD_ACROSS_BLOCKS
---------- --------------------
       343                    1
        57                    6
  • Gather statistics for the table and index:
SQL> exec dbms_stats.gather_table_stats('HR', 'organized', estimate_percent => 100, -method_opt=> 'for all indexed columns size 254', cascade => true);
  • Verify that HWM for the table remains the same, since only compaction has taken place:
SQL> select TABLE_NAME, blocks "Ever Used", empty_blocks "Never Used",
            num_rows "Total rows"
      from dba_tables 
      where table_name='ORGANIZED';

TABLE_NAME                      Ever Used Never Used Total rows
------------------------------ ---------- ---------- ----------
ORGANIZED                             496          0       2400
  • From index statistics, verify that:
    • Clustering factor has increased from 400 to 685
      • All 6 rows for each of the 343 IDs are in the same block – 343 table block switches are needed to access 343 IDs
      • Rows for each of the 57 IDs are spread across 6 blocks, i.e. 57*6 = 342 table block switches are needed to access these 57 IDs
      • To access all the 400 IDs total block switches = 343 + 342 = 685
    • Index has been maintained (status = valid)
SQL> SELECT INDEX_NAME, Status, Blevel, LEAF_BLOCKS, DISTINCT_KEYS, NUM_ROWS, 
            CLUSTERING_FACTOR
    FROM DBA_INDEXES
    WHERE index_name = 'ORGANIZED_IDX';

INDEX_NAME      STATUS       BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS CLUSTERING_FACTOR
--------------- -------- ---------- ----------- ------------- ---------- -----------------
ORGANIZED_IDX   VALID             1           6           400       2400              685
  • Analyze the index and verify that after compacting the index, the space freed on deleting the rows has been coalesced and entries for deleted rows have been removed:
    • DEL_LF_ROWS = 0 (400 earlier)
    • LF_ROWS = 2400 (2800 earlier)
    • WASTAGE = 0 (14% earlier)
SQL> Analyze index hr.organized_idx validate structure;

     col name for a15
    select name, blocks, lf_blks, LF_ROWS, del_lf_rows, (del_lf_rows_len/lf_rows_len)*100 as wastage
     from index_stats
    WHERE name = 'ORGANIZED_IDX';


NAME                BLOCKS    LF_BLKS    LF_ROWS DEL_LF_ROWS    WASTAGE
--------------- ---------- ---------- ---------- ----------- ----------
ORGANIZED_IDX           16          6       2400           0          0
  • Let us execute the query again and check execution statistics. Note that:
    • IOs to index remain same as earlier, i.e. 7
    • IOs to table have increased from 400 to 685 (equaling the clustering factor of the index)
    • Total cost has increased from 407 to 692
SQL>set autotrace traceonly explain
          select /*+ index(o organized_idx) */ count(txt)
          from  hr.organized o where id=id;
          set autotrace off

Execution Plan
----------------------------------------------------------
Plan hash value: 2296712572

----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |     1 |   905 |   692   (0)| 00:00:09 |
|   1 |  SORT AGGREGATE              |               |     1 |   905 |           |           |
|   2 |   TABLE ACCESS BY INDEX ROWID| ORGANIZED     |  2400 |  2121K|   692   (0)| 00:00:09 |
|*  3 |    INDEX FULL SCAN           | ORGANIZED_IDX |  2400 |       |     7   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("ID" IS NOT NULL)
  • Hence it can be deduced that compacting the table and index:
    • Does not move the HWM of the table
    • Coalesces the space freed by deleted rows in the index
    • Can increase CF of the index
    • Can increase the cost of accessing the table via the index
  • Now let’s shrink the table so that the HWM of the table moves from 496 blocks to 343 blocks:
SQL>alter table hr.organized shrink space cascade;

SQL> exec dbms_stats.gather_table_stats('HR', 'organized', estimate_percent => 100, -method_opt=> 'for all indexed columns size 254', cascade => true);

SQL> select TABLE_NAME, blocks "Ever Used", empty_blocks "Never Used",
            num_rows "Total rows"
      from dba_tables 
      where table_name='ORGANIZED';

TABLE_NAME                      Ever Used Never Used Total rows
------------------------------ ---------- ---------- ----------
ORGANIZED                             343          0       2400
  • Note that index status remains valid and that there is no change to clustering factor or other index statistics:
SQL>SELECT INDEX_NAME, Status, Blevel, LEAF_BLOCKS, DISTINCT_KEYS, NUM_ROWS, 
           CLUSTERING_FACTOR
    FROM DBA_INDEXES
    WHERE index_name = 'ORGANIZED_IDX';

INDEX_NAME      STATUS       BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS CLUSTERING_FACTOR
--------------- -------- ---------- ----------- ------------- ---------- -----------------
ORGANIZED_IDX   VALID             1           6           400       2400              685
SQL> Analyze index hr.organized_idx validate structure;
     select name, blocks, lf_blks, LF_ROWS, del_lf_rows, 
            (del_lf_rows_len/lf_rows_len)*100 as wastage
     from index_stats
     WHERE name = 'ORGANIZED_IDX';

NAME                BLOCKS    LF_BLKS    LF_ROWS DEL_LF_ROWS    WASTAGE
--------------- ---------- ---------- ---------- ----------- ----------
ORGANIZED_IDX           16          6       2400           0          0
  • Let’s execute the query again and check execution statistics. Note that:
    • IOs to index remain the same as earlier, i.e. 7
    • IOs to table remain the same as earlier, i.e. 685 (= clustering factor of the index)
    • Total cost remain the same as earlier, i.e. 692
SQL>set autotrace traceonly explain
          select /*+ index(o organized_idx) */ count(txt)
          from  hr.organized o where id=id;
          set autotrace off
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2296712572

----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)| Time      
|
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |     1 |   905 |   692   (0)| 00:00:09 |
|   1 |  SORT AGGREGATE              |               |     1 |   905 |           |       |
|   2 |   TABLE ACCESS BY INDEX ROWID| ORGANIZED     |  2400 |  2121K|   692   (0)| 00:00:09 |
|*  3 |    INDEX FULL SCAN           | ORGANIZED_IDX |  2400 |       |     7   (0)| 00:00:01  ----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("ID" IS NOT NULL)
  • Let’s rebuild the index and verify that :
    • Clustering factor remains same as earlier, i.e. 685
    • Number of leaf blocks have dropped from 6 to 5
SQL>Alter index hr.organized_idx rebuild;

SQL> exec dbms_stats.gather_index_stats('HR', 'organized_idx');

SQL> col index_name for a15
     SELECT INDEX_NAME, Status, Blevel, LEAF_BLOCKS, DISTINCT_KEYS, NUM_ROWS,
            CLUSTERING_FACTOR
    FROM DBA_INDEXES
    WHERE index_name = 'ORGANIZED_IDX';

INDEX_NAME      STATUS       BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS CLUSTERING_FACTOR
--------------- -------- ---------- ----------- ------------- ---------- -----------------
ORGANIZED_IDX   VALID             1           5           400       2400              685

SQL> Analyze index hr.organized_idx validate structure;

     select name, blocks, lf_blks, LF_ROWS, del_lf_rows, 
            (del_lf_rows_len/lf_rows_len)*100 as wastage
     from index_stats
     WHERE name = 'ORGANIZED_IDX';

NAME                BLOCKS    LF_BLKS    LF_ROWS DEL_LF_ROWS    WASTAGE
--------------- ---------- ---------- ---------- ----------- ----------
ORGANIZED_IDX           16          5       2400           0          0
  • Let’s execute the query again and check execution statistics. Note that:
    • IOs to index have dropped from 7 to 6 as the number of leaf blocks has decreased by 1
    • IOs to table remain same as earlier, i.e. 685 (= clustering factor of the index)
    • Total cost drops by 1, i.e. 692 to 691
SQL>set autotrace traceonly explain
          select /*+ index(o organized_idx) */ count(txt)
          from  hr.organized o where id=id;
          set autotrace off
 

Execution Plan
----------------------------------------------------------
Plan hash value: 2296712572

----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)| Time|
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |     1 |   905 |   691   (0)| 00:00:09 |
|   1 |  SORT AGGREGATE              |               |     1 |   905 |           |            |
|   2 |   TABLE ACCESS BY INDEX ROWID| ORGANIZED     |  2400 |  2121K|   691   (0)| 00:00:09 |
|*  3 |    INDEX FULL SCAN           | ORGANIZED_IDX |  2400 |       |     6   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("ID" IS NOT NULL)

Hence we can say that rebuilding the index:

  • Does not alter the clustering factor of the index.
  • May decrease the number of leaf blocks in the index, thereby reducing the cost of index access.
  • In order to improve clustering factor of the index and hence cost of table access, let us sort the data in the in the same order on disk as the index keys. Note that it causes clustering factor for the index to improve from 685 to 343.
SQL>create table hr.torganized as select * from hr.organized order by id;
    drop table hr.organized purge;
    conn hr/hr
    alter table torganized rename to organized;
    create index hr.organized_idx on hr.organized(id);

    exec dbms_stats.gather_table_stats('HR', 'organized', estimate_percent => 100, method_opt=> 'for all indexed columns size 254', cascade => true);

     SELECT INDEX_NAME, Status, Blevel, LEAF_BLOCKS, DISTINCT_KEYS, NUM_ROWS,
            CLUSTERING_FACTOR
    FROM USER_INDEXES
    WHERE index_name = 'ORGANIZED_IDX';

INDEX_NAME      STATUS       BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS CLUSTERING_FACTOR
--------------- -------- ---------- ----------- ------------- ---------- -----------------
ORGANIZED_IDX   VALID             1           5           400       2400              343
  • Let’s execute the query again and check execution statistics. Note that:
    • IOs to index remain the same as earlier, i.e. 6
    • IOs to table drop from 685 to 343 (equalling improved clustering factor of the index)
    • Total cost significantly drops from 692 to 349
SQL>set autotrace traceonly explain
          select /*+ index(o organized_idx) */ count(txt)
          from  hr.organized o where id=id;
          set autotrace off
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2296712572

----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)| Time|
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |     1 |   905 |   349   (0)| 00:00:05|
|   1 |  SORT AGGREGATE              |               |     1 |   905 |            |         |
|   2 |   TABLE ACCESS BY INDEX ROWID| ORGANIZED     |  2400 |  2121K|   349   (0)| 00:00:05|
|*  3 |    INDEX FULL SCAN           | ORGANIZED_IDX |  2400 |       |     6   (0)| 00:00:01|
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("ID" IS NOT NULL)

Summary

  1. Compacting of a fragmented table / index:
    • Does not move the HWM of the table
    • Causes deleted rows to be removed from the index
    • Can increase CF of the index and hence the cost of accessing the table via the index may increase
  2. Shrinking of a table causes:
    • Adjustment of the HWM of table
    • Release of free space above the HWM
  3. Shrinking of an index coalesces the space freed upon deleting the rows
  4. Rebuilding of index:
    • Does not alter its clustering factor
    • May decrease the number of leaf blocks in the index thereby reducing cost of index access.
  5. After shrinking the table and indexes, to take advantage in terms of performance, you should:
    • Rebuild the indexes
    • Sort the data in the table in the order of the index key to improve the index clustering factor.

References

http://www.oracle.com/technetwork/issue-archive/2012/12-sep/o52asktom-1735913.html

https://richardfoote.wordpress.com/2011/09/25/rebuilding-indexes-and-the-clustering-factor-solution-move-on/

Tags:,,

,





About List