思维导图
07系列文章
Oracle优化07-分析及动态采样-直方图
Oracle优化07-分析及动态采样-DBMS_STATS 包
Oracle优化07-分析及动态采样-动态采样
DBMS_STATS包
DBMS_STAS 包不仅能够对表进行分析,它还可以对数据库分析进行管理。
按照功能可以分一下几类:
( 1) 性能数据的收集
( 2) 性能数据的设置
( 3) 性能数据的删除
( 4) 性能数据的备份和恢
11G 官方文档:
http://docs.oracle.com/cd/E11882_01/appdev.112/e40758/d_stats.htm#ARPLS059
性能数据的收集包含这样几个存储过程:
- GATHER_DATABASE_STATS Procedures
- GATHER_DICTIONARY_STATS Procedure
- GATHER_FIXED_OBJECTS_STATS Procedure
- GATHER_INDEX_STATS Procedure
- GATHER_SCHEMA_STATS Procedures
- GATHER_SYSTEM_STATS Procedures
- GATHER_TABLE_STATS Procedure
从名字也可以看出各自的作用,这些存储过程用来收集数据库不同级别对象的性能数据,包括:数据库,数据字典,表,索引,SCHEMA 的性能等。
GATHER_TABLE_STATS 收集信息功能
我们分析时最常用到的就是GATHER_TABLE_STATS 了。
在11gR2版本中dbms_stats包下的 gather_table_stats存过定义如下
我们可以在注释中看到 This procedure gathers table and column (and index) statistics.
procedure gather_table_stats (ownname varchar2,tabname varchar2,partname varchar2 default null,estimate_percent number default DEFAULT_ESTIMATE_PERCENT,block_sample boolean default FALSE,method_opt varchar2 default DEFAULT_METHOD_OPT,degree number default to_degree_type(get_param('DEGREE')),granularity varchar2 default DEFAULT_GRANULARITY,cascade boolean default DEFAULT_CASCADE,stattab varchar2 default null,statid varchar2 default null,statown varchar2 default null,no_invalidate boolean default to_no_invalidate_type(get_param('NO_INVALIDATE')),stattype varchar2 default 'DATA',force boolean default FALSE,-- the context is intended for internal use only. context dbms_stats.CContext default null); -- -- This procedure gathers table and column (and index) statistics. -- It attempts to parallelize as much of the work as possible,but there -- are some restrictions as described in the individual parameters. -- This operation will not parallelize if the user does not have select -- privilege on the table being analyzed. -- -- Input arguments: -- ownname - schema of table to analyze -- tabname - name of table -- partname - name of partition -- estimate_percent - Percentage of rows to estimate (NULL means compute). -- The valid range is [0.000001,100]. Use the constant -- DBMS_STATS.AUTO_SAMPLE_SIZE to have Oracle determine the -- appropriate sample size for good statistics. This is the default. -- The default value can be changed using set_param procedure. -- block_sample - whether or not to use random block sampling instead of -- random row sampling. Random block sampling is more efficient,but -- if the data is not randomly distributed on disk then the sample values -- may be somewhat correlated. Only pertinent when doing an estimate -- statistics. -- method_opt - method options of the following format -- -- method_opt := FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause] -- FOR COLUMNS [size_clause] -- column|attribute [size_clause] -- [,column|attribute [size_clause] ... ] -- -- size_clause := SIZE [integer | auto | skewonly | repeat],-- where integer is between 1 and 254 -- -- column := column name | extension name | extension -- -- default is FOR ALL COLUMNS SIZE AUTO. -- The default value can be changed using set_param procedure. -- Optimizer related table statistics are always gathered. -- -- If an extension is provided,the procedure create the extension if it -- does not exist already. Please refer to create_extended_stats for -- description of extension. -- -- degree - degree of parallelism (NULL means use of table default value -- which was specified by DEGREE clause in CREATE/ALTER TABLE statement) -- Use the constant DBMS_STATS.DEFAULT_DEGREE for the default value -- based on the initialization parameters. -- default for degree is NULL. -- The default value can be changed using set_param procedure. -- granularity - the granularity of statistics to collect (only pertinent -- if the table is partitioned) -- 'AUTO' - the procedure determines what level of statistics to collect -- 'GLOBAL AND PARTITION' - gather global- and partition-level statistics -- 'APPROX_GLOBAL AND PARTITION' - This option is similar to -- 'GLOBAL AND PARTITION'. But the global statistics are aggregated -- from partition level statistics. It will aggregate all statistics except number of -- distinct values for columns and number of distinct keys of indexes. -- The existing histograms of the columns at the table level -- are also aggregated.The global statistics are gathered -- (i.e.,going back to GLOBAL AND PARTITION behaviour) -- if partname argument is null. The aggregation will use only -- partitions with statistics,so to get accurate global statistics,-- user has to make sure to have statistics for all partitions. -- -- -- This option is useful when you collect statistics for a new partition added -- into a range partitioned table (for example,a table partitioned by month). -- The new data in the partition makes the global statistics stale (especially -- the min/max values of the partitioning column). This stale global statistics -- may cause suboptimal plans. In this scenario,users can collect statistics -- for the newly added partition with 'APPROX_GLOBAL AND PARTITION' -- option so that the global statistics will reflect the newly added range. -- This option will take less time than 'GLOBAL AND PARTITION' option since the -- global statistics are aggregated from underlying partition level statistics. -- Note that,if you are using APPROX_GLOBAL AND PARTITION,-- you still need to collect global statistics (with granularity = 'GLOBAL' option) -- when there is substantial amount of change at the table level. -- For example you added 10% more data to the table. This is needed to get the -- correct number of distinct values/keys statistic at table level. -- 'SUBPARTITION' - gather subpartition-level statistics -- 'PARTITION' - gather partition-level statistics -- 'GLOBAL' - gather global statistics -- 'ALL' - gather all (subpartition,partition,and global) statistics -- default for granularity is AUTO. -- The default value can be changed using set_param procedure. -- cascade - gather statistics on the indexes for this table. -- Use the constant DBMS_STATS.AUTO_CASCADE to have Oracle determine -- whether index stats to be collected or not. This is the default. -- The default value can be changed using set_param procedure. -- Using this option is equivalent to running the gather_index_stats -- procedure on each of the table's indexes. -- stattab - The user stat table identifier describing where to save -- the current statistics. -- statid - The (optional) identifier to associate with these statistics -- within stattab. -- statown - The schema containing stattab (if different then ownname) -- no_invalidate - Do not invalide the dependent cursors if set to TRUE. -- The procedure invalidates the dependent cursors immediately -- if set to FALSE. -- Use DBMS_STATS.AUTO_INVALIDATE to have oracle decide when to -- invalidate dependend cursors. This is the default. The default -- can be changed using set_param procedure. -- When the 'cascade' argument is specified,not pertinent with certain -- types of indexes described in the gather_index_stats section. -- force - gather statistics of table even if it is locked. -- context - internal use only. -- -- Exceptions: -- ORA-20000: Table does not exist or insufficient privileges -- ORA-20001: Bad input value -- ORA-20002: Bad user statistics table,may need to upgrade it -- ORA-20005: object statistics are locked --
gather_table_STATS使用
在 gather_table_stats 存储过程的所有参数中,除了 ownname 和 tabname,其他的参数都有默认值。
所以我们在调用这个存储过程时, Oracle 会使用参数的默认值对表进行分析。
如:
SQL> exec dbms_stats.gather_table_STATS(user,'T'); PL/SQL procedure successfully completed
oracel就会对当前用户下的T表按照所有参数的默认值进行分析,其中user是一个变量,用来返回当前的用户信息。
当然你也可以指定用户名,比如:
SQL> exec dbms_stats.gather_table_STATS('cc','xgj'); PL/SQL procedure successfully completed
对cc用户下的xgj表进行分析。
查看gather_table_STATS参数当前的默认值
如果想查看当前的默认值,可以使用 dbms_stats.get_param 函数来获取:
比如查看method_opt的默认值:
SQL> select dbms_stats.get_param('method_opt') from dual;
参数说明
参数的说明:
estimate_percent 参数
这个参数是一个百分比值,它告诉分析包需要使用表中数据的多大比例来做分析。
理论上来讲,采样的数据越多,得到的信息就越接近于实际, CBO做出的执行计划就越优化,但是,采样越多,消耗的系统资源必然越多。 对系统的影响也越大。
所以对于这个值的设置,要根据业务情况来。
如果数据的直方图分布比较均匀,就可以使用默认值: AUTO_SAMPLE_SIZE,即让 Oracle 自己来判断采样的比例。
有时,特别是对于批量加载的表,我们可以预估表中的数据量,可以人工地设置一个合理的值。 一般,对于一个有 1000 万数据的表分区,可以把这个参数设置为 0.000001.
Method_option 参数