Determining The Query Optimizer In Use

When GPORCA is enabled, you can determine if HAWQ is using GPORCA or is falling back to the legacy query optimizer.

These are two ways to determine which query optimizer HAWQ used to execute the query:

  • Examine EXPLAIN query plan output for the query. (Your output may include other settings.)

    • When GPORCA generates the query plan, the GPORCA version is displayed near the end of the query plan . For example.

       Settings:  optimizer=on
       Optimizer status:  PQO version 1.627
      

      When HAWQ falls back to the legacy optimizer to generate the plan, legacy query optimizer is displayed near the end of the query plan. For example.

       Settings:  optimizer=on
       Optimizer status: legacy query optimizer
      

      When the server configuration parameter OPTIMIZER is off, the following lines are displayed near the end of a query plan.

       Settings:  optimizer=off
       Optimizer status: legacy query optimizer
      
    • These plan items appear only in the EXPLAIN plan output generated by GPORCA. The items are not supported in a legacy optimizer query plan.

      • Assert operator
      • Sequence operator
      • DynamicIndexScan
      • DynamicTableScan
      • Table Scan
    • When a query against a partitioned table is generated by GPORCA, the EXPLAIN plan displays only the number of partitions that are being eliminated is listed. The scanned partitions are not shown. The EXPLAIN plan generated by the legacy optimizer lists the scanned partitions.

  • View the log messages in the HAWQ log file.

    The log file contains messages that indicate which query optimizer was used. In the log file message, the [OPT] flag appears when GPORCA attempts to optimize a query. If HAWQ falls back to the legacy optimizer, an error message is added to the log file, indicating the unsupported feature. Also, in the message, the label Planner produced plan: appears before the query when HAWQ falls back to the legacy optimizer.

    Note: You can configure HAWQ to display log messages on the psql command line by setting the HAWQ server configuration parameter client_min_messages to LOG. See Server Configuration Parameter Reference for information about the parameter.

Example

This example shows the differences for a query that is run against partitioned tables when GPORCA is enabled.

This CREATE TABLE statement creates a table with single level partitions:

CREATE TABLE sales (trans_id int, date date, 
    amount decimal(9,2), region text)
   DISTRIBUTED BY (trans_id)
   PARTITION BY RANGE (date)
      (START (date '2011­01­01') 
       INCLUSIVE END (date '2012­01­01') 
       EXCLUSIVE EVERY (INTERVAL '1 month'),
   DEFAULT PARTITION outlying_dates);

This query against the table is supported by GPORCA and does not generate errors in the log file:

SELECT * FROM sales;

The EXPLAIN plan output lists only the number of selected partitions.

 ->  Partition Selector for sales (dynamic scan id: 1)  (cost=10.00..100.00 rows=50 width=4)
       Partitions selected:  13 (out of 13)

Output from the log file indicates that GPORCA attempted to optimize the query:

2015-05-06 15:00:53.293451 PDT,"gpadmin","test",p2809,th297883424,"[local]",
  ,2015-05-06 14:59:21 PDT,1120,con6,cmd1,seg-1,,dx3,x1120,sx1,"LOG","00000"
  ,"statement: explain select * from sales
;",,,,,,"explain select * from sales
;",0,,"postgres.c",1566,

2015-05-06 15:00:54.258412 PDT,"gpadmin","test",p2809,th297883424,"[local]",
  ,2015-05-06 14:59:21 PDT,1120,con6,cmd1,seg-1,,dx3,x1120,sx1,"LOG","00000","
[OPT]: Using default search strategy",,,,,,"explain select * from sales
;",0,,"COptTasks.cpp",677,

The following cube query is not supported by GPORCA.

SELECT count(*) FROM foo GROUP BY cube(a,b);

The following EXPLAIN plan output includes the message “Feature not supported by GPORCA.”

postgres=# EXPLAIN SELECT count(*) FROM foo GROUP BY cube(a,b);
LOG:  statement: explain select count(*) from foo group by cube(a,b);
LOG:  2016-04-14 16:26:15:487935 PDT,THD000,NOTICE,"Feature not supported by the GPORCA: Cube",
LOG:  Planner produced plan :0
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice3; segments: 3)  (cost=9643.62..19400.26 rows=40897 width=28)
   ->  Append  (cost=9643.62..19400.26 rows=13633 width=28)
         ->  HashAggregate  (cost=9643.62..9993.39 rows=9328 width=28)
               Group By: "rollup".unnamed_attr_2, "rollup".unnamed_attr_1, "rollup"."grouping", "rollup"."group_id"
               ->  Subquery Scan "rollup"  (cost=8018.50..9589.81 rows=1435 width=28)
                     ->  Redistribute Motion 3:3  (slice1; segments: 3)  (cost=8018.50..9546.76 rows=1435 width=28)
                           Hash Key: "rollup".unnamed_attr_2, "rollup".unnamed_attr_1, "grouping", group_id()
                           ->  GroupAggregate  (cost=8018.50..9460.66 rows=1435 width=28)
                                 Group By: "rollup"."grouping", "rollup"."group_id"
                                 ->  Subquery Scan "rollup"  (cost=8018.50..9326.13 rows=2153 width=28)
                                       ->  GroupAggregate  (cost=8018.50..9261.56 rows=2153 width=28)
                                             Group By: "rollup".unnamed_attr_2, "rollup"."grouping", "rollup"."group_id"
                                             ->  Subquery Scan "rollup"  (cost=8018.50..9073.22 rows=2870 width=28)
                                                   ->  GroupAggregate  (cost=8018.50..8987.12 rows=2870 width=28)
                                                         Group By: public.foo.b, public.foo.a
                                                         ->  Sort  (cost=8018.50..8233.75 rows=28700 width=8)
                                                               Sort Key: public.foo.b, public.foo.a
                                                               ->  Seq Scan on foo  (cost=0.00..961.00 rows=28700 width=8)
         ->  HashAggregate  (cost=9116.27..9277.71 rows=4305 width=28)
               Group By: "rollup".unnamed_attr_1, "rollup".unnamed_attr_2, "rollup"."grouping", "rollup"."group_id"
               ->  Subquery Scan "rollup"  (cost=8018.50..9062.46 rows=1435 width=28)
                     ->  Redistribute Motion 3:3  (slice2; segments: 3)  (cost=8018.50..9019.41 rows=1435 width=28)
                           Hash Key: public.foo.a, public.foo.b, "grouping", group_id()
                           ->  GroupAggregate  (cost=8018.50..8933.31 rows=1435 width=28)
                                 Group By: public.foo.a
                                 ->  Sort  (cost=8018.50..8233.75 rows=28700 width=8)
                                       Sort Key: public.foo.a
                                       ->  Seq Scan on foo  (cost=0.00..961.00 rows=28700 width=8)
 Settings:  optimizer=on
 Optimizer status: legacy query optimizer
(30 rows)

Since this query is not supported by GPORCA, HAWQ falls back to the legacy optimizer.