Configuration Parameters

Descriptions of the HAWQ server configuration parameters listed alphabetically.

add_missing_from
application_name
array_nulls
authentication_timeout
backslash_quote
block_size
bonjour_name
check_function_bodies
client_encoding
client_min_messages
cpu_index_tuple_cost
cpu_operator_cost
cpu_tuple_cost
cursor_tuple_fraction
custom_variable_classes
DateStyle
db_user_namespace
deadlock_timeout
debug_assertions
debug_pretty_print
debug_print_parse
debug_print_plan
debug_print_prelim_plan
debug_print_rewritten
debug_print_slice_table
default_hash_table_bucket_number
default_statement_mem
default_statistics_target
default_tablespace
default_transaction_isolation
default_transaction_read_only
dynamic_library_path
effective_cache_size
enable_bitmapscan
enable_groupagg
enable_hashagg
enable_hashjoin
enable_indexscan
enable_mergejoin
enable_nestloop
enable_secure_filesystem
enable_seqscan
enable_sort
enable_tidscan
escape_string_warning
explain_memory_verbosity
explain_pretty_print
extra_float_digits
from_collapse_limit
gp_adjust_selectivity_for_outerjoins
gp_analyze_relative_error
gp_autostats_mode
gp_autostats_on_change_threshold
gp_backup_directIO
gp_backup_directIO_read_chunk_mb
gp_cached_segworkers_threshold
gp_command_count
gp_connections_per_thread
gp_debug_linger
gp_dynamic_partition_pruning
gp_enable_agg_distinct
gp_enable_agg_distinct_pruning
gp_enable_direct_dispatch
gp_enable_fallback_plan
gp_enable_fast_sri
gp_enable_groupext_distinct_gather
gp_enable_groupext_distinct_pruning
gp_enable_multiphase_agg
gp_enable_predicate_propagation
gp_enable_preunique
gp_enable_sequential_window_plans
gp_enable_sort_distinct
gp_enable_sort_limit
gp_external_enable_exec
gp_external_grant_privileges
gp_external_max_segs
gp_filerep_tcp_keepalives_count
gp_filerep_tcp_keepalives_idle
gp_filerep_tcp_keepalives_interval
gp_hashjoin_tuples_per_bucket
gp_idf_deduplicate
gp_interconnect_cache_future_packets
gp_interconnect_default_rtt
gp_interconnect_fc_method
gp_interconnect_hash_multiplier
gp_interconnect_min_retries_before_timeout
gp_interconnect_min_rto
gp_interconnect_queue_depth
gp_interconnect_setup_timeout
gp_interconnect_snd_queue_depth
gp_interconnect_timer_period
gp_interconnect_timer_checking_period
gp_interconnect_type
gp_log_format
gp_max_csv_line_length
gp_max_databases
gp_max_filespaces
gp_max_packet_size
gp_max_plan_size
gp_max_tablespaces
gp_motion_cost_per_row
gp_reject_percent_threshold
gp_reraise_signal
gp_role
gp_safefswritesize
gp_segment_connect_timeout
gp_segments_for_planner
gp_session_id
gp_set_proc_affinity
gp_set_read_only
gp_statistics_pullup_from_child_partition
gp_statistics_use_fkeys
gp_vmem_idle_resource_timeout
gp_vmem_protect_segworker_cache_limit
gp_workfile_checksumming
gp_workfile_compress_algorithm
gp_workfile_limit_files_per_query
gp_workfile_limit_per_query
gp_workfile_limit_per_segment
hawq_acl_type
hawq_dfs_url
hawq_global_rm_type
hawq_master_address_host
hawq_master_address_port
hawq_master_directory
hawq_master_temp_directory
hawq_re_memory_overcommit_max
hawq_rm_cluster_report_period
hawq_rm_force_alterqueue_cancel_queued_request
hawq_rm_master_port
hawq_rm_memory_limit_perseg
hawq_rm_min_resource_perseg
hawq_rm_nresqueue_limit
hawq_rm_nslice_perseg_limit
hawq_rm_nvcore_limit_perseg
hawq_rm_nvseg_perquery_limit
hawq_rm_nvseg_perquery_perseg_limit
hawq_rm_nvseg_variance_amon_seg_limit
hawq_rm_rejectrequest_nseg_limit
hawq_rm_resource_idle_timeout
hawq_rm_return_percent_on_overcommit
hawq_rm_segment_heartbeat_interval
hawq_rm_segment_port
hawq_rm_stmt_nvseg
hawq_rm_stmt_vseg_memory
hawq_rm_tolerate_nseg_limit
hawq_rm_yarn_address
hawq_rm_yarn_app_name
hawq_rm_yarn_queue_name
hawq_rm_yarn_scheduler_address
hawq_rps_address_port
hawq_rps_check_local_interval
hawq_segment_address_port
hawq_segment_directory
hawq_segment_temp_directory
integer_datetimes
IntervalStyle
join_collapse_limit
krb_caseins_users
krb_server_keyfile
krb_srvname
lc_collate
lc_ctype
lc_messages
lc_monetary
lc_numeric
lc_time
listen_addresses
local_preload_libraries
log_autostats
log_connections
log_disconnections
log_dispatch_stats
log_duration
log_error_verbosity
log_executor_stats
log_hostname
log_min_duration_statement
log_min_error_statement
log_min_messages
log_parser_stats
log_planner_stats
log_rotation_age
log_rotation_size
log_statement
log_statement_stats
log_timezone
log_truncate_on_rotation
maintenance_work_mem
max_appendonly_tables
max_connections
max_files_per_process
max_fsm_pages
max_fsm_relations
max_function_args
max_identifier_length
max_index_keys
max_locks_per_transaction
max_stack_depth
optimizer
optimizer_analyze_root_partition
optimizer_minidump
optimizer_parts_to_force_sort_on_insert
optimizer_prefer_scalar_dqa_multistage_agg
password_encryption
password_hash_algorithm
pgstat_track_activity_query_size
pljava_classpath
pljava_release_lingering_savepoints
pljava_statement_cache_size
pljava_vmoptions
port
pxf_enable_filter_pushdown
pxf_enable_stat_collection
pxf_remote_service_login
pxf_remote_service_secret
pxf_service_address
pxf_service_port
pxf_stat_max_fragments
random_page_cost
regex_flavor
runaway_detector_activation_percent
search_path
seg_max_connections
seq_page_cost
server_encoding
server_ticket_renew_interval
server_version
server_version_num
shared_buffers
shared_preload_libraries
ssl
ssl_ciphers
standard_conforming_strings
statement_timeout
superuser_reserved_connections
tcp_keepalives_count
tcp_keepalives_idle
tcp_keepalives_interval
temp_buffers
TimeZone
timezone_abbreviations
track_activities
track_counts
transaction_isolation
transaction_read_only
transform_null_equals
unix_socket_directory
unix_socket_group
unix_socket_permissions
update_process_title
vacuum_cost_delay
vacuum_cost_limit
vacuum_cost_page_dirty
vacuum_cost_page_miss
vacuum_freeze_min_age
xid_stop_limit

add_missing_from

Automatically adds missing table references to FROM clauses. Present for compatibility with releases of PostgreSQL prior to 8.1, where this behavior was allowed by default.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

application_name

Sets the application name for a client session. For example, if connecting via psql, this will be set to psql. Setting an application name allows it to be reported in log messages and statistics views.

Value Range	Default	Set Classifications
string		master, session, reload

array_nulls

This controls whether the array input parser recognizes unquoted NULL as specifying a null array element. By default, this is on, allowing array values containing null values to be entered.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

authentication_timeout

Maximum time to complete client authentication. This prevents hung clients from occupying a connection indefinitely.

Value Range	Default	Set Classifications
Any valid time expression (number and unit)	1min	local, system, restart

backslash_quote

This controls whether a quote mark can be represented by \' in a string literal. The preferred, SQL-standard way to represent a quote mark is by doubling it ('') but PostgreSQL has historically also accepted \'. However, use of \' creates security risks because in some client character set encodings, there are multibyte characters in which the last byte is numerically equivalent to ASCII \.

Value Range Default Set Classifications

Value Range	Default	Set Classifications
on (allow `\'` always) off (reject always) safe_encoding (allow only if client encoding does not allow ASCII `\` within a multibyte character)	safe_encoding	master, session, reload

on (allow \' always)

off (reject always)

safe_encoding (allow only if client encoding does not allow ASCII \ within a multibyte character)

safe_encoding

master, session, reload

block_size

Reports the size of a disk block.

Value Range	Default	Set Classifications
number of bytes	32768	read only

bonjour_name

Specifies the Bonjour broadcast name. By default, the computer name is used, specified as an empty string. This option is ignored if the server was not compiled with Bonjour support.

Value Range	Default	Set Classifications
string	unset	master, system, restart

check_function_bodies

When set to off, disables validation of the function body string during CREATE FUNCTION. Disabling validation is occasionally useful to avoid problems such as forward references when restoring function definitions from a dump.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

client_encoding

Sets the client-side encoding (character set). The default is to use the same as the database encoding. See Supported Character Sets in the PostgreSQL documentation.

Value Range	Default	Set Classifications
character set	UTF8	master, session, reload

client_min_messages

Controls which message levels are sent to the client. Each level includes all the levels that follow it. The later the level, the fewer messages are sent.

Value Range	Default	Set Classifications
DEBUG5 DEBUG4 DEBUG3 DEBUG2 DEBUG1 LOG NOTICE WARNING ERROR FATAL PANIC	NOTICE	master, session, reload

DEBUG5

DEBUG4

DEBUG3

DEBUG2

DEBUG1

LOG NOTICE

WARNING

ERROR

FATAL

PANIC

NOTICE

master, session, reload

cpu_index_tuple_cost

For the legacy query optimizer (planner), sets the estimate of the cost of processing each index row during an index scan. This is measured as a fraction of the cost of a sequential page fetch.

Value Range	Default	Set Classifications
floating point	0.005	master, session, reload

cpu_operator_cost

For the legacy query optimizer (planner), sets the estimate of the cost of processing each operator in a WHERE clause. This is measured as a fraction of the cost of a sequential page fetch.

Value Range	Default	Set Classifications
floating point	0.0025	master, session, reload

cpu_tuple_cost

For the legacy query optimizer (planner), Sets the estimate of the cost of processing each row during a query. This is measured as a fraction of the cost of a sequential page fetch.

Value Range	Default	Set Classifications
floating point	0.01	master, session, reload

cursor_tuple_fraction

Tells the legacy query optimizer (planner) how many rows are expected to be fetched in a cursor query, thereby allowing the legacy optimizer to use this information to optimize the query plan. The default of 1 means all rows will be fetched.

Value Range	Default	Set Classifications
integer	1	master, session, reload

custom_variable_classes

Specifies one or several class names to be used for custom variables. A custom variable is a variable not normally known to the server but used by some add-on modules. Such variables must have names consisting of a class name, a dot, and a variable name.

Value Range	Default	Set Classifications
comma-separated list of class names	unset	local, system, restart

DateStyle

Sets the display format for date and time values, as well as the rules for interpreting ambiguous date input values. This variable contains two independent components: the output format specification and the input/output specification for year/month/day ordering.

Value Range	Default	Set Classifications
<format>, <date style> where: <format> is ISO, Postgres, SQL, or German <date style> is DMY, MDY, or YMD	ISO, MDY	master, session, reload

where:

<format> is ISO, Postgres, SQL, or German

<date style> is DMY, MDY, or YMD

ISO, MDY

master, session, reload

db_user_namespace

This enables per-database user names. If on, you should create users as username@dbname. To create ordinary global users, simply append @ when specifying the user name in the client.

Value Range	Default	Set Classifications
Boolean	off	local, system, restart

deadlock_timeout

The time, in milliseconds, to wait on a lock before checking to see if there is a deadlock condition. On a heavily loaded server you might want to raise this value. Ideally the setting should exceed your typical transaction time, so as to improve the odds that a lock will be released before the waiter decides to check for deadlock.

Value Range	Default	Set Classifications
integer (milliseconds)	1000	local, system, restart

debug_assertions

Turns on various assertion checks.

Value Range	Default	Set Classifications
Boolean	off	local, system, restart

debug_pretty_print

Indents debug output to produce a more readable but much longer output format. client_min_messages or log_min_messages must be DEBUG1 or lower.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

debug_print_parse

For each executed query, prints the resulting parse tree. client_min_messages or log_min_messages must be DEBUG1 or lower.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

debug_print_plan

For each executed query, prints the HAWQ parallel query execution plan. client_min_messages or log_min_messages must be DEBUG1 or lower.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

debug_print_prelim_plan

For each executed query, prints the preliminary query plan. client_min_messages or log_min_messages must be DEBUG1 or lower.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

debug_print_rewritten

For each executed query, prints the query rewriter output. client_min_messages or log_min_messages must be DEBUG1 or lower.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

debug_print_slice_table

For each executed query, prints the HAWQ query slice plan. client_min_messages or log_min_messages must be DEBUG1 or lower.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

default_hash_table_bucket_number

The default number of hash buckets to use when executing a query statement on a hash table. Due to dynamic allocation, when the query is actually executed, the number of virtual segments may differ from this number depending on the query’s needs. The total number of segments should never exceed the maximum set in hawq_rm_nvseg_perquery_limit.

When expanding the cluster, you should adjust this number to reflect the number of nodes in the new cluster times the number of virtual segments per node. See Expanding a Cluster and Creating and Managing Tables for more details on modifying this parameter.

Value Range	Default	Set Classifications
integer > 0	6*SegmentCount	master, session, reload

default_statement_mem

The default amount of memory, in KB, to allocate to query statements that do not require any segment resources and are executed only on the master host. This type of query execution is rare in HAWQ.

The default value of this configuration parameter is acceptable for most deployments. Modify this value only if you are using an advanced configuration.

Value Range	Default	Set Classifications
integer > 1000	128000	master, session, reload

default_statistics_target

Sets the default statistics target for table columns that have not had a column-specific target set via ALTER TABLE SET STATISTICS. Larger values increase the time needed to do ANALYZE, but may improve the quality of the legacy query optimizer (planner) estimates.

Value Range	Default	Set Classifications
integer > 0	25	master, session, reload

default_tablespace

The default tablespace in which to create objects (tables and indexes) when a CREATE command does not explicitly specify a tablespace.

Value Range	Default	Set Classifications
name of a tablespace	unset	master, session, reload

default_transaction_isolation

Controls the default isolation level of each new transaction.

Value Range	Default	Set Classifications
read committed read uncommitted repeatable read serializable	read committed	master, session, reload

read committed

read uncommitted

repeatable read

serializable

read committed

master, session, reload

default_transaction_read_only

Controls the default read-only status of each new transaction. A read-only SQL transaction cannot alter non-temporary tables.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

dynamic_library_path

If a dynamically loadable module needs to be opened and the file name specified in the CREATE FUNCTION or LOAD command does not have a directory component (i.e. the name does not contain a slash), the system will search this path for the required file. The compiled-in PostgreSQL package library directory is substituted for $libdir. This is where the modules provided by the standard PostgreSQL distribution are installed.

Value Range	Default	Set Classifications
a list of absolute directory paths separated by colons	$libdir	master, session, reload

effective_cache_size

Sets the assumption about the effective size of the disk cache that is available to a single query for the legacy query optimizer (planner). This is factored into estimates of the cost of using an index; a higher value makes it more likely index scans will be used, a lower value makes it more likely sequential scans will be used. This parameter has no effect on the size of shared memory allocated by a HAWQ server instance, nor does it reserve kernel disk cache; it is used only for estimation purposes.

Value Range	Default	Set Classifications
floating point	512MB	master, session, reload

enable_bitmapscan

Enables or disables the use of bitmap-scan plan types by the legacy query optimizer (planner). Note that this is different than a Bitmap Index Scan. A Bitmap Scan means that indexes will be dynamically converted to bitmaps in memory when appropriate, giving faster index performance on complex queries against very large tables. It is used when there are multiple predicates on different indexed columns. Each bitmap per column can be compared to create a final list of selected tuples.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

enable_groupagg

Enables or disables the use of group aggregation plan types by the legacy query optimizer (planner).

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

enable_hashagg

Enables or disables the use of hash aggregation plan types by the legacy query optimizer (planner).

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

enable_hashjoin

Enables or disables the use of hash-join plan types by the legacy query optimizer (planner).

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

enable_indexscan

Enables or disables the use of index-scan plan types by the legacy query optimizer (planner).

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

enable_mergejoin

Enables or disables the use of merge-join plan types by the legacy query optimizer (planner). Merge join is based on the idea of sorting the left- and right-hand tables into order and then scanning them in parallel. So, both data types must be capable of being fully ordered, and the join operator must be one that can only succeed for pairs of values that fall at the ‘same place’ in the sort order. In practice this means that the join operator must behave like equality.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

enable_nestloop

Enables or disables the use of nested-loop join plans by the legacy query optimizer (planner). It’s not possible to suppress nested-loop joins entirely, but turning this variable off discourages the legacy optimizer from using one if there are other methods available.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

enable_secure_filesystem

Enables or disables access to a secure HDFS file system. To enable Kerberos security for HDFS, set this configuration parameter to on before starting HAWQ.

Value Range	Default	Set Classifications
Boolean	off	master, system, superuser

enable_seqscan

Enables or disables the use of sequential scan plan types by the legacy query optimizer (planner). It’s not possible to suppress sequential scans entirely, but turning this variable off discourages the legacy optimizer from using one if there are other methods available.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

enable_sort

Enables or disables the use of explicit sort steps by the legacy query optimizer (planner). It’s not possible to suppress explicit sorts entirely, but turning this variable off discourages the legacy optimizer from using one if there are other methods available.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

enable_tidscan

Enables or disables the use of tuple identifier (TID) scan plan types by the legacy query optimizer (planner).

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

escape_string_warning

When on, a warning is issued if a backslash (\) appears in an ordinary string literal (’…’ syntax). Escape string syntax (E’…’) should be used for escapes, because in future versions, ordinary strings will have the SQL standard-conforming behavior of treating backslashes literally.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

explain_memory_verbosity

Controls the granularity of memory information displayed in EXPLAIN ANALYZE output. explain_memory_verbosity takes three values:

SUPPRESS - generate only total memory information for the whole query
SUMMARY - generate basic memory information for each executor node
DETAIL - generate detailed memory information for each executor node

Value Range	Default	Set Classifications
SUPPRESS SUMMARY DETAIL	SUPPRESS	master

SUPPRESS

SUMMARY

DETAIL

SUPPRESS

master

explain_pretty_print

Determines whether EXPLAIN VERBOSE uses the indented or non-indented format for displaying detailed query-tree dumps.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

extra_float_digits

Adjusts the number of digits displayed for floating-point values, including float4, float8, and geometric data types. The parameter value is added to the standard number of digits. The value can be set as high as 2, to include partially-significant digits; this is especially useful for dumping float data that needs to be restored exactly. Or it can be set negative to suppress unwanted digits.

Value Range	Default	Set Classifications
integer	0	master, session, reload

from_collapse_limit

The legacy query optimizer (planner) will merge sub-queries into upper queries if the resulting FROM list would have no more than this many items. Smaller values reduce planning time but may yield inferior query plans.

Value Range	Default	Set Classifications
integer (1-n)	20	master, session, reload

gp_adjust_selectivity_for_outerjoins

Enables the selectivity of NULL tests over outer joins.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_analyze_relative_error

Sets the estimated acceptable error in the cardinality of the table “ a value of 0.5 is supposed to be equivalent to an acceptable error of 50% (this is the default value used in PostgreSQL). If the statistics collected during ANALYZE are not producing good estimates of cardinality for a particular table attribute, decreasing the relative error fraction (accepting less error) tells the system to sample more rows.

Value Range	Default	Set Classifications
floating point < 1.0	0.25	master, session, reload

gp_autostats_mode

Specifies the mode for triggering automatic statistics collection with ANALYZE. The on_no_stats option triggers statistics collection for CREATE TABLE AS SELECT, INSERT, or COPY operations on any table that has no existing statistics.

Warning: Depending on the specific nature of your database operations, automatic statistics collection can have a negative performance impact. Carefully evaluate whether the default value of on_no_stats is appropriate for your system.

The on_change option triggers statistics collection only when the number of rows affected exceeds the threshold defined by gp_autostats_on_change_threshold. Operations that can trigger automatic statistics collection with on_change are:

CREATE TABLE AS SELECT

INSERT

COPY

Default is on_no_stats.

Note: For partitioned tables, automatic statistics collection is not triggered if data is inserted from the top-level parent table of a partitioned table. Automatic statistics collection is triggered if data is inserted directly in a leaf table (where the data is stored) of the partitioned table. Statistics are collected only on the leaf table.

none

on_change

on_no_stats

on_no_ stats

master, session, reload

gp_autostats_on_change_threshold

Specifies the threshold for automatic statistics collection when gp_autostats_mode is set to on_change. When a triggering table operation affects a number of rows exceeding this threshold, ANALYZEis added and statistics are collected for the table.

Value Range	Default	Set Classifications
integer	2147483647	master, session, reload

gp_backup_directIO

Direct I/O allows HAWQ to bypass the buffering of memory within the file system cache for backup. When Direct I/O is used for a file, data is transferred directly from the disk to the application buffer, without the use of the file buffer cache.

Direct I/O is supported only on Red Hat Enterprise Linux, CentOS, and SUSE.

Value Range	Default	Set Classifications
on, off	off	master, session, reload

gp_backup_directIO_read_chunk_mb

Sets the chunk size in MB when Direct I/O is enabled with gp_backup_directIO. The default chunk size is 20MB.

The default value is the optimal setting. Decreasing it will increase the backup time and increasing it will result in little change to backup time.

Value Range	Default	Set Classifications
1-200	20 MB	local, session, reload

gp_cached_segworkers_threshold

When a user starts a session with HAWQ and issues a query, the system creates groups or 'gangs’ of worker processes on each segment to do the work. After the work is done, the segment worker processes are destroyed except for a cached number which is set by this parameter. A lower setting conserves system resources on the segment hosts, but a higher setting may improve performance for power-users that want to issue many complex queries in a row.

Value Range	Default	Set Classifications
integer > 0	5	master, session, reload

gp_command_count

Shows how many commands the master has received from the client. Note that a single SQL command might actually involve more than one command internally, so the counter may increment by more than one for a single query. This counter also is shared by all of the segment processes working on the command.

Value Range	Default	Set Classifications
integer > 0	1	read only

gp_connections_per_thread

A value larger than or equal to the number of segments means that each slice in a query plan will get its own thread when dispatching to the segments. A value of 0 indicates that the dispatcher should use a single thread when dispatching all query plan slices to a segment. Lower values will use more threads, which utilizes more resources on the master. Typically, the default does not need to be changed unless there is a known throughput performance problem.

Value Range	Default	Set Classifications
integer	64	master, session, reload

gp_debug_linger

Number of seconds for a HAWQ process to linger after a fatal internal error.

Value Range	Default	Set Classifications
Any valid time expression (number and unit)	0	master, session, reload

gp_dynamic_partition_pruning

Enables plans that can dynamically eliminate the scanning of partitions.

Value Range	Default	Set Classifications
on/off	on	master, session, reload

gp_enable_agg_distinct

Enables or disables two-phase aggregation to compute a single distinct-qualified aggregate. This applies only to subqueries that include a single distinct-qualified aggregate function.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_enable_agg_distinct_pruning

Enables or disables three-phase aggregation and join to compute distinct-qualified aggregates. This applies only to subqueries that include one or more distinct-qualified aggregate functions.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_enable_direct_dispatch

Enables or disables the dispatching of targeted query plans for queries that access data on a single segment. When on, queries that target rows on a single segment will only have their query plan dispatched to that segment (rather than to all segments). This significantly reduces the response time of qualifying queries as there is no interconnect setup involved. Direct dispatch does require more CPU utilization on the master.

Value Range	Default	Set Classifications
Boolean	on	master, system, restart

gp_enable_fallback_plan

Allows use of disabled plan types when a query would not be feasible without them.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_enable_fast_sri

When set to on, the legacy query optimizer (planner) plans single row inserts so that they are sent directly to the correct segment instance (no motion operation required). This significantly improves performance of single-row-insert statements.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_enable_groupext_distinct_gather

Enables or disables gathering data to a single node to compute distinct-qualified aggregates on grouping extension queries. When this parameter and gp_enable_groupext_distinct_pruning are both enabled, the legacy query optimizer (planner) uses the cheaper plan.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_enable_groupext_distinct_pruning

Enables or disables three-phase aggregation and join to compute distinct-qualified aggregates on grouping extension queries. Usually, enabling this parameter generates a cheaper query plan that the legacy query optimizer (planner) will use in preference to existing plan.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_enable_multiphase_agg

Enables or disables the use of two or three-stage parallel aggregation plans legacy query optimizer (planner). This approach applies to any subquery with aggregation. If gp_enable_multiphase_agg is off, thengp_enable_agg_distinct and gp_enable_agg_distinct_pruning are disabled.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_enable_predicate_propagation

When enabled, the legacy query optimizer (planner) applies query predicates to both table expressions in cases where the tables are joined on their distribution key column(s). Filtering both tables prior to doing the join (when possible) is more efficient.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_enable_preunique

Enables two-phase duplicate removal for SELECT DISTINCT queries (not SELECT COUNT(DISTINCT)). When enabled, it adds an extra SORT DISTINCT set of plan nodes before motioning. In cases where the distinct operation greatly reduces the number of rows, this extra SORT DISTINCT is much cheaper than the cost of sending the rows across the Interconnect.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_enable_sequential_window_plans

If on, enables non-parallel (sequential) query plans for queries containing window function calls. If off, evaluates compatible window functions in parallel and rejoins the results. This is an experimental parameter.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_enable_sort_distinct

Enable duplicates to be removed while sorting.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_enable_sort_limit

Enable LIMIT operation to be performed while sorting. Sorts more efficiently when the plan requires the first limit_number of rows at most.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_external_enable_exec

Enables or disables the use of external tables that execute OS commands or scripts on the segment hosts (CREATE EXTERNAL TABLE EXECUTE syntax).

Value Range	Default	Set Classifications
Boolean	on	master, system, restart

gp_external_grant_privileges

Enables or disables non-superusers to issue a CREATE EXTERNAL [WEB] TABLE command in cases where the LOCATION clause specifies specifies http or gpfdist. The ability to create an external table can be granted to a role using CREATE ROLE or ALTER ROLE.

Value Range	Default	Set Classifications
Boolean	off	master, system, restart

gp_external_max_segs

Sets the number of segments that will scan external table data during an external table operation, the purpose being not to overload the system with scanning data and take away resources from other concurrent operations. This only applies to external tables that use the gpfdist:// protocol to access external table data.

Value Range	Default	Set Classifications
integer	64	master, session, reload

gp_filerep_tcp_keepalives_count

How many keepalives may be lost before the connection is considered dead. A value of 0 uses the system default. If TCP_KEEPCNT is not supported, this parameter must be 0.

Value Range	Default	Set Classifications
number of lost keepalives	2	local, system, restart

gp_filerep_tcp_keepalives_idle

Number of seconds between sending keepalives on an otherwise idle connection. A value of 0 uses the system default. If TCP_KEEPIDLE is not supported, this parameter must be 0.

Value Range	Default	Set Classifications
number of seconds	1 min	local, system, restart

gp_filerep_tcp_keepalives_interval

How many seconds to wait for a response to a keepalive before retransmitting. A value of 0 uses the system default. If TCP_KEEPINTVL is not supported, this parameter must be 0.

Value Range	Default	Set Classifications
number of seconds	30 sec	local, system, restart

gp_hashjoin_tuples_per_bucket

Sets the target density of the hash table used by HashJoin operations. A smaller value will tend to produce larger hash tables, which can increase join performance.

Value Range	Default	Set Classifications
integer	5	master, session, reload

gp_idf_deduplicate

Changes the strategy to compute and process MEDIAN, and PERCENTILE_DISC.

auto

none

force

auto

master, session, reload

gp_interconnect_cache_future_packets

Controls whether future packets are cached.

Value Range	Default	Set Classifications
Boolean	true (on)	master, session, reload

gp_interconnect_default_rtt

Sets the default rtt (in ms) for UDP interconnect.

Value Range	Default	Set Classifications
1-1000ms	20ms	master, session, reload

gp_interconnect_fc_method

Specifies the flow control method used for UDP interconnect when the value of gp_interconnect_type is UDPIFC.

For capacity based flow control, senders do not send packets when receivers do not have the capacity.

Loss based flow control is based on capacity based flow control, and also tunes the sending speed according to packet losses.

Value Range	Default	Set Classifications
CAPACITY LOSS	LOSS	master, session, reload

gp_interconnect_hash_multiplier

Sets the size of the hash table used by the UDP interconnect to track connections. This number is multiplied by the number of segments to determine the number of buckets in the hash table. Increasing the value may increase interconnect performance for complex multi-slice queries (while consuming slightly more memory on the segment hosts).

Value Range	Default	Set Classifications
2-25	2	master, session, reload

gp_interconnect_min_retries_before_timeout

Sets the minimum number of retries before reporting a transmit timeout in the interconnect.

Value Range	Default	Set Classifications
1-4096	100	master, session, reload

gp_interconnect_min_rto

Sets the minimum rto (in ms) for UDP interconnect.

Value Range	Default	Set Classifications
1-1000ms	20ms	master, session, reload

gp_interconnect_queue_depth

Sets the amount of data per-peer to be queued by the UDP interconnect on receivers (when data is received but no space is available to receive it the data will be dropped, and the transmitter will need to resend it). Increasing the depth from its default value will cause the system to use more memory; but may increase performance. It is reasonable for this to be set between 1 and 10. Queries with data skew potentially perform better when this is increased. Increasing this may radically increase the amount of memory used by the system.

Value Range	Default	Set Classifications
1-2048	4	master, session, reload

gp_interconnect_setup_timeout

When the interconnect type is UDP, the time to wait for the Interconnect to complete setup before it times out.

This parameter is used only when gp_interconnect_type is set to UDP.

Warning: The gp_interconnect_setup_timeout server configuration parameter is deprecated and will be removed in a future HAWQ release.

Value Range	Default	Set Classifications
Any valid time expression (number and unit)	2 hours	master, session, reload

gp_interconnect_snd_queue_depth

Sets the amount of data per-peer to be queued by the UDP interconnect on senders. Increasing the depth from its default value will cause the system to use more memory; but may increase performance. Reasonable values for this parameter are between 1 and 4. Increasing the value might radically increase the amount of memory used by the system.

This parameter is used only when gp_interconnect_type is set to UDPIFC.

Value Range	Default	Set Classifications
1 - 4096	2	master, session, reload

gp_interconnect_timer_period

Sets the timer period (in ms) for UDP interconnect.

Value Range	Default	Set Classifications
1-100ms	5ms	master, session, reload

gp_interconnect_timer_checking_period

Sets the timer checking period (in ms) for UDP interconnect.

Value Range	Default	Set Classifications
1-100ms	20ms	master, session, reload

gp_interconnect_transmit_timeout

Timeout (in seconds) on interconnect to transmit a packet.

Value Range	Default	Set Classifications
1-7200s	3600s	master, session, reload

gp_interconnect_type

Sets the networking protocol used for Interconnect traffic. With the TCP protocol, HAWQ has an upper limit of 1000 segment instances - less than that if the query workload involves complex, multi-slice queries.

UDP allows for greater interconnect scalability. Note that the HAWQ software does the additional packet verification and checking not performed by UDP, so reliability and performance is equivalent to TCP.

UDPIFC specifies using UDP with flow control for interconnect traffic. Specify the interconnect flow control method with gp_interconnect_fc_method.

TCP

UDP

UDPIFC

local, system, restart

gp_log_format

Specifies the format of the server log files. If using hawq_toolkit administrative schema, the log files must be in CSV format.

Value Range	Default	Set Classifications
csv text	csv	local, system, restart

gp_max_csv_line_length

The maximum length of a line in a CSV formatted file that will be imported into the system. The default is 1MB (1048576 bytes). Maximum allowed is 4MB (4194184 bytes). The default may need to be increased if using the hawq_toolkit administrative schema to read HAWQ log files.

Value Range	Default	Set Classifications
number of bytes	1048576	local, system, restart

gp_max_databases

The maximum number of databases allowed in a HAWQ system.

Value Range	Default	Set Classifications
integer	16	master, system, restart

gp_max_filespaces

The maximum number of filespaces allowed in a HAWQ system.

Value Range	Default	Set Classifications
integer	8	master, system, restart

gp_max_packet_size

Sets the size (in bytes) of messages sent by the UDP interconnect, and sets the tuple-serialization chunk size for both the UDP and TCP interconnect.

Value Range	Default	Set Classifications
512-65536	8192	master, system, restart

gp_max_plan_size

Specifies the total maximum uncompressed size of a query execution plan multiplied by the number of Motion operators (slices) in the plan. If the size of the query plan exceeds the value, the query is cancelled and an error is returned. A value of 0 means that the size of the plan is not monitored.

You can specify a value in KB,MB, or GB. The default unit is KB. For example, a value of 200 is 200KB. A value of 1GB is the same as 1024MB or 1048576KB.

Value Range	Default	Set Classifications
integer	0	master, superuser, session

gp_max_tablespaces

The maximum number of tablespaces allowed in a HAWQ system.

Value Range	Default	Set Classifications
integer	16	master, system, restart

gp_motion_cost_per_row

Sets the legacy query optimizer (planner) cost estimate for a Motion operator to transfer a row from one segment to another, measured as a fraction of the cost of a sequential page fetch. If 0, then the value used is two times the value of cpu_tuple_cost.

Value Range	Default	Set Classifications
floating point	0	master, session, reload

gp_reject_percent_threshold

For single row error handling on COPY and external table SELECTs, sets the number of rows processed before SEGMENT REJECT LIMIT n PERCENT starts calculating.

Value Range	Default	Set Classifications
integer (1-n)	300	master, session, reload

gp_reraise_signal

If enabled, will attempt to dump core if a fatal server error occurs.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_role

The role of this server process ” set to dispatch for the master and execute for a segment.

dispatch

execute

utility

read only

gp_safefswritesize

Specifies a minimum size for safe write operations to append-only tables in a non-mature file system. When a number of bytes greater than zero is specified, the append-only writer adds padding data up to that number in order to prevent data corruption due to file system errors. Each non-mature file system has a known safe write size that must be specified here when using HAWQ with that type of file system. This is commonly set to a multiple of the extent size of the file system; for example, Linux ext3 is 4096 bytes, so a value of 32768 is commonly used.

Value Range	Default	Set Classifications
integer	0	local, system, restart

gp_segment_connect_timeout

Time that the HAWQ interconnect will try to connect to a segment instance over the network before timing out. Controls the network connection timeout between master and segment replication processes.

Value Range	Default	Set Classifications
Any valid time expression (number and unit)	10min	local, system, reload

gp_segments_for_planner

Sets the number of segment instances for the legacy query optimizer (planner) to assume in its cost and size estimates. If 0, then the value used is the actual number of segments. This variable affects the legacy optimizer’s estimates of the number of rows handled by each sending and receiving process in Motion operators.

Value Range	Default	Set Classifications
integer	0	master, session, reload

gp_session_id

A system assigned ID number for a client session. Starts counting from 1 when the master instance is first started.

Value Range	Default	Set Classifications
integer (1-n)	14	read only

gp_set_proc_affinity

If enabled, when a HAWQ server process (postmaster) is started it will bind to a CPU.

Value Range	Default	Set Classifications
Boolean	off	master, system, restart

gp_set_read_only

Set to on to disable writes to the database. Any in progress transactions must finish before read-only mode takes affect.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

gp_statistics_pullup_from_child_partition

Enables the use of statistics from child tables when planning queries on the parent table by the legacy query optimizer (planner).

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_statistics_use_fkeys

When enabled, allows the legacy query optimizer (planner) to use foreign key information stored in the system catalog to optimize joins between foreign keys and primary keys.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_vmem_idle_resource_timeout

Sets the time in milliseconds a session can be idle before gangs on the segment databases are released to free up resources.

Value Range	Default	Set Classifications
number of milliseconds	18000	master, system, restart

gp_vmem_protect_segworker_cache_limit

If a query executor process consumes more than this configured amount, then the process will not be cached for use in subsequent queries after the process completes. Systems with lots of connections or idle processes may want to reduce this number to free more memory on the segments. Note that this is a local parameter and must be set for every segment.

Value Range	Default	Set Classifications
number of megabytes	500	local, system, restart

gp_workfile_checksumming

Adds a checksum value to each block of a work file (or spill file) used by HashAgg and HashJoin query operators. This adds an additional safeguard from faulty OS disk drivers writing corrupted blocks to disk. When a checksum operation fails, the query will cancel and rollback rather than potentially writing bad data to disk.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

gp_workfile_compress_algorithm

When a hash aggregation or hash join operation spills to disk during query processing, specifies the compression algorithm to use on the spill files.

If your HAWQ database installation uses serial ATA (SATA) disk drives, setting the value of this parameter to zlib might help to avoid overloading the disk subsystem with IO operations.

Value Range	Default	Set Classifications
none zlib	none	master, session, reload

gp_workfile_limit_files_per_query

Sets the maximum number of temporary spill files (also known as workfiles) allowed per query per segment. Spill files are created when executing a query that requires more memory than it is allocated. The current query is terminated when the limit is exceeded.

Set the value to 0 (zero) to allow an unlimited number of spill files. master session reload

Value Range	Default	Set Classifications
integer	3000000	master, session, reload

gp_workfile_limit_per_query

Sets the maximum disk size an individual query is allowed to use for creating temporary spill files at each segment. The default value is 0, which means a limit is not enforced.

Value Range	Default	Set Classifications
kilobytes	0	master, session, reload

gp_workfile_limit_per_segment

Sets the maximum total disk size that all running queries are allowed to use for creating temporary spill files at each segment. The default value is 0, which means a limit is not enforced.

Value Range	Default	Set Classifications
kilobytes	0	local, system, restart

hawq_acl_type

Identifies the authorization method currently configured in your HAWQ cluster. HAWQ supports native and Ranger authorization. Refer to Configuring HAWQ to Use Ranger Policy Management for detailed information on configuring Ranger authorization for HAWQ.

Value Range	Default	Set Classifications
standalone ranger	standalone	master, restart

hawq_dfs_url

URL for HAWQ data directories on HDFS. The directory that you specify must be writeable by the gpadmin user. For example 'localhost:8020/hawq_default’. If you have high availability enabled for your HDFS NameNodes, then this configuration parameter must be set to the service ID you configured in HDFS. See “HAWQ Filespaces and High Availability Enabled HDFS” for more information.

Value Range	Default	Set Classifications
URL in the form of NameNode_host name:port/data_directory_name	localhost:8020/hawq	master, session, reload

hawq_global_rm_type

HAWQ global resource manager type. Valid values are yarn and none. Setting this parameter to none indicates that the HAWQ resource manager will manages its own resources. Setting the value to yarn means that HAWQ will negotiate with YARN for resources.

Value Range	Default	Set Classifications
yarn or none	none	master, system, restart

hawq_master_address_host

Address or hostname of HAWQ master.

Value Range	Default	Set Classifications
master hostname	localhost	master, session, reload

hawq_master_address_port

Port of the HAWQ master.

Value Range	Default	Set Classifications
valid port number		master, session, reload

hawq_master_directory

Master server data directory.

Value Range	Default	Set Classifications
directory name		master, session, reload

hawq_master_temp_directory

One or more temporary directories for the HAWQ master. Separate multiple entries with commas.

Value Range	Default	Set Classifications
directory name or comma-separated list of directory names	/tmp	master, session, reload

hawq_re_memory_overcommit_max

Sets the maximum quota of memory overcommit (in MB) per physical segment for resource enforcement. This parameter sets the memory quota that can be overcommited beyond the memory quota dynamically assigned by the resource manager.

Specify a larger value to prevent out of memory errors in YARN mode.

Value Range	Default	Set Classifications
integer	8192	master, system, restart

hawq_rm_cluster_report_period

Defines the time period, in seconds, for refreshing the global resource manager’s cluster report.

Value Range	Default	Set Classifications
10-100	60	master, session, reload

hawq_rm_force_alterqueue_cancel_queued_request

Instructs HAWQ to cancel by all resource requests that are in conflict with the new resource queue definitions supplied in a ALTER RESOURCE QUEUE statement.

If you set this parameter to false, the actions specified in ALTER RESOURCE QUEUE are canceled if the resource manager finds at least one resource request that is in conflict with the new resource definitions supplied in the altering command.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

hawq_rm_master_port

HAWQ resource manager master port number.

Value Range	Default	Set Classifications
valid port number	5437	master, session, reload

hawq_rm_memory_limit_perseg

Limit of memory usage by a HAWQ segment when hawq_global_rm_type is set to none. For example, 8GB.

Value Range	Default	Set Classifications
no specific lower or upper limit	64GB	session, reload

hawq_rm_min_resource_perseg

This property guarantees that the resource manager will acquire at the least the specified number of YARN containers for each segment no matter how small the resource request is for the initial query. This property cannot exceed YARN’s queue capacity.

Value Range	Default	Set Classifications
0 to 65535	2	master, session, reload

hawq_rm_nresqueue_limit

Maximum number of resource queues that you can create. The minimum number you can create is 3 and the maximum is 1024.

Value Range	Default	Set Classifications
3 to 1024	128	master, session, reload

hawq_rm_nslice_perseg_limit

Limits the total number of slices in one HAWQ segment. HAWQ QE execution generates multiple processes called slices to process data in parallel. Too many processes can affect OS performance. If you need to allow more processes in one segment to run in parallel for high concurrency due to high complexity, then you can modify this parameter.

Value Range	Default	Set Classifications
1 to 65535	5000	master, session, reload

hawq_rm_nvcore_limit_perseg

Maximum number of virtual cores that can be used for query execution in a HAWQ segment when hawq_global_rm_type is set to none. For example, 2.0.

Value Range	Default	Set Classifications
1.0 to maximum integer	1.0	master, session, reload

hawq_rm_nvseg_perquery_limit

Cluster-wide limit on the maximum number of virtual segments that can be used for one query statement execution. The hash buckets defined in default_hash_table_bucket_number cannot exceed this number.

Value Range	Default	Set Classifications
1 to 65535	512	master, session, reload

hawq_rm_nvseg_perquery_perseg_limit

Cluster-wide limit on the maximum number of virtual segments in one HAWQ segment that can be used to execute one query statement. This parameter limits use of virtual segments for random tables, external tables, and user-defined tables, but has no effect on hash tables. If the query contains external tables or user-defined functions, it will allocate the number of segments defined by this parameter (value times segment count). Under most deployment circumstances, you should not modify the default.

Value Range	Default	Set Classifications
1 to 65535	6	master, session, reload

hawq_rm_nvseg_variance_amon_seg_limit

Maximum variance allowed for the number of virtual segments per physical segment used to execute one query statement. Resource manager will try to reallocate resources to avoid going over this limit. This configuration is introduced to ensure that the virtual segments allocated for query execution are dispatched evenly to physical segments.

For example, one query execution causes nine (9) virtual segments to be dispatched to two (2) physical segments. Assume that one segment has been allocated seven (7) virtual segments and another one has allocated two (2) virtual segments. Then the variance between the segments is five (5). If hawq_rm_nvseg_variance_amon_seg_limit is set to the default of one (1), then the allocation of resources for this query is rejected and will be reallocated later. However, if a physical segment has five virtual segments and the other physical segment has four, then this resource allocation is accepted.

Queries perform best when virtual segments allocated for query execution are dispatched as evenly as possible across physical segments in the cluster. Therefore, if you are not concerned with a downgrade in query performance, you can increase the value of this configuration parameter to prevent the rejection of query resource allocations. Another case where you may want to modify this configuration is if you are using YARN to manage HAWQ resources and YARN cannot provide YARN containers to HAWQ evenly among hosts due to busy application workloads. In this case, the latency of resource allocations for queries can be large if this configuration parameter is too small.

Value Range	Default	Set Classifications
0 to 65535	1	master

hawq_rm_rejectrequest_nseg_limit

Maximum number of unavailable physical segments, expressed as a fraction of the entire cluster, that are allowed before the resource manager rejects query resource requests.

Given an expected cluster size (as provided in the slaves file), the HAWQ resource manager calculates the maximum number of unavailable segments allowed before rejecting queries. For example, if hawq_rm_rejectrequest_nseg_limit is set to 0.25 and the expected cluster size is 15, then the maximum number of unavailable segments allowed is 4 (4=ceil(15 * 0.25)). If the number of unavailable HAWQ physical segments exceeds the calculated number (for example, 5 in the preceding scenario) then the resource manager directly rejects query resource requests.

In most deployment cases, you do not need to modify this value.

Note: The value of hawq_rm_rejectrequest_nseg_limit should not exceed the value of hawq_rm_tolerate_nseg_limit.

Value Range	Default	Set Classifications
0.0 to 1.0	0.25	master

hawq_rm_resource_idle_timeout

Amount of time, in seconds, before idle resources are returned to YARN.

Value Range	Default	Set Classifications
Seconds (-1 to 65535)	300	master, session, reload

hawq_rm_return_percent_on_overcommit

Determines how many containers HAWQ should return to the global resource manager (YARN for example.) This configuration only applies when HAWQ’s YARN queue is busy, and HAWQ makes the YARN queue overuse its resources. The default value is 10, which means HAWQ will return 10% of acquired YARN containers by pausing the allocation of resources to HAWQ queries.

In a typical deployment, you do not need to modify the default value of this parameter.

Value Range	Default	Set Classifications
0-100	10	master, session, reload

hawq_rm_segment_heartbeat_interval

Time interval, in seconds, that the segment uses to send a heartbeat to the resource manager. This heartbeat keeps the segment alive and presents the segment’s latest status.

For most deployments, you do not need to modify the default value.

Value Range	Default	Set Classifications
1 to 65535	30	master

hawq_rm_segment_port

HAWQ resource manager segment server port number.

Value Range	Default	Set Classifications
valid port number	5438	master, session, reload

hawq_rm_stmt_nvseg

Defines the number of virtual segments to use for the next query’s execution. The default value is 0, which means statement level resource quota setting is disabled.

Value Range	Default	Set Classifications
0 to 65535	0	master, session, reload

hawq_rm_stmt_vseg_memory

Defines the memory quota of one virtual segment. You must set the number of virtual segments to a number greater than 0 in hawq_rm_stmt_nvseg before specifying the memory quota.

Value Range	Default	Set Classifications
Acceptable values are ‘128mb’, ’256mb’, ’512mb’, ‘1gb’, ‘2gb’, ‘4gb’, ‘8gb’, ‘16gb’	128mb	master, session, reload

hawq_rm_tolerate_nseg_limit

Maximum number of unused physical segments allowed when executing a query statement, expressed as a fraction of the total number of segments cluster. HAWQ checks this limit if a query has at least one physical segment with more than one virtual segments allocated. This configuration ensures that when executing a query in HAWQ, all allocated virtual segments are dispatched as widely as possible across physical segments.

Given an expected cluster size (as indicated in the slaves file), the HAWQ resource manager calculates the tolerated number of unused segments. For example, if hawq_rm_tolerate_nseg_limit is set to 0.25 and the expected cluster size is 15, the tolerated number of unused segments for a query is 4 (4=ceil(15 * 0.25)). If the number of unused segments exceeds the calculated value, the resource manager temporarily gives up on allocating resources for the query and will keep retrying until this limit is no longer exceeded.

Under most deployment circumstances, you do not need to modify the default. Reasons why you would want to increase this number include:

You have an unstable network environment and expect to have a large quantity of unavailable machines. Increase this configuration value to avoid large query resource allocation latency.

Note: The value of hawq_rm_rejectrequest_nseg_limit should not exceed the value of hawq_rm_tolerate_nseg_limit.

Value Range	Default	Set Classifications
0.0 to 1.0	0.25	master

hawq_rm_yarn_address

Server address (host and port) of the YARN resource manager server (the value of yarn.resourcemanager.address). User must define this if hawq_global_rm_type is set to yarn. For example, localhost:8032.

Note: If you have enabled high availability for your YARN resource managers, then you must configure yarn.resourcemanager.ha and yarn.resourcemanager.scheduler.ha in yarn-client.xml located in $GPHOME/etc instead. The values specified for hawq_rm_yarn_address and hawq_rm_yarn_scheduler_address are ignored.

Value Range	Default	Set Classifications
valid hostname and port		master

hawq_rm_yarn_app_name

The name of the YARN application registered with HAWQ’s resource manager. For example, hawq.

Value Range	Default	Set Classifications
string	hawq	master

hawq_rm_yarn_queue_name

The name of the YARN resource queue to register with HAWQ’s resource manager.

Value Range	Default	Set Classifications
string	default	master

hawq_rm_yarn_scheduler_address

Server address (host and port) of the YARN resource manager scheduler (the value of yarn.resourcemanager.scheduler.address). User must define this if hawq_global_rm_type is set to yarn. For example, localhost:8030.

Value Range	Default	Set Classifications
valid hostname and port		master

hawq_rps_address_port

Identifies the port on which the HAWQ Ranger Plug-in Service runs. The hawq_rps_address_port configuration parameter value must match the rps.properties RPS_HTTP_PORT environment variable setting. If you update the port, you must restart your HAWQ cluster, or reload HAWQ configuration and restart the HAWQ Ranger Plug-in Service.

Value Range	Default	Set Classifications
valid port number	8432	master, reload

hawq_rps_check_local_interval

Should the HAWQ master node fail to communicate with the local Ranger Plug-in Service and the Ranger Plug-in Service is active on the standby master node, HAWQ attempts to switch back to the service located on the master node as soon as it becomes available. The HAWQ master periodically attempts to re-establish contact with the service on the local node, using hawq_rps_check_local_interval as the polling time interval (in seconds) for this contact.

Value Range	Default	Set Classifications
1-65563	300	master, reload

hawq_segment_address_port

Base port for the HAWQ segment host.

Value Range	Default	Set Classifications
valid port number		local, session, reload

hawq_segment_directory

Base path for the HAWQ segment data directory.

Value Range	Default	Set Classifications
directory name		local, session, reload

hawq_segment_temp_directory

One or more temporary directories for the HAWQ segment. Separate multiple entries with commas.

Value Range	Default	Set Classifications
directory name or comma-separated list of directory names	/tmp	local, session, reload

integer_datetimes

Reports whether PostgreSQL was built with support for 64-bit-integer dates and times.

Value Range	Default	Set Classifications
Boolean	on	read only

IntervalStyle

Sets the display format for interval values. The value sql_standard produces output matching SQL standard interval literals. The value postgres produces output matching PostgreSQL releases prior to 8.4 when the DateStyle parameter was set to ISO.

The value iso_8601 will produce output matching the time interval format with designators defined in section 4.4.3.2 of ISO 8601. See the PostgreSQL 8.4 documentation for more information.

postgres

postgres_verbose

sql_standard

iso_8601

postgres

master, session, reload

join_collapse_limit

The legacy query optimizer (planner) will rewrite explicit inner JOIN constructs into lists of FROM items whenever a list of no more than this many items in total would result. By default, this variable is set the same as from_collapse_limit, which is appropriate for most uses. Setting it to 1 prevents any reordering of inner JOINs. Setting this variable to a value between 1 and from_collapse_limit might be useful to trade off planning time against the quality of the chosen plan (higher values produce better plans).

Value Range	Default	Set Classifications
integer (1-n)	20	master, session, reload

krb_caseins_users

Sets whether Kerberos user names should be treated case-insensitively. The default is case sensitive (off).

Value Range	Default	Set Classifications
Boolean	off	master, system, restart

krb_server_keyfile

Sets the location of the Kerberos server key file.

Value Range	Default	Set Classifications
path and file name	unset	master, system, restart

krb_srvname

Sets the Kerberos service name.

Value Range	Default	Set Classifications
service name	postgres	master, system, restart

lc_collate

Reports the locale in which sorting of textual data is done. The value is determined when the HAWQ array is initialized.

Value Range	Default	Set Classifications
<system dependent>		read only

lc_ctype

Reports the locale that determines character classifications. The value is determined when the HAWQ array is initialized.

Value Range	Default	Set Classifications
<system dependent>		read only

lc_messages

Sets the language in which messages are displayed. The locales available depends on what was installed with your operating system - use locale -a to list available locales. The default value is inherited from the execution environment of the server. On some systems, this locale category does not exist. Setting this variable will still work, but there will be no effect. Also, there is a chance that no translated messages for the desired language exist. In that case you will continue to see the English messages.

Value Range	Default	Set Classifications
<system dependent>		local, system, restart

lc_monetary

Sets the locale to use for formatting monetary amounts, for example with the to_char family of functions. The locales available depends on what was installed with your operating system - use locale -a to list available locales. The default value is inherited from the execution environment of the server.

Value Range	Default	Set Classifications
<system dependent>		local, system, restart

lc_numeric

Sets the locale to use for formatting numbers, for example with the to_char family of functions. The locales available depends on what was installed with your operating system - use locale -a to list available locales. The default value is inherited from the execution environment of the server.

Value Range	Default	Set Classifications
<system dependent>		local, system, restart

lc_time

This parameter currently does nothing, but may in the future.

Value Range	Default	Set Classifications
<system dependent>		local, system, restart

listen_addresses

Specifies the TCP/IP address(es) on which the server is to listen for connections from client applications - a comma-separated list of host names and/or numeric IP addresses. The special entry * corresponds to all available IP interfaces. If the list is empty, only UNIX-domain sockets can connect.

localhost,

host names,

IP addresses,

* (all available IP interfaces)

master, system, restart

local_preload_libraries

Comma separated list of shared library files to preload at the start of a client session.

Value Range	Default	Set Classifications
string (comma-separated list)	no value set	local, system, restart

log_autostats

Logs information about automatic ANALYZE operations related to gp_autostats_mode.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload, superuser

log_connections

This outputs a line to the server log detailing each successful connection. Some client programs, like psql, attempt to connect twice while determining if a password is required, so duplicate “connection received” messages do not always indicate a problem.

Value Range	Default	Set Classifications
Boolean	off	local, system, restart

log_disconnections

This outputs a line in the server log at termination of a client session, and includes the duration of the session.

Value Range	Default	Set Classifications
Boolean	off	local, system, restart

log_dispatch_stats

When set to “on,” this parameter adds a log message with verbose information about the dispatch of the statement.

Value Range	Default	Set Classifications
Boolean	off	local, system, restart

log_duration

Causes the duration of every completed statement which satisfies log_statement to be logged.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload, superuser

log_error_verbosity

Controls the amount of detail written in the server log for each message that is logged.

TERSE

DEFAULT

VERBOSE

DEFAULT

master, session, reload, superuser

log_executor_stats

For each query, write performance statistics of the query executor to the server log. This is a crude profiling instrument. Cannot be enabled together with log_statement_stats.

Value Range	Default	Set Classifications
Boolean	off	local, system, restart

log_hostname

By default, connection log messages only show the IP address of the connecting host. Turning on this option causes logging of the IP address and host name of the HAWQ master. Note that depending on your host name resolution setup this might impose a non-negligible performance penalty.

Value Range	Default	Set Classifications
Boolean	off	master, system, restart

log_min_duration_statement

Logs the statement and its duration on a single log line if its duration is greater than or equal to the specified number of milliseconds. Setting this to 0 will print all statements and their durations. -1 disables the feature. For example, if you set it to 250 then all SQL statements that run 250ms or longer will be logged. Enabling this option can be useful in tracking down unoptimized queries in your applications.

Value Range	Default	Set Classifications
number of milliseconds, 0, -1	-1	master, session, reload, superuser

log_min_error_statement

Controls whether or not the SQL statement that causes an error condition will also be recorded in the server log. All SQL statements that cause an error of the specified level or higher are logged. The default is PANIC (effectively turning this feature off for normal use). Enabling this option can be helpful in tracking down the source of any errors that appear in the server log.

DEBUG5

DEBUG4

DEBUG3

DEBUG2

DEBUG1

INFO

NOTICE

WARNING

ERROR

FATAL

PANIC

ERROR

master, session, reload, superuser

log_min_messages

Controls which message levels are written to the server log. Each level includes all the levels that follow it. The later the level, the fewer messages are sent to the log.

DEBUG5

DEBUG4

DEBUG3

DEBUG2

DEBUG1

INFO

NOTICE

WARNING

ERROR

FATAL

PANIC

WARNING

master, session, reload, superuser

log_parser_stats

For each query, write performance statistics of the query parser to the server log. This is a crude profiling instrument. Cannot be enabled together with log_statement_stats.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload, superuser

log_planner_stats

For each query, write performance statistics of the legacy query optimizer (planner) to the server log. This is a crude profiling instrument. Cannot be enabled together with log_statement_stats.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload, superuser

log_rotation_age

Determines the maximum lifetime of an individual log file. After this time has elapsed, a new log file will be created. Set to zero to disable time-based creation of new log files.

Value Range	Default	Set Classifications
Any valid time expression (number and unit)	1d	local, system, restart

log_rotation_size

Determines the maximum size of an individual log file. After this many kilobytes have been emitted into a log file, a new log file will be created. Set to zero to disable size-based creation of new log files.

The maximum value is INT_MAX/1024. If an invalid value is specified, the default value is used. INT_MAX is the largest value that can be stored as an integer on your system.

Value Range	Default	Set Classifications
number of kilobytes	0	local, system, restart

log_statement

Controls which SQL statements are logged. DDL logs all data definition commands like CREATE, ALTER, and DROP commands. MOD logs all DDL statements, plus INSERT, TRUNCATE, and COPY FROM. PREPARE and EXPLAIN ANALYZE statements are also logged if their contained command is of an appropriate type.

NONE

DDL

MOD

ALL

master, session, reload, superuser

log_statement_stats

For each query, write total performance statistics of the query parser, planner, and executor to the server log. This is a crude profiling instrument.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload, superuser

log_timezone

Sets the time zone used for timestamps written in the log. Unlike TimeZone, this value is system-wide, so that all sessions will report timestamps consistently. The default is unknown, which means to use whatever the system environment specifies as the time zone.

Value Range	Default	Set Classifications
string	unknown	local, system, restart

log_truncate_on_rotation

Truncates (overwrites), rather than appends to, any existing log file of the same name. Truncation will occur only when a new file is being opened due to time-based rotation. For example, using this setting in combination with a log_filename such as hawq-%H.log would result in generating twenty-four hourly log files and then cyclically overwriting them. When off, pre-existing files will be appended to in all cases.

Value Range	Default	Set Classifications
Boolean	off	local, system, restart

maintenance_work_mem

Sets the maximum amount of memory to be used for maintenance operations including vacuum and create index.

Value Range	Default	Set Classifications
1024KB - (no upper bound)	65536KB	master, session, reload

max_appendonly_tables

Sets the maximum number of append-only relations that can be written to or loaded concurrently. Append-only table partitions and subpartitions are considered as unique tables against this limit. Increasing the limit will allocate more shared memory at server start.

Value Range	Default	Set Classifications
integer	10000	master, system, restart

max_connections

The maximum number of concurrent connections allowed on master. In a HAWQ system, user client connections go through the HAWQ master instance. Segment instances can override this value by specifying a different value in seg_max_connections. The default value for segment connections is 3000, but can be up to 10 times the number of connections allowed for the master instance. For more information about limiting concurrent connections, see Configuring Client Authentication.

Increasing this parameter may cause HAWQ to request more shared memory. See shared_buffers for information about HAWQ server instance shared memory buffers.

Value Range	Default	Set Classifications
integer (10-n)	1280	local, system, restart

max_files_per_process

Sets the maximum number of simultaneously open files allowed to each server subprocess. If the kernel is enforcing a safe per-process limit, you don’t need to worry about this setting. Some platforms such as BSD, the kernel will allow individual processes to open many more files than the system can really support.

Value Range	Default	Set Classifications
integer (40-n)	150	local, system, restart

max_fsm_pages

Sets the maximum number of disk pages for which free space will be tracked in the shared free-space map. Six bytes of shared memory are consumed for each page slot.

Value Range	Default	Set Classifications
integer > 16 * max_fsm_relations	200000	local, system, restart

max_fsm_relations

Sets the maximum number of relations for which free space will be tracked in the shared memory free-space map. Should be set to a value larger than the total number of:

tables + indexes + system tables.

It costs about 60 bytes of memory for each relation per segment instance. It is better to allow some room for overhead and set too high rather than too low.

Value Range	Default	Set Classifications
integer	1000	local, system, restart

max_function_args

Reports the maximum number of function arguments.

Value Range	Default	Set Classifications
integer	100	read only

max_identifier_length

Reports the maximum identifier length.

Value Range	Default	Set Classifications
integer	63	read only

max_index_keys

Reports the maximum number of index keys.

Value Range	Default	Set Classifications
integer	32	read only

max_locks_per_transaction

The shared lock table is created with room to describe locks on max_locks_per_transaction * (max_connections + max_prepared_transactions) objects, so no more than this many distinct objects can be locked at any one time. This is not a hard limit on the number of locks taken by any one transaction, but rather a maximum average value. You might need to raise this value if you have clients that touch many different tables in a single transaction.

Value Range	Default	Set Classifications
integer	128	local, system, restart

max_stack_depth

Specifies the maximum safe depth of the server’s execution stack. The ideal setting for this parameter is the actual stack size limit enforced by the kernel (as set by ulimit -s or local equivalent), less a safety margin of a megabyte or so. Setting the parameter higher than the actual kernel limit will mean that a runaway recursive function can crash an individual backend process.

Value Range	Default	Set Classifications
number of kilobytes	2MB	local, system, restart

optimizer

Enables the GPORCA when running SQL queries. The default is on. When off, HAWQ uses only the legacy query optimizer.

GPORCA co-exists with the legacy query optimizer. When GPORCA is enabled, HAWQ uses GPORCA to generate an execution plan for a query when possible. If GPORCA cannot be used, the legacy query optimizer is used.

The optimizer parameter can be set for a database system, an individual database, or a session or query.

For information about the legacy query optimizer and GPORCA, see About GPORCA.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

optimizer_analyze_root_partition

For a partitioned table, collects statistics for the root partition when the ANALYZE command is run on the table. GPORCA uses the root partition statistics when generating a query plan. The legacy query optimizer does not use these statistics. If you set the value of the server configuration parameter optimizer to on, set the value of this parameter to on and run the command ANALYZE or ANALYZE ROOTPARTITION on partitioned tables to ensure the proper statistics have been collected.

For information about the legacy query optimizer and GPORCA, see About GPORCA.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

optimizer_minidump

GPORCA generates minidump files to describe the optimization context for a given query. Set this parameter to ALWAYS to generate a minidump for all queries.

Note: Set this parameter to ONERROR in production environments to minimize total optimization time.

Value Range	Default	Set Classifications
ONERROR ALWAYS	ONERROR	master, session, reload

optimizer_parts_to_force_sort_on_insert

Sets the minimum number of partitions required to force GPORCA to generate a plan for sorting tuples during insertion into an append-only, row-oriented (AORO) partitioned tables. If the number of partitions is less than the specified value, no sorting is performed.

Value Range	Default	Set Classifications
integer	160	master, session, reload

optimizer_prefer_scalar_dqa_multistage_agg

When enabled, this parameter forces the GPORCA to use a three-stage scalar DQA plan. This type of plan evaluates the aggregate functions in three stages (local, intermediate, and global aggregations). By forcing the use of this plan, GPORCA ensures predictable performance for queries that contain distinct qualified aggregates (DQA).

By default, this configuration is on. If you set this parameter to false, then GPORCA makes a cost-based decision.

Value Range	Default	Set Classifications
Boolean	true (on)	master, session, reload

password_encryption

When a password is specified in CREATE USER or ALTER USER without writing either ENCRYPTED or UNENCRYPTED, this option determines whether the password is to be encrypted.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

password_hash_algorithm

Specifies the cryptographic hash algorithm that is used when storing an encrypted HAWQ Database user password. The default algorithm is MD5.

MD5

SHA-256

MD5

master, session, reload, superuser

pgstat_track_activity_query_size

Sets the maximum length limit for the query text stored in current_query column of the system catalog view pg_stat_activity. The minimum length is 1024 characters.

Value Range	Default	Set Classifications
integer	1024	local, system, restart

pljava_classpath

A colon (:) separated list of the jar files containing the Java classes used in any PL/Java functions. The jar files listed here must also be installed on all HAWQ hosts in the following location: $GPHOME/lib/postgresql/java/

Value Range	Default	Set Classifications
string		master, session, reload

pljava_release_lingering_savepoints

If true, lingering savepoints used in PL/Java functions will be released on function exit. If false, savepoints will be rolled back.

Value Range	Default	Set Classifications
Boolean	true	master, system, restart, superuser

pljava_statement_cache_size

Sets the size in KB of the JRE MRU (Most Recently Used) cache for prepared statements.

Value Range	Default	Set Classifications
number of kilobytes	10	master, system, restart, superuser

pljava_vmoptions

Defines the startup options for the Java VM. The default value is an empty string (“”).

Value Range	Default	Set Classifications
string		master, system, restart, superuser

port

The database listener port for a HAWQ instance. The master and each segment has its own port. Port numbers for the HAWQ system must also be changed in the gp_segment_configuration catalog. You must shut down your HAWQ system before changing port numbers.

Value Range	Default	Set Classifications
any valid port number	5432	local, system, restart

pxf_enable_filter_pushdown

Determines whether partition filtering is enabled. If on, filtering is performed by PXF at the partition level. When off, filtering is performed on the HAWQ side, possibly negatively impacting performance.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

pxf_enable_stat_collection

Determines whether ANALYZE collects statistics for readable PXF tables. If true, an analyze operation on a readable PXF table estimates the number of tuples in the table, generates a sample table and runs statistics queries on the sample table to calculate statistics, as it does for native tables. When false, ANALYZE displays a message to warn that it is skipping the table because pxf_enable_stat_collection is off.

Value Range	Default	Set Classifications
Boolean	true	master, session, reload

pxf_remote_service_login

Temporary configuration parameter identifying the login credential to forward to remote services requiring authentication.

Value Range	Default	Set Classifications
string		master, session, reload

pxf_remote_service_secret

Temporary configuration parameter identifying the password credential to forward to remote services requiring authentication.

Value Range	Default	Set Classifications
string		master, session, reload

pxf_service_address

Specifies the location and port of the PXF service used by HCatalog. To learn more about HCatalog integration with PXF, see Using PXF and HCatalog to Query Hive.

Value Range	Default	Set Classifications
any valid hostname or IP address: any valid port number	localhost:51200	master, session, reload

pxf_service_port

Specifies the listen port for the PXF service when HDFS NameNode High Availability (HA) is enabled. PXF connects to this port when the port number is omitted from the LOCATION clause of a CREATE EXTERNAL TABLE command.

Value Range	Default	Set Classifications
any valid port number	51200	master, session, reload

pxf_stat_max_fragments

The maximum number of fragments to sample during ANALYZE on a PXF table. A lower value provides faster performance for the ANALYZE command, but the sampling is less uniform. A higher value provides more uniform sampling, with slower ANALYZE performance.

Value Range	Default	Set Classifications
number of fragments	100	master, session, reload

random_page_cost

Sets the estimate of the cost of a nonsequentially fetched disk page for the legacy query optimizer (planner). This is measured as a multiple of the cost of a sequential page fetch. A higher value makes it more likely a sequential scan will be used, a lower value makes it more likely an index scan will be used.

Value Range	Default	Set Classifications
floating point	100	master, session, reload

regex_flavor

The 'extended’ setting may be useful for exact backwards compatibility with pre-7.4 releases of PostgreSQL.

advanced

extended

basic

advanced

master, session, reload

runaway_detector_activation_percent

Sets the percentage of the virtual memory quota that will trigger the termination of queries. If the amount of virtual memory utilized by a physical segment exceeds the calculated threshold, then HAWQ begins terminating queries based on memory usage, starting with the query that is consuming the largest amount of memory. Queries are terminated until the percentage of utilized virtual memory is below the specified percentage.

The amount of virtual memory available to a physical segment is calculated dynamically in HAWQ. HAWQ’s resource manager determines resource allocations and quotas based on resources available to the cluster, resource queue configuration and other deployment configuration factors.

To calculate the memory usage threshold for runaway queries, HAWQ uses the following formula:

vmem threshold = (virtual memory quota calculated by resource manager + hawq_re_memory_overcommit_max) * runaway_detector_activation_percent.

For example, if HAWQ resource manager calculates a virtual memory quota of 9GB,hawq_re_memory_overcommit_max is set to 1GB and the value of runaway_detector_activation_percent is 95 (95%), then HAWQ starts terminating queries when the utilized virtual memory exceeds 9.5 GB.

A value of 100 disables the automatic detection and termination of runaway queries.

Value Range	Default	Set Classifications
0 to 100	95	master, session, reload

search_path

Specifies the order in which schemas are searched when an object is referenced by a simple name with no schema component. When there are objects of identical names in different schemas, the one found first in the search path is used. The system catalog schema, pg_catalog, is always searched, whether it is mentioned in the path or not. When objects are created without specifying a particular target schema, they will be placed in the first schema listed in the search path. The current effective value of the search path can be examined via the SQL function current_schemas(). current_schemas() shows how the requests appearing in search_path were resolved.

Value Range	Default	Set Classifications
a comma-separated list of schema names	$user,public	master, session, reload

seg_max_connections

The maximum number of concurrent connections on a segment. In a HAWQ system, user client connections go through the HAWQ master instance. However, seg_max_connections overwrites the master instance value to allow the segments to use a different number of concurrent connections. The default value for best practices is 3000 connections. Segment instances can allow up to 10 times the amount of connections allowed on the master (see max_connections.) For more information about limiting concurrent connections, see Configuring Client Authentication.

Increasing this parameter may cause HAWQ to request more shared memory. See shared_buffers for information about HAWQ server instance shared memory buffers.

Value Range	Default	Set Classifications
integer (240-n)	3000	local, system, restart

seq_page_cost

For the legacy query optimizer (planner), sets the estimate of the cost of a disk page fetch that is part of a series of sequential fetches.

Value Range	Default	Set Classifications
floating point	1	master, session, reload

server_encoding

Reports the database encoding (character set). It is determined when the HAWQ array is initialized. Ordinarily, clients need only be concerned with the value of client_encoding.

Value Range	Default	Set Classifications
<system dependent>	UTF8	read only

server_ticket_renew_interval

Sets the HAWQ HDFS client Kerberos ticket renewal interval in milliseconds.

Value Range	Default	Set Classifications
0-INT_MAX	43200000	master, system, restart

server_version

Reports the version of PostgreSQL that this release of HAWQ is based on.

Value Range	Default	Set Classifications
string	8.2.15	read only

server_version_num

Reports the version of PostgreSQL that this release of HAWQ is based on as an integer.

Value Range	Default	Set Classifications
integer	80215	read only

shared_buffers

Sets the amount of memory a HAWQ segment instance uses for shared memory buffers. This setting must be at least 128KB and at least 16KB times max_connections.

Each HAWQ segment instance calculates and attempts to allocate certain amount of shared memory based on the segment configuration. The value of shared_buffers is significant portion of this shared memory calculation, but is not all it. When setting shared_buffers, the values for the operating system parameters SHMMAX or SHMALL might also need to be adjusted.

The operating system parameter SHMMAX specifies maximum size of a single shared memory allocation. The value of SHMMAX must be greater than this value:

 shared_buffers + other_seg_shmem

The value of other_seg_shmem is the portion the HAWQ shared memory calculation that is not accounted for by the shared_buffers value. The other_seg_shmem value will vary based on the segment configuration.

With the default HAWQ parameter values, the value for other_seg_shmem is approximately 111MB for HAWQ segments and approximately 79MB for the HAWQ master.

The operating system parameter SHMALL specifies the maximum amount of shared memory on the host. The value of SHMALL must be greater than this value:

 (num_instances_per_host * ( shared_buffers + other_seg_shmem )) + other_app_shared_mem

The value of other_app_shared_mem is the amount of shared memory that is used by other applications and processes on the host.

When shared memory allocation errors occur, possible ways to resolve shared memory allocation issues are to increase SHMMAX or SHMALL, or decrease shared_buffers or max_connections.

See HAWQ installation documentation for information about the HAWQ values for the parameters SHMMAX and SHMALL.

Value Range	Default	Set Classifications
integer > 16K * max_connections	125MB	segment, system, restart

shared_preload_libraries

A comma-separated list of shared libraries that are to be preloaded at server start. PostgreSQL procedural language libraries can be preloaded in this way, typically by using the syntax ’$libdir/plXXX’ where XXX is pgsql, perl, tcl, or python. By preloading a shared library, the library startup time is avoided when the library is first used. If a specified library is not found, the server will fail to start.

Value Range	Default	Set Classifications
string (comma-separated list)	no value set	local, system, restart

ssl

Enables SSL connections.

Value Range	Default	Set Classifications
Boolean	off	master, system, restart

ssl_ciphers

Specifies a list of SSL ciphers that are allowed to be used on secure connections. See the openssl manual page for a list of supported ciphers.

Value Range	Default	Set Classifications
string	ALL	master, system, restart

standard_conforming_strings

Reports whether ordinary string literals (’…’) treat backslashes literally, as specified in the SQL standard. The value is currently always off, indicating that backslashes are treated as escapes. It is planned that this will change to on in a future release when string literal syntax changes to meet the standard. Applications may check this parameter to determine how string literals will be processed. The presence of this parameter can also be taken as an indication that the escape string syntax (E’…’) is supported.

Value Range	Default	Set Classifications
Boolean	off	read only

statement_timeout

Abort any statement that takes over the specified number of milliseconds. 0 turns off the limitation.

Value Range	Default	Set Classifications
number of milliseconds	0	master, session, reload

superuser_reserved_connections

Determines the number of connection slots that are reserved for HAWQ superusers.

Value Range	Default	Set Classifications
integer < max_connections	3	local, system, restart

tcp_keepalives_count

How many keepalives may be lost before the connection is considered dead. A value of 0 uses the system default. If TCP_KEEPCNT is not supported, this parameter must be 0.

Value Range	Default	Set Classifications
number of lost keepalives	0	local, system, restart

tcp_keepalives_idle

Number of seconds between sending keepalives on an otherwise idle connection. A value of 0 uses the system default. If TCP_KEEPIDLE is not supported, this parameter must be 0.

Value Range	Default	Set Classifications
number of seconds	0	local, system, restart

tcp_keepalives_interval

How many seconds to wait for a response to a keepalive before retransmitting. A value of 0 uses the system default. If TCP_KEEPINTVL is not supported, this parameter must be 0.

Value Range	Default	Set Classifications
number of seconds	0	local, system, restart

temp_buffers

Sets the maximum number of temporary buffers used by each database session. These are session-local buffers used only for access to temporary tables. The setting can be changed within individual sessions, but only up until the first use of temporary tables within a session. The cost of setting a large value in sessions that do not actually need a lot of temporary buffers is only a buffer descriptor, or about 64 bytes, per increment. However if a buffer is actually used, an additional 8192 bytes will be consumed.

Value Range	Default	Set Classifications
integer	1024	master, session, reload

TimeZone

Sets the time zone for displaying and interpreting time stamps. The default is to use whatever the system environment specifies as the time zone. See Date/Time Keywords in the PostgreSQL documentation.

Value Range	Default	Set Classifications
time zone abbreviation		local, restart

timezone_abbreviations

Sets the collection of time zone abbreviations that will be accepted by the server for date time input. The default is Default, which is a collection that works in most of the world. Australia and India, and other collections can be defined for a particular installation. Possible values are names of configuration files stored in $GPHOME/share/postgresql/timezonesets/.

To configure HAWQ to use a custom collection of timezones, copy the file that contains the timezone definitions to the directory $GPHOME/share/postgresql/timezonesets/ on the HAWQ master and segment hosts. Then set value of the server configuration parameter timezone_abbreviations to the file. For example, to use a file custom that contains the default timezones and the WIB (Waktu Indonesia Barat) timezone.

Copy the file Default from the directory $GPHOME/share/postgresql/timezonesets/ the file custom. Add the WIB timezone information from the file Asia.txt to the custom.
Copy the file custom to the directory $GPHOME/share/postgresql/timezonesets/ on the HAWQ master and segment hosts.
Set value of the server configuration parameter timezone_abbreviations to custom.
Reload the server configuration file (hawq stop cluster -u).

Value Range	Default	Set Classifications
string	Default	master, session, reload

track_activities

Enables the collection of statistics on the currently executing command of each session, along with the time at which that command began execution. When enabled, this information is not visible to all users, only to superusers and the user owning the session. This data can be accessed via the pg_stat_activity system view.

Value Range	Default	Set Classifications
Boolean	on	master, session, reload

track_counts

Enables the collection of row and block level statistics on database activity. If enabled, the data that is produced can be accessed via the pg_stat and pg_statio family of system views.

Value Range	Default	Set Classifications
Boolean	off	local, system, restart

transaction_isolation

Sets the current transaction’s isolation level.

Value Range	Default	Set Classifications
read committed serializable	read committed	master, session, reload

transaction_read_only

Sets the current transaction’s read-only status.

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

transform_null_equals

When on, expressions of the form expr = NULL (or NULL = expr) are treated as expr IS NULL, that is, they return true if expr evaluates to the null value, and false otherwise. The correct SQL-spec-compliant behavior of expr = NULL is to always return null (unknown).

Value Range	Default	Set Classifications
Boolean	off	master, session, reload

unix_socket_directory

Specifies the directory of the UNIX-domain socket on which the server is to listen for connections from client applications.

Value Range	Default	Set Classifications
directory path	unset	local, system, restart

unix_socket_group

Sets the owning group of the UNIX-domain socket. By default this is an empty string, which uses the default group for the current user.

Value Range	Default	Set Classifications
UNIX group name	unset	local, system, restart

unix_socket_permissions

Sets the access permissions of the UNIX-domain socket. UNIX-domain sockets use the usual UNIX file system permission set. Note that for a UNIX-domain socket, only write permission matters.

Value Range	Default	Set Classifications
numeric UNIX file permission mode (as accepted by the chmod or umask commands)	511	local, system, restart

update_process_title

Enables updating of the process title every time a new SQL command is received by the server. The process title is typically viewed by the ps command.

Value Range	Default	Set Classifications
Boolean	on	local, system, restart

vacuum_cost_delay

The length of time that the process will sleep when the cost limit has been exceeded. 0 disables the cost-based vacuum delay feature.

Value Range	Default	Set Classifications
milliseconds < 0 (in multiples of 10)	0	local, system, restart

vacuum_cost_limit

The accumulated cost that will cause the vacuuming process to sleep.

Value Range	Default	Set Classifications
integer > 0	200	local, system, restart

vacuum_cost_page_dirty

The estimated cost charged when vacuum modifies a block that was previously clean. It represents the extra I/O required to flush the dirty block out to disk again.

Value Range	Default	Set Classifications
integer > 0	20	local, system, restart

vacuum_cost_page_miss

The estimated cost for vacuuming a buffer that has to be read from disk. This represents the effort to lock the buffer pool, lookup the shared hash table, read the desired block in from the disk and scan its content.

Value Range	Default	Set Classifications
integer > 0	10	local, system, restart

vacuum_freeze_min_age

Specifies the cutoff age (in transactions) that VACUUM should use to decide whether to replace transaction IDs with FrozenXID while scanning a table.

For information about VACUUM and transaction ID management, see Managing Data with HAWQ and the PostgreSQL documentation.

Value Range	Default	Set Classifications
integer 0-100000000000	100000000	local, system, restart

xid_stop_limit

The number of transaction IDs prior to the ID where transaction ID wraparound occurs. When this limit is reached, HAWQ stops creating new transactions to avoid data loss due to transaction ID wraparound.

Value Range	Default	Set Classifications
integer > 0	20000000	local, system, restart