Analyzing Resource Manager Status

You can use several queries to force the resource manager to dump more details about active resource context status, current resource queue status, and HAWQ segment status.

Connection Track Status

Any query execution requiring resource allocation from HAWQ resource manager has one connection track instance tracking the whole resource usage lifecycle. You can find all resource requests and allocated resources in this dump.

The following is an example query to obtain connection track status:

postgres=# SELECT * FROM dump_resource_manager_status(1);
                              dump_resource_manager_status
----------------------------------------------------------------------------------------
 Dump resource manager connection track status to /tmp/resource_manager_conntrack_status
(1 row)

The following output is an example of resource context (connection track) status.

Number of free connection ids : 65535
Number of connection tracks having requests to handle : 0
Number of connection tracks having responses to send : 0SOCK(client=192.0.2.0:37396:time=2015-11-15-20:54:35.379006),
CONN(id=44:user=role_2:queue=queue2:prog=3:time=2015-11-15-20:54:35.378631:lastact=2015-11-15-20:54:35.378631:
headqueue=2015-11-15-20:54:35.378631),ALLOC(session=89:resource=(1024 MB, 0.250000 CORE)x(1:min=1:act=-1):
slicesize=5:io bytes size=3905568:vseg limit per seg=8:vseg limit per query=1000:fixsegsize=1:reqtime=2015-11-15-20:54:35.379144:
alloctime=2015-11-15-20:54:35.379144:stmt=128 MB x 0),LOC(size=3:host(sdw3:3905568):host(sdw2:3905568):
host(sdw1:3905568)),RESOURCE(hostsize=0),MSG(id=259:size=96:contsize=96:recvtime=1969-12-31-16:00:00.0,
client=192.0.2.0:37396),COMMSTAT(fd=5:readbuffer=0:writebuffer=0
buffers:toclose=false:forceclose=false)
Output Field Description
Number of free connection ids Provides connection track id resource. HAWQ resource manager supports maximum 65536 live connection track instances.
Number of connection tracks having requests to handle Counts the number of requests accepted by resource manager but not processed yet.
Number of connection tracks having responses to send Counts the number of responses generated by resource manager but not sent out yet.
SOCK Provides the request socket connection information.
CONN Provides the information about the role name, target queue, current status of the request:

prog=1 means the connection is established

prog=2 means the connection is registered by role id

prog=3 means the connection is waiting for resource in the target queue

prog=4 means the resource has been allocated to this connection

prog>5 means some failure or abnormal statuses
ALLOC Provides session id information, resource expectation, session level resource limits, statement level resource settings, estimated query workload by slice number, and so on.
LOC Provides query scan HDFS data locality information.
RESOURCE Provides information on the already allocated resource.
MSG Provides the latest received message information.
COMMSTAT Shows current socket communication buffer status.

Resource Queue Status

You can get more details of the status of resource queues.

Besides the information provided in pg_resqueue_status, you can also get YARN resource queue maximum capacity report, total number of HAWQ resource queues, and HAWQ resource queues’ derived resource capacities.

The following is a query to obtain resource queue status:

postgres=# SELECT * FROM dump_resource_manager_status(2);
                            dump_resource_manager_status
-------------------------------------------------------------------------------------
 Dump resource manager resource queue status to /tmp/resource_manager_resqueue_status
(1 row)

The possible output of resource queue status is shown as below.

Maximum capacity of queue in global resource manager cluster 1.000000

Number of resource queues : 4

QUEUE(name=pg_root:parent=NULL:children=3:busy=0:paused=0),
REQ(conn=0:request=0:running=0),
SEGCAP(ratio=4096:ratioidx=-1:segmem=128MB:segcore=0.031250:segnum=1536:segnummax=1536),
QUECAP(memmax=196608:coremax=48.000000:memper=100.000000:mempermax=100.000000:coreper=100.000000:corepermax=100.000000),
QUEUSE(alloc=(0 MB,0.000000 CORE):request=(0 MB,0.000000 CORE):inuse=(0 MB,0.000000 CORE))

QUEUE(name=pg_default:parent=pg_root:children=0:busy=0:paused=0),
REQ(conn=0:request=0:running=0),
SEGCAP(ratio=4096:ratioidx=-1:segmem=1024MB:segcore=0.250000:segnum=38:segnummax=76),
QUECAP(memmax=78643:coremax=19.000000:memper=20.000000:mempermax=40.000000:coreper=20.000000:corepermax=40.000000),
QUEUSE(alloc=(0 MB,0.000000 CORE):request=(0 MB,0.000000 CORE):inuse=(0 MB,0.000000 CORE))
Output Field Description
Maximum capacity of queue in global resource manager cluster YARN maximum capacity report for the resource queue.
Number of resource queues Total number of HAWQ resource queues.
QUEUE Provides basic structural information about the resource queue and whether it is busy dispatching resources to some queries.
REQ Provides concurrency counter and the status of waiting queues.
SEGCAP Provides the virtual segment resource quota and dispatchable number of virtual segments.
QUECAP Provides derived resource queue capacity and actual percentage of the cluster resource a queue can use.
QUEUSE Provides information about queue resource usage.

HAWQ Segment Status

Use the following query to obtain the status of a HAWQ segment.

postgres=# SELECT * FROM dump_resource_manager_status(3);
                           dump_resource_manager_status
-----------------------------------------------------------------------------------
 Dump resource manager resource pool status to /tmp/resource_manager_respool_status
(1 row)

The following output shows the status of a HAWQ segment status. This example describes a host named sdw1 having resource capacity 64GB memory and 16 vcore. It now has 64GB available resource ready for use and 16 containers are held.

HOST_ID(id=0:hostname:sdw1)
HOST_INFO(FTSTotalMemoryMB=65536:FTSTotalCore=16:GRMTotalMemoryMB=0:GRMTotalCore=0)
HOST_AVAILABLITY(HAWQAvailable=true:GLOBAvailable=false)
HOST_RESOURCE(AllocatedMemory=65536:AllocatedCores=16.000000:AvailableMemory=65536:
AvailableCores=16.000000:IOBytesWorkload=0:SliceWorkload=0:LastUpdateTime=1447661681125637:
RUAlivePending=false)
HOST_RESOURCE_CONTAINERSET(ratio=4096:AllocatedMemory=65536:AvailableMemory=65536:
AllocatedCore=16.000000:AvailableCore:16.000000)
        RESOURCE_CONTAINER(ID=0:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=1:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=2:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=3:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=4:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=5:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=6:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=7:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=8:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=9:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=10:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=11:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=12:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=13:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=14:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
        RESOURCE_CONTAINER(ID=15:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
Output Field Description
HOST_ID Provides the recognized segment name and internal id.
HOST_INFO Provides the configured segment resource capacities. GRMTotalMemoryMB and GRMTotalCore shows the limits reported by YARN, FTSTotalMemoryMB and FTSTotalCore show the limits configured in HAWQ.
HOST_AVAILABILITY Shows if the segment is available from HAWQ fault tolerance service (FTS) view or YARN view.
HOST_RESOURCE Shows current allocated and available resource. Estimated workload counters are also shown here.
HOST_RESOURCE_CONTAINERSET Shows each held containers.