Analyzing Resource Manager Status
You can use several queries to force the resource manager to dump more details about active resource context status, current resource queue status, and HAWQ segment status.
Connection Track Status
Every query execution that requires resource allocation from the HAWQ resource manager has one connection track instance, which tracks the whole resource usage lifecycle. This dump lists all resource requests and allocated resources.
The following is an example query to obtain connection track status:
postgres=# SELECT * FROM dump_resource_manager_status(1);
dump_resource_manager_status
----------------------------------------------------------------------------------------
Dump resource manager connection track status to /tmp/resource_manager_conntrack_status
(1 row)
The following output is an example of resource context (connection track) status.
Number of free connection ids : 65535
Number of connection tracks having requests to handle : 0
Number of connection tracks having responses to send : 0
SOCK(client=192.0.2.0:37396:time=2015-11-15-20:54:35.379006),
CONN(id=44:user=role_2:queue=queue2:prog=3:time=2015-11-15-20:54:35.378631:lastact=2015-11-15-20:54:35.378631:
headqueue=2015-11-15-20:54:35.378631),ALLOC(session=89:resource=(1024 MB, 0.250000 CORE)x(1:min=1:act=-1):
slicesize=5:io bytes size=3905568:vseg limit per seg=8:vseg limit per query=1000:fixsegsize=1:reqtime=2015-11-15-20:54:35.379144:
alloctime=2015-11-15-20:54:35.379144:stmt=128 MB x 0),LOC(size=3:host(sdw3:3905568):host(sdw2:3905568):
host(sdw1:3905568)),RESOURCE(hostsize=0),MSG(id=259:size=96:contsize=96:recvtime=1969-12-31-16:00:00.0,
client=192.0.2.0:37396),COMMSTAT(fd=5:readbuffer=0:writebuffer=0
buffers:toclose=false:forceclose=false)
Output Field | Description |
---|---|
Number of free connection ids | Shows how many connection track ids are still available. The HAWQ resource manager supports a maximum of 65536 live connection track instances. |
Number of connection tracks having requests to handle | Counts the number of requests accepted by resource manager but not processed yet. |
Number of connection tracks having responses to send | Counts the number of responses generated by resource manager but not sent out yet. |
SOCK | Provides the request socket connection information. |
CONN | Provides information about the role name, the target queue, and the current status of the request: prog=1 means the connection is established; prog=2 means the connection is registered by role id; prog=3 means the connection is waiting for resource in the target queue; prog=4 means the resource has been allocated to this connection; prog>5 means some failure or abnormal status. |
ALLOC | Provides session id information, resource expectation, session level resource limits, statement level resource settings, estimated query workload by slice number, and so on. |
LOC | Provides query scan HDFS data locality information. |
RESOURCE | Provides information on the already allocated resource. |
MSG | Provides the latest received message information. |
COMMSTAT | Shows current socket communication buffer status. |
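As a quick way to read the CONN fields, the following Python sketch extracts the id, user, queue, and prog values from a connection track record and maps prog to the meanings listed in the table above. The function name `summarize_conn` is a hypothetical helper, not part of HAWQ, and the sketch assumes the record layout matches the example output shown earlier.

```python
import re

# Meanings of prog, taken from the CONN row of the table above.
PROG_MEANINGS = {
    1: "connection is established",
    2: "connection is registered by role id",
    3: "connection is waiting for resource in the target queue",
    4: "resource has been allocated to this connection",
}


def summarize_conn(record: str) -> dict:
    """Extract id, user, queue and prog from the CONN(...) part of a record."""
    start = record.find("CONN(")
    if start < 0:
        raise ValueError("no CONN section found")
    # CONN has no nested parentheses, so it ends at the first ')'.
    end = record.find(")", start)
    if end < 0:
        raise ValueError("CONN section is not closed")
    conn = record[start:end + 1]

    fields = {}
    for key in ("id", "user", "queue", "prog"):
        # These values contain no ':' characters, unlike the timestamp fields.
        match = re.search(r"[(:]" + key + r"=([^:)]*)", conn)
        if match is None:
            raise ValueError(f"field {key!r} not found in CONN section")
        fields[key] = match.group(1)

    prog = int(fields["prog"])
    if prog > 5:
        status = "failure or abnormal status"
    else:
        status = PROG_MEANINGS.get(prog, "not described in the table")
    return {
        "id": int(fields["id"]),
        "user": fields["user"],
        "queue": fields["queue"],
        "prog": prog,
        "status": status,
    }


# Sample CONN section copied from the example output above.
record = (
    "CONN(id=44:user=role_2:queue=queue2:prog=3:"
    "time=2015-11-15-20:54:35.378631:lastact=2015-11-15-20:54:35.378631:"
    "headqueue=2015-11-15-20:54:35.378631)"
)
summary = summarize_conn(record)
print(summary)
```

For the sample record, the summary reports that connection 44 from role_2 is waiting for resource in queue2.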
Resource Queue Status
You can dump more detailed status information for resource queues.
In addition to the information provided in pg_resqueue_status, the dump includes the YARN resource queue maximum capacity report, the total number of HAWQ resource queues, and the derived resource capacities of the HAWQ resource queues.
The following is a query to obtain resource queue status:
postgres=# SELECT * FROM dump_resource_manager_status(2);
dump_resource_manager_status
-------------------------------------------------------------------------------------
Dump resource manager resource queue status to /tmp/resource_manager_resqueue_status
(1 row)
The following is an example of resource queue status output.
Maximum capacity of queue in global resource manager cluster 1.000000
Number of resource queues : 4
QUEUE(name=pg_root:parent=NULL:children=3:busy=0:paused=0),
REQ(conn=0:request=0:running=0),
SEGCAP(ratio=4096:ratioidx=-1:segmem=128MB:segcore=0.031250:segnum=1536:segnummax=1536),
QUECAP(memmax=196608:coremax=48.000000:memper=100.000000:mempermax=100.000000:coreper=100.000000:corepermax=100.000000),
QUEUSE(alloc=(0 MB,0.000000 CORE):request=(0 MB,0.000000 CORE):inuse=(0 MB,0.000000 CORE))
QUEUE(name=pg_default:parent=pg_root:children=0:busy=0:paused=0),
REQ(conn=0:request=0:running=0),
SEGCAP(ratio=4096:ratioidx=-1:segmem=1024MB:segcore=0.250000:segnum=38:segnummax=76),
QUECAP(memmax=78643:coremax=19.000000:memper=20.000000:mempermax=40.000000:coreper=20.000000:corepermax=40.000000),
QUEUSE(alloc=(0 MB,0.000000 CORE):request=(0 MB,0.000000 CORE):inuse=(0 MB,0.000000 CORE))
Output Field | Description |
---|---|
Maximum capacity of queue in global resource manager cluster | YARN maximum capacity report for the resource queue. |
Number of resource queues | Total number of HAWQ resource queues. |
QUEUE | Provides basic structural information about the resource queue and whether it is busy dispatching resources to some queries. |
REQ | Provides concurrency counter and the status of waiting queues. |
SEGCAP | Provides the virtual segment resource quota and dispatchable number of virtual segments. |
QUECAP | Provides derived resource queue capacity and actual percentage of the cluster resource a queue can use. |
QUEUSE | Provides information about queue resource usage. |
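The numbers in the example output appear to be related: pg_default's memmax (78643) is about 40% (mempermax) of pg_root's memmax (196608), and segnummax (76) is that capacity divided by segmem (1024MB), rounded down. The Python sketch below parses the flat sections and reproduces that arithmetic. The relationship is inferred from the sample values only and is an assumption, not a documented HAWQ formula; `parse_section` and `derived_capacity` are hypothetical helper names.

```python
import re


def parse_section(text: str, name: str) -> dict:
    """Return the key=value pairs of a flat NAME(...) section as strings."""
    match = re.search(re.escape(name) + r"\(([^()]*)\)", text)
    if match is None:
        raise ValueError(f"section {name} not found")
    return dict(pair.split("=", 1) for pair in match.group(1).split(":"))


def derived_capacity(root_memmax_mb: int, percent: float, segmem_mb: int):
    """Assumed relationship: queue memory = root memory * percent / 100,
    virtual segments = queue memory // segment memory quota."""
    queue_mem_mb = root_memmax_mb * percent / 100
    return int(queue_mem_mb), int(queue_mem_mb // segmem_mb)


# Lines copied from the example output above.
root_quecap = parse_section(
    "QUECAP(memmax=196608:coremax=48.000000:memper=100.000000:"
    "mempermax=100.000000:coreper=100.000000:corepermax=100.000000),",
    "QUECAP",
)
default_segcap = parse_section(
    "SEGCAP(ratio=4096:ratioidx=-1:segmem=1024MB:segcore=0.250000:"
    "segnum=38:segnummax=76),",
    "SEGCAP",
)
default_quecap = parse_section(
    "QUECAP(memmax=78643:coremax=19.000000:memper=20.000000:"
    "mempermax=40.000000:coreper=20.000000:corepermax=40.000000),",
    "QUECAP",
)

root_mem = int(root_quecap["memmax"])
segmem = int(default_segcap["segmem"].removesuffix("MB"))

# Capacity at the queue's base share (memper) and at its upper limit (mempermax).
base = derived_capacity(root_mem, float(default_quecap["memper"]), segmem)
limit = derived_capacity(root_mem, float(default_quecap["mempermax"]), segmem)
print("base share :", base)
print("upper limit:", limit)
```

Under this assumption, the computed values agree with the segnum, segnummax, and memmax fields reported for pg_default in the sample. Note that `str.removesuffix` requires Python 3.9 or later.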
HAWQ Segment Status
Use the following query to obtain the status of a HAWQ segment.
postgres=# SELECT * FROM dump_resource_manager_status(3);
dump_resource_manager_status
-----------------------------------------------------------------------------------
Dump resource manager resource pool status to /tmp/resource_manager_respool_status
(1 row)
The following output shows the status of a HAWQ segment. This example describes a host named sdw1 that has a resource capacity of 64GB memory and 16 vcores. It currently has 64GB of memory available for use and holds 16 containers.
HOST_ID(id=0:hostname:sdw1)
HOST_INFO(FTSTotalMemoryMB=65536:FTSTotalCore=16:GRMTotalMemoryMB=0:GRMTotalCore=0)
HOST_AVAILABLITY(HAWQAvailable=true:GLOBAvailable=false)
HOST_RESOURCE(AllocatedMemory=65536:AllocatedCores=16.000000:AvailableMemory=65536:
AvailableCores=16.000000:IOBytesWorkload=0:SliceWorkload=0:LastUpdateTime=1447661681125637:
RUAlivePending=false)
HOST_RESOURCE_CONTAINERSET(ratio=4096:AllocatedMemory=65536:AvailableMemory=65536:
AllocatedCore=16.000000:AvailableCore:16.000000)
RESOURCE_CONTAINER(ID=0:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=1:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=2:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=3:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=4:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=5:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=6:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=7:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=8:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=9:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=10:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=11:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=12:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=13:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=14:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
RESOURCE_CONTAINER(ID=15:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
Output Field | Description |
---|---|
HOST_ID | Provides the recognized segment name and internal id. |
HOST_INFO | Provides the configured segment resource capacities. GRMTotalMemoryMB and GRMTotalCore show the limits reported by YARN; FTSTotalMemoryMB and FTSTotalCore show the limits configured in HAWQ. |
HOST_AVAILABILITY | Shows whether the segment is available from the point of view of the HAWQ fault tolerance service (FTS) and of YARN. |
HOST_RESOURCE | Shows current allocated and available resource. Estimated workload counters are also shown here. |
HOST_RESOURCE_CONTAINERSET | Shows the containers held on the segment. Each held container is listed as a RESOURCE_CONTAINER entry. |
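To cross-check the container list against the totals reported in HOST_RESOURCE, the following Python sketch sums the memory and cores of the RESOURCE_CONTAINER entries per host. The helper name `container_totals` is hypothetical, and the sketch assumes each entry follows the layout shown in the example output above.

```python
import re

# Matches one RESOURCE_CONTAINER entry as it appears in the example output.
CONTAINER_RE = re.compile(
    r"RESOURCE_CONTAINER\(ID=(\d+):MemoryMB=(\d+):Core=(\d+):"
    r"Life=(\d+):HostName=([^)]+)\)"
)


def container_totals(dump_text: str) -> dict:
    """Sum container count, memory and cores per host name."""
    totals = {}
    for _id, memory_mb, cores, _life, host in CONTAINER_RE.findall(dump_text):
        entry = totals.setdefault(
            host, {"containers": 0, "memory_mb": 0, "cores": 0}
        )
        entry["containers"] += 1
        entry["memory_mb"] += int(memory_mb)
        entry["cores"] += int(cores)
    return totals


# Rebuild the 16 container lines from the example output for host sdw1.
sample = "\n".join(
    f"RESOURCE_CONTAINER(ID={i}:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)"
    for i in range(16)
)
totals = container_totals(sample)
print(totals)
```

For the sample, the 16 containers of 4096 MB and 1 core each add up to 65536 MB and 16 cores, which matches the AllocatedMemory and AllocatedCores values in HOST_RESOURCE. In practice, the text of /tmp/resource_manager_respool_status could be passed to the same function.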