I. Introduction to Metrics

[Figure 1: Application performance monitoring]

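  1. Metrics is based on Netflix Spectator.
  2. Foundation-metrics loads every MetricsInitializer implementation through the SPI mechanism. Implementers control the execution order through getOrder in MetricsInitializer: the smaller the order value, the earlier the initializer runs.
  3. Metrics-core implements three kinds of MetricsInitializer:
    1. DefaultRegistryInitializer: instantiates and registers spectator-reg-servo. It uses a small order value so that it runs before the other two kinds of MetricsInitializer below.
    2. Meters Initializer: collects statistics such as TPS, latency, thread pools, and JVM resources.
    3. Publisher: outputs the statistics. Built-in publishers write to the log and expose the data through a RESTful interface.
  4. Metrics-prometheus provides integration with Prometheus.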
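The ordering rule in item 2 can be illustrated with a minimal sketch. The `OrderedInitializer` interface, its `name()` method, and the order values below are hypothetical stand-ins invented for this example; only the existence of `getOrder` and the "smaller runs first" rule come from the text above, and the real MetricsInitializer interface may look different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for MetricsInitializer; only getOrder() is taken from the text.
interface OrderedInitializer {
  int getOrder();

  String name();
}

class InitializerOrderDemo {
  // Returns the initializer names in the order they would run: smallest order first.
  static List<String> executionOrder(List<OrderedInitializer> found) {
    List<OrderedInitializer> sorted = new ArrayList<>(found);
    sorted.sort(Comparator.comparingInt(OrderedInitializer::getOrder));
    List<String> names = new ArrayList<>();
    for (OrderedInitializer initializer : sorted) {
      names.add(initializer.name());
    }
    return names;
  }

  // Small factory so the example does not need one class per initializer.
  static OrderedInitializer of(String name, int order) {
    return new OrderedInitializer() {
      @Override
      public int getOrder() {
        return order;
      }

      @Override
      public String name() {
        return name;
      }
    };
  }

  public static void main(String[] args) {
    // Order values are made up for illustration.
    List<OrderedInitializer> found = Arrays.asList(
        of("Publisher", 100),
        of("DefaultRegistryInitializer", -10),
        of("MetersInitializer", 0));
    System.out.println(executionOrder(found));
  }
}
```

Because the registry initializer has the smallest order value in this sketch, it is listed first, which mirrors why DefaultRegistryInitializer is given a small order.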

II. Usage

1. Maven dependency

  <dependency>
    <groupId>org.apache.servicecomb</groupId>
    <artifactId>metrics-core</artifactId>
  </dependency>

To integrate with Prometheus, also add the following dependency:

  <dependency>
    <groupId>org.apache.servicecomb</groupId>
    <artifactId>metrics-prometheus</artifactId>
  </dependency>

Note: set the version field to the actual version number. If the version is already declared in dependencyManagement, it does not need to be specified here.

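2. Configuration

| Configuration item | Default | Meaning |
|---|---|---|
| servicecomb.metrics.window_time | 60000 | Statistics window, in milliseconds. Periodic data such as TPS and latency is updated once per window, so a value read during a window is actually the value of the previous window. |
| servicecomb.metrics.invocation.latencyDistribution | | Definition of the latency distribution buckets, in milliseconds. For example, 0,1,10,100,1000 defines the buckets [0, 1), [1, 10), [10, 100), [100, 1000), [1000, ). |
| servicecomb.metrics.Consumer.invocation.slow.enabled | false | Whether to enable slow invocation detection on the Consumer side. Appending the suffix .${service}.${schema}.${operation} supports a 4-level priority definition. |
| servicecomb.metrics.Consumer.invocation.slow.msTime | 1000 | If the latency exceeds this value, a log entry is written immediately, recording the time spent in each stage of the invocation. Appending the suffix .${service}.${schema}.${operation} supports a 4-level priority definition. |
| servicecomb.metrics.Provider.invocation.slow.enabled | false | Whether to enable slow invocation detection on the Provider side. Appending the suffix .${service}.${schema}.${operation} supports a 4-level priority definition. |
| servicecomb.metrics.Provider.invocation.slow.msTime | 1000 | If the latency exceeds this value, a log entry is written immediately, recording the time spent in each stage of the invocation. Appending the suffix .${service}.${schema}.${operation} supports a 4-level priority definition. |
| servicecomb.metrics.prometheus.address | 0.0.0.0:9696 | Prometheus listening address. |
| servicecomb.metrics.publisher.defaultLog.enabled | false | Whether to output the default statistics log. |
| servicecomb.metrics.publisher.defaultLog.endpoints.client.detail.enabled | false | Whether to output a statistics log line for every client endpoint. The volume depends on the number of target ip:port pairs and can be large, so it is disabled by default. |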
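The 4-level priority of the slow invocation items can be pictured as a lookup that tries progressively shorter keys. The sketch below is one plausible reading, not the framework's actual resolution code: the assumption that the most specific key (service, schema, operation) wins, the `resolveMsTime` helper, and the in-memory `Map` standing in for the real configuration source are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class SlowInvocationConfigDemo {
  static final String PREFIX = "servicecomb.metrics.Consumer.invocation.slow.msTime";

  // Tries the most specific key first, then falls back level by level.
  // The fallback order is an assumption made for this illustration.
  static long resolveMsTime(Map<String, String> config, String service, String schema,
      String operation) {
    String[] keys = {
        PREFIX + "." + service + "." + schema + "." + operation, // level 1: operation
        PREFIX + "." + service + "." + schema,                   // level 2: schema
        PREFIX + "." + service,                                  // level 3: service
        PREFIX                                                   // level 4: global
    };
    for (String key : keys) {
      String value = config.get(key);
      if (value != null) {
        return Long.parseLong(value.trim());
      }
    }
    return 1000L; // default value listed in the configuration table
  }

  public static void main(String[] args) {
    Map<String, String> config = new HashMap<>();
    config.put(PREFIX, "1000");
    config.put(PREFIX + ".perf1", "500");
    config.put(PREFIX + ".perf1.impl.syncQuery", "40");

    System.out.println(resolveMsTime(config, "perf1", "impl", "syncQuery"));
    System.out.println(resolveMsTime(config, "perf1", "impl", "asyncQuery"));
    System.out.println(resolveMsTime(config, "other", "impl", "syncQuery"));
  }
}
```

With this reading, a single noisy operation can be given a tighter threshold without changing the threshold of the rest of the service.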

3. Slow invocation detection

Once slow invocation detection is enabled, a log entry like the following is written immediately whenever a slow invocation occurs:

  2019-04-02 23:01:09,103[WARN][pool-7-thread-74][5ca37935c00ff2c7-350076] - slow(40 ms) invocation, CONSUMER highway perf1.impl.syncQuery
  http method: GET
  url : /v1/syncQuery/{id}/
  server : highway://192.168.0.152:7070?login=true
  status code: 200
  total : 50.760 ms
  prepare : 0.0 ms
  handlers request : 0.0 ms
  client filters request : 0.0 ms
  send request : 0.5 ms
  get connection : 0.0 ms
  write to buf : 0.5 ms
  wait response : 50.727 ms
  wake consumer : 0.23 ms
  client filters response: 0.2 ms
  handlers response : 0.0 ms (SlowInvocationLogger.java:121)

Here 5ca37935c00ff2c7-350076 has the structure ${traceId}-${invocationId}. It is referenced through %marker in the log4j2 or logback output pattern.

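4. Access through the RESTful interface

As long as the microservice exposes a REST port, open http://ip:port/metrics in a browser. The response is JSON data in a format similar to the following: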

  {
    "servicecomb.vertx.endpoints(address=192.168.0.124:7070,statistic=connectCount,type=client)": 0.0,
    "servicecomb.vertx.endpoints(address=192.168.0.124:7070,statistic=disconnectCount,type=client)": 0.0,
    "servicecomb.vertx.endpoints(address=192.168.0.124:7070,statistic=connections,type=client)": 1.0,
    "servicecomb.vertx.endpoints(address=192.168.0.124:7070,statistic=bytesRead,type=client)": 508011.0,
    "servicecomb.vertx.endpoints(address=192.168.0.124:7070,statistic=bytesWritten,type=client)": 542163.0,
    "servicecomb.vertx.endpoints(address=192.168.0.124:7070,statistic=queueCount,type=client)": 0.0,
    "servicecomb.vertx.endpoints(address=0.0.0.0:7070,statistic=connectCount,type=server)": 0.0,
    "servicecomb.vertx.endpoints(address=0.0.0.0:7070,statistic=disconnectCount,type=server)": 0.0,
    "servicecomb.vertx.endpoints(address=0.0.0.0:7070,statistic=connections,type=server)": 1.0,
    "servicecomb.vertx.endpoints(address=0.0.0.0:7070,statistic=bytesRead,type=server)": 542163.0,
    "servicecomb.vertx.endpoints(address=0.0.0.0:7070,statistic=bytesWritten,type=server)": 508011.0,
    "servicecomb.vertx.endpoints(address=0.0.0.0:7070,statistic=rejectByConnectionLimit,type=server)": 0.0,
    "servicecomb.vertx.endpoints(address=localhost:8080,statistic=connectCount,type=server)": 0.0,
    "servicecomb.vertx.endpoints(address=localhost:8080,statistic=disconnectCount,type=server)": 0.0,
    "servicecomb.vertx.endpoints(address=localhost:8080,statistic=connections,type=server)": 0.0,
    "servicecomb.vertx.endpoints(address=localhost:8080,statistic=bytesRead,type=server)": 0.0,
    "servicecomb.vertx.endpoints(address=localhost:8080,statistic=bytesWritten,type=server)": 0.0,
    "servicecomb.vertx.endpoints(address=localhost:8080,statistic=rejectByConnectionLimit,type=server)": 0.0,
    "threadpool.completedTaskCount(id=cse.executor.groupThreadPool-group0)": 4320.0,
    "threadpool.rejectedCount(id=cse.executor.groupThreadPool-group0)": 0.0,
    "threadpool.taskCount(id=cse.executor.groupThreadPool-group0)": 4320.0,
    "threadpool.currentThreadsBusy(id=cse.executor.groupThreadPool-group0)": 0.0,
    "threadpool.poolSize(id=cse.executor.groupThreadPool-group0)": 4.0,
    "threadpool.maxThreads(id=cse.executor.groupThreadPool-group0)": 10.0,
    "threadpool.queueSize(id=cse.executor.groupThreadPool-group0)": 0.0,
    "threadpool.corePoolSize(id=cse.executor.groupThreadPool-group0)": 4.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,scope=[0,1),status=200,transport=highway,type=latencyDistribution)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,scope=[1,3),status=200,transport=highway,type=latencyDistribution)": 0.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,scope=[3,10),status=200,transport=highway,type=latencyDistribution)": 0.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,scope=[10,100),status=200,transport=highway,type=latencyDistribution)": 0.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,scope=[100,),status=200,transport=highway,type=latencyDistribution)": 0.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,scope=[0,1),status=200,transport=highway,type=latencyDistribution)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,scope=[1,3),status=200,transport=highway,type=latencyDistribution)": 0.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,scope=[3,10),status=200,transport=highway,type=latencyDistribution)": 0.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,scope=[10,100),status=200,transport=highway,type=latencyDistribution)": 0.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,scope=[100,),status=200,transport=highway,type=latencyDistribution)": 0.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=total,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=total,statistic=totalTime,status=200,transport=highway,type=stage)": 0.25269420000000004,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=total,statistic=max,status=200,transport=highway,type=stage)": 2.7110000000000003E-4,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=handlers_request,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=handlers_request,statistic=totalTime,status=200,transport=highway,type=stage)": 0.0079627,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=handlers_request,statistic=max,status=200,transport=highway,type=stage)": 1.74E-5,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=handlers_response,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=handlers_response,statistic=totalTime,status=200,transport=highway,type=stage)": 0.0060666,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=handlers_response,statistic=max,status=200,transport=highway,type=stage)": 1.08E-5,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=prepare,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=prepare,statistic=totalTime,status=200,transport=highway,type=stage)": 0.016679600000000003,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=prepare,statistic=max,status=200,transport=highway,type=stage)": 2.68E-5,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=queue,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=queue,statistic=totalTime,status=200,transport=highway,type=stage)": 0.08155480000000001,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=queue,statistic=max,status=200,transport=highway,type=stage)": 2.1470000000000001E-4,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=execution,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=execution,statistic=totalTime,status=200,transport=highway,type=stage)": 0.0098285,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=execution,statistic=max,status=200,transport=highway,type=stage)": 4.3100000000000004E-5,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=server_filters_request,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=server_filters_request,statistic=totalTime,status=200,transport=highway,type=stage)": 0.0170669,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=server_filters_request,statistic=max,status=200,transport=highway,type=stage)": 3.6400000000000004E-5,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=server_filters_response,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=server_filters_response,statistic=totalTime,status=200,transport=highway,type=stage)": 0.0196985,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=server_filters_response,statistic=max,status=200,transport=highway,type=stage)": 4.8100000000000004E-5,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=producer_send_response,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=producer_send_response,statistic=totalTime,status=200,transport=highway,type=stage)": 0.0880885,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=PRODUCER,stage=producer_send_response,statistic=max,status=200,transport=highway,type=stage)": 1.049E-4,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=total,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=total,statistic=totalTime,status=200,transport=highway,type=stage)": 0.9796976000000001,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=total,statistic=max,status=200,transport=highway,type=stage)": 6.720000000000001E-4,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=handlers_request,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=handlers_request,statistic=totalTime,status=200,transport=highway,type=stage)": 0.012601500000000002,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=handlers_request,statistic=max,status=200,transport=highway,type=stage)": 3.5000000000000004E-5,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=handlers_response,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=handlers_response,statistic=totalTime,status=200,transport=highway,type=stage)": 0.0066785,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=handlers_response,statistic=max,status=200,transport=highway,type=stage)": 3.21E-5,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=prepare,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=prepare,statistic=totalTime,status=200,transport=highway,type=stage)": 0.010363800000000001,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=prepare,statistic=max,status=200,transport=highway,type=stage)": 2.85E-5,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=client_filters_request,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=client_filters_request,statistic=totalTime,status=200,transport=highway,type=stage)": 0.0060282,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=client_filters_request,statistic=max,status=200,transport=highway,type=stage)": 9.2E-6,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_send_request,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_send_request,statistic=totalTime,status=200,transport=highway,type=stage)": 0.099984,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_send_request,statistic=max,status=200,transport=highway,type=stage)": 1.1740000000000001E-4,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_get_connection,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_get_connection,statistic=totalTime,status=200,transport=highway,type=stage)": 0.006916800000000001,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_get_connection,statistic=max,status=200,transport=highway,type=stage)": 5.83E-5,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_write_to_buf,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_write_to_buf,statistic=totalTime,status=200,transport=highway,type=stage)": 0.0930672,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_write_to_buf,statistic=max,status=200,transport=highway,type=stage)": 1.1580000000000001E-4,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_wait_response,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_wait_response,statistic=totalTime,status=200,transport=highway,type=stage)": 0.7654931,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_wait_response,statistic=max,status=200,transport=highway,type=stage)": 5.547E-4,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_wake_consumer,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_wake_consumer,statistic=totalTime,status=200,transport=highway,type=stage)": 0.0502085,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=consumer_wake_consumer,statistic=max,status=200,transport=highway,type=stage)": 3.7370000000000003E-4,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=client_filters_response,statistic=count,status=200,transport=highway,type=stage)": 4269.0,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=client_filters_response,statistic=totalTime,status=200,transport=highway,type=stage)": 0.0227188,
    "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,stage=client_filters_response,statistic=max,status=200,transport=highway,type=stage)": 4.0E-5
  }
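Each key in this response has the shape name(tag=value,tag=value,...). A consumer that wants to filter by tag has to be careful, because a scope value such as [0,1) or [100,) itself contains a comma. The following is a minimal sketch of one way to split such a key; the `MetricKeyDemo` class and its methods are hypothetical helpers written for this example, not part of ServiceComb, and the parsing rule (a comma starts a new tag only when the following piece contains `=`) is an assumption that fits the sample above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MetricKeyDemo {
  // Returns the part before the opening parenthesis, or the whole key if there is none.
  static String name(String key) {
    int open = key.indexOf('(');
    return open < 0 ? key : key.substring(0, open);
  }

  // Splits the tag list. A piece without '=' is treated as the continuation of the
  // previous value, which keeps scope values such as "[0,1)" in one piece.
  static Map<String, String> tags(String key) {
    Map<String, String> result = new LinkedHashMap<>();
    int open = key.indexOf('(');
    if (open < 0 || !key.endsWith(")")) {
      return result;
    }
    String body = key.substring(open + 1, key.length() - 1);
    String currentKey = null;
    StringBuilder currentValue = new StringBuilder();
    for (String piece : body.split(",", -1)) {
      int eq = piece.indexOf('=');
      if (eq > 0) {
        if (currentKey != null) {
          result.put(currentKey, currentValue.toString());
        }
        currentKey = piece.substring(0, eq);
        currentValue = new StringBuilder(piece.substring(eq + 1));
      } else if (currentKey != null) {
        currentValue.append(',').append(piece);
      }
    }
    if (currentKey != null) {
      result.put(currentKey, currentValue.toString());
    }
    return result;
  }

  public static void main(String[] args) {
    String key = "servicecomb.invocation(operation=perf1.impl.syncQuery,role=CONSUMER,"
        + "scope=[100,),status=200,transport=highway,type=latencyDistribution)";
    System.out.println(name(key));
    System.out.println(tags(key));
  }
}
```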

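III. Summary of metrics

1. CPU

| Name | Tag keys | Tag values | Meaning |
|---|---|---|---|
| os | type | cpu | System CPU usage in the current window, Solaris mode. |
| | | processCpu | CPU usage of the microservice process in the current window, IRIX mode. processCpu divided by cpu is approximately equal to the number of system CPUs. |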

2. NET

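| Name | Tag keys | Tag values | Meaning |
|---|---|---|---|
| os | type | net | |
| | statistic | send | Average number of bytes sent per second in the current window (Bps). |
| | | receive | Average number of bytes received per second in the current window (Bps). |
| | | sendPackets | Average number of packets sent per second in the current window (pps). |
| | | receivePackets | Average number of packets received per second in the current window (pps). |
| | interface | | Name of the network interface device. |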

3. vertx client endpoints

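| Name | Tag keys | Tag values | Meaning |
|---|---|---|---|
| servicecomb.vertx.endpoints | type | client | |
| | address | ${ip}:${port} | ip:port of the server. |
| | statistic | connectCount | Number of connections initiated in the current window. |
| | | disconnectCount | Number of disconnections in the current window. |
| | | queueCount | Number of requests waiting to obtain a connection from the http connection pool. |
| | | connections | Number of connections at the current moment. |
| | | bytesRead | Average number of bytes received per second in the current window (Bps). This is an application-level statistic: unlike the data read from the network interface, it does not include packet headers. For http messages, it does not include the http header size. |
| | | bytesWritten | Average number of bytes sent per second in the current window (Bps). This is an application-level statistic: unlike the data read from the network interface, it does not include packet headers. For http messages, it does not include the http header size. |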

4. vertx server endpoints

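| Name | Tag keys | Tag values | Meaning |
|---|---|---|---|
| servicecomb.vertx.endpoints | type | server | |
| | address | ${ip}:${port} | The ip:port being listened on. |
| | statistic | connectCount | Number of connections accepted in the current window. |
| | | disconnectCount | Number of disconnections in the current window. |
| | | rejectByConnectionLimit | Number of connections actively closed in the current window because the connection limit was exceeded. |
| | | connections | Number of connections at the current moment. |
| | | bytesRead | Average number of bytes received per second in the current window (Bps). This is an application-level statistic: unlike the data read from the network interface, it does not include packet headers. For http messages, it does not include the http header size. |
| | | bytesWritten | Average number of bytes sent per second in the current window (Bps). This is an application-level statistic: unlike the data read from the network interface, it does not include packet headers. For http messages, it does not include the http header size. |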

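5. invocation latency distribution

| Name | Tag keys | Tag values | Meaning |
|---|---|---|---|
| servicecomb.invocation | role | CONSUMER, PRODUCER, EDGE | Whether the statistic is for the CONSUMER, PRODUCER, or EDGE side. |
| | operation | ${microserviceName}.${schemaId}.${operationName} | Name of the invoked method. |
| | transport | highway or rest | Transport channel on which the invocation took place. |
| | status | http status code | |
| | type | latencyDistribution | Invocation latency distribution. |
| | scope | [${min}, ${max}) | Number of invocations in the current window whose latency is greater than or equal to min and less than max. [${min},) means that max is unbounded. |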
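To make the scope semantics concrete, here is a minimal sketch that assigns latencies to half-open buckets built from a boundary list such as 0,1,10,100,1000. The `LatencyScopeDemo` class is a hypothetical helper for illustration only and does not reproduce how ServiceComb computes the distribution internally; the label format follows the scope values shown in the sample response.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LatencyScopeDemo {
  // Returns the scope label for one latency. Boundaries must be sorted ascending.
  // Each bucket is half-open: min is included, max is excluded.
  static String scopeOf(long[] boundaries, double latencyMs) {
    if (boundaries.length == 0 || latencyMs < boundaries[0]) {
      throw new IllegalArgumentException("latency below first boundary: " + latencyMs);
    }
    for (int i = 0; i + 1 < boundaries.length; i++) {
      if (latencyMs < boundaries[i + 1]) {
        return "[" + boundaries[i] + "," + boundaries[i + 1] + ")";
      }
    }
    // The last bucket has no upper bound.
    return "[" + boundaries[boundaries.length - 1] + ",)";
  }

  // Counts how many latencies fall into each scope.
  static Map<String, Integer> distribution(long[] boundaries, double[] latenciesMs) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (double latency : latenciesMs) {
      counts.merge(scopeOf(boundaries, latency), 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    long[] boundaries = {0, 1, 10, 100, 1000};
    double[] latenciesMs = {0.4, 0.9, 1.0, 57.0, 2500.0}; // made-up sample latencies
    System.out.println(distribution(boundaries, latenciesMs));
  }
}
```

Note that a latency exactly equal to a boundary, such as 1.0 ms here, belongs to the bucket that starts at that boundary.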

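6. invocation consumer stage latency

| Name | Tag keys | Tag values | Meaning |
|---|---|---|---|
| servicecomb.invocation | role | CONSUMER | Statistics of the CONSUMER side. |
| | operation | ${microserviceName}.${schemaId}.${operationName} | Name of the invoked method. |
| | transport | highway or rest | Transport channel on which the invocation took place. |
| | status | http status code | |
| | type | stage | Stage latency. |
| | stage | total | The whole process. |
| | | prepare | Initialization. |
| | | handlers_request | Request flow of the handler chain. |
| | | client_filters_request | Request flow of the http client filter chain. This stage exists only for the rest transport. |
| | | consumer_send_request | Sending the request; includes consumer_get_connection and consumer_write_to_buf. |
| | | consumer_get_connection | Obtaining a connection from the connection pool. |
| | | consumer_write_to_buf | Writing data to the network buffer. |
| | | consumer_wait_response | Waiting for the server response. |
| | | consumer_wake_consumer | In the synchronous flow, the time from waking up the waiting thread after the response arrives until that thread starts processing the response. |
| | | client_filters_response | Response flow of the http client filter chain. |
| | | handlers_response | Response flow of the handler chain. |
| | statistic | count | Average number of invocations per second, that is, TPS. count = number of invocations in the window / window length (seconds). |
| | | totalTime | In seconds. totalTime = total time spent on invocations in the current window / window length (seconds). Dividing totalTime by count gives the average latency. |
| | | max | In seconds. Maximum time spent in the current window. |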
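Because count and totalTime are both divided by the same window length, their ratio is the average latency per invocation. The sketch below works this through with the CONSUMER stage=total values from the sample response in section II.4 (totalTime 0.9796976000000001, count 4269.0). `StageLatencyDemo` is a hypothetical helper written for this example.

```java
class StageLatencyDemo {
  // Average latency in seconds. The window length cancels out because both inputs
  // are already per-second rates over the same window.
  static double averageLatencySeconds(double totalTime, double count) {
    if (count == 0.0) {
      return 0.0; // no invocations in the window, so there is nothing to average
    }
    return totalTime / count;
  }

  public static void main(String[] args) {
    double totalTime = 0.9796976000000001; // CONSUMER, stage=total, statistic=totalTime
    double count = 4269.0;                 // CONSUMER, stage=total, statistic=count
    double averageMs = averageLatencySeconds(totalTime, count) * 1000.0;
    System.out.printf("average latency: %.4f ms%n", averageMs);
  }
}
```

For these sample values the average comes out at roughly 0.23 ms per invocation, which is consistent with every invocation landing in the [0,1) scope of the latency distribution.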

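7. invocation producer stage latency

| Name | Tag keys | Tag values | Meaning |
|---|---|---|---|
| servicecomb.invocation | role | PRODUCER | Statistics of the PRODUCER side. |
| | operation | ${microserviceName}.${schemaId}.${operationName} | Name of the invoked method. |
| | transport | highway or rest | Transport channel on which the invocation took place. |
| | status | http status code | |
| | type | stage | Stage latency. |
| | stage | total | The whole process. |
| | | prepare | Initialization. |
| | | queue | Meaningful only when a thread pool is used. The time the invocation spends queued in the thread pool. |
| | | server_filters_request | Request flow of the http server filter chain. This stage exists only for the rest transport. |
| | | handlers_request | Request flow of the handler chain. |
| | | execution | The business method. |
| | | handlers_response | Response flow of the handler chain. |
| | | server_filters_response | Response flow of the http server filter chain. |
| | | producer_send_response | Sending the response. |
| | statistic | count | Average number of invocations per second, that is, TPS. count = number of invocations in the window / window length (seconds). |
| | | totalTime | In seconds. totalTime = total time spent on invocations in the current window / window length (seconds). Dividing totalTime by count gives the average latency. |
| | | max | In seconds. Maximum time spent in the current window. |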

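8. invocation edge stage latency

| Name | Tag keys | Tag values | Meaning |
|---|---|---|---|
| servicecomb.invocation | role | EDGE | Statistics of the EDGE side. |
| | operation | ${microserviceName}.${schemaId}.${operationName} | Name of the invoked method. |
| | transport | highway or rest | Transport channel on which the invocation took place. |
| | status | http status code | |
| | type | stage | Stage latency. |
| | stage | total | The whole process. |
| | | prepare | Initialization. |
| | | queue | Meaningful only when a thread pool is used. The time the invocation spends queued in the thread pool. |
| | | server_filters_request | Request flow of the http server filter chain. |
| | | handlers_request | Request flow of the handler chain. |
| | | client_filters_request | Request flow of the http client filter chain. |
| | | consumer_send_request | Sending the request; includes consumer_get_connection and consumer_write_to_buf. |
| | | consumer_get_connection | Obtaining a connection from the connection pool. |
| | | consumer_write_to_buf | Writing data to the network buffer. |
| | | consumer_wait_response | Waiting for the server response. |
| | | consumer_wake_consumer | In the synchronous flow, the time from waking up the waiting thread after the response arrives until that thread starts processing the response. |
| | | client_filters_response | Response flow of the http client filter chain. |
| | | handlers_response | Response flow of the handler chain. |
| | | server_filters_response | Response flow of the http server filter chain. |
| | | producer_send_response | Sending the response. |
| | statistic | count | Average number of invocations per second, that is, TPS. count = number of invocations in the window / window length (seconds). |
| | | totalTime | In seconds. totalTime = total time spent on invocations in the current window / window length (seconds). Dividing totalTime by count gives the average latency. |
| | | max | In seconds. Maximum time spent in the current window. |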

9. threadpool

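| Name | Tag keys | Tag values | Meaning |
|---|---|---|---|
| threadpool.corePoolSize | id | ${threadPoolName} | Minimum number of threads. |
| threadpool.maxThreads | | | Maximum number of threads allowed. |
| threadpool.poolSize | | | Current actual number of threads. |
| threadpool.currentThreadsBusy | | | Current number of active threads, that is, the number of tasks being executed right now. |
| threadpool.queueSize | | | Number of tasks currently queued. |
| threadpool.rejectedCount | | | Average number of tasks rejected per second in the current window. |
| threadpool.taskCount | | | Average number of tasks submitted per second in the window. taskCount = (completed + queue + active) / window length (seconds). |
| threadpool.completedTaskCount | | | Average number of tasks completed per second in the window. completedTaskCount = completed / window length (seconds). |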

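IV. Customization

Because ServiceComb has already initialized the servo registry, business code does not need to create another registry.

Implement the MetricsInitializer interface to define business-level Meters or a custom Publisher, then declare the implementation through the SPI mechanism.

1. Meters:

The ability to create Meters is provided entirely by spectator; see the Netflix Spectator documentation.

2. Publisher:

For periodic output, such as logging, subscribe to org.apache.servicecomb.foundation.metrics.PolledEvent through the eventBus. PolledEvent.getMeters() returns the statistics of the current window.

For non-periodic output, such as access through the RESTful interface, globalRegistry.iterator() returns the statistics of the current window.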