Rabbitmq监控相关
- 官方文档: http://www.rabbitmq.com/monitoring.html
1. Management UI
启用了rabbitmq_management 插件后,默认在 15672端口打开一个web服务 Management UI,登录进入后可以进行监控和管理。
上图Overview页面即可看到rabbitmq集群的总体运行情况,了解各种容量信息。
- (1) 上半部为当前rabbitmq中间件的总体情况:排队的消息数量,连接、通道、路由、队列等对象的数量
- (2) 节点系统资源使用情况,通过系统资源使用情况可了解集群总体的容量情况。
问题
目前 rabbitmq 中的 队列对象的数量超过了一万多个,vhost 也有 5400多个,连接36000多,Management UI 打开很慢需要等很久。
2. HTTP API
启用了rabbitmq_management 插件后,默认在 15672端口打开一个web服务,该服务除了web管理页面外,还提供有API接口。
2.1. HTTP API说明
- http://rabbitmqhost:15672/api/
- http://rabbitmqhost:15672/doc/stats.html
2.2. 使用范例
GET /api/overview
通过 /api/overview 接口,我们可以了解节点的总体情况。返回结果类似management页面overview的第(1)部分。
$ curl -i -u username:password http://rabbitmqhost:15672/api/overview
HTTP/1.1 200 OK
Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact)
Date: Wed, 10 Oct 2018 00:12:31 GMT
Content-Type: application/json
Content-Length: 2285
Cache-Control: no-cache
{"management_version":"3.6.1","rates_mode":"basic",..............(略)
返回的json串为:
{"management_version":"3.6.1","rates_mode":"basic","exchange_types":[{"name":"fanout","description":"AMQP fanout exchange, as per the AMQP specification","enabled":true},{"name":"headers","description":"AMQP headers exchange, as per the AMQP specification","enabled":true},{"name":"direct","description":"AMQP direct exchange, as per the AMQP specification","enabled":true},{"name":"topic","description":"AMQP topic exchange, as per the AMQP specification","enabled":true}],"rabbitmq_version":"3.6.1","cluster_name":"rabbitmqhost","erlang_version":"18.2","erlang_full_version":"Erlang/OTP 18 [erts-7.2] [source] [64-bit] [smp:64:64] [async-threads:768] [hipe] [kernel-poll:true]","message_stats":{"publish":73583,"publish_details":{"rate":0.0},"ack":3152147,"ack_details":{"rate":0.0},"deliver_get":3156011,"deliver_get_details":{"rate":0.0},"confirm":29238,"confirm_details":{"rate":0.0},"redeliver":285,"redeliver_details":{"rate":0.0},"deliver":364238,"deliver_details":{"rate":0.0},"deliver_no_ack":3174,"deliver_no_ack_details":{"rate":0.0},"get":2788599,"get_details":{"rate":0.0}},"queue_totals":{"messages":120314,"messages_details":{"rate":0.0},"messages_ready":120305,"messages_ready_details":{"rate":0.0},"messages_unacknowledged":9,"messages_unacknowledged_details":{"rate":0.0}},"object_totals":{"consumers":625,"queues":10676,"exchanges":38055,"connections":36187,"channels":1965},"statistics_db_event_queue":4073948,"node":"rabbitmqhost","statistics_db_node":"rabbitmqhost","listeners":[{"node":"rabbit@rabbitmq_cluster01","protocol":"amqp","ip_address":"::","port":5672},{"node":"rabbit@rabbitmq_cluster02","protocol":"amqp","ip_address":"::","port":5672},{"node":"rabbit@rabbitmq_cluster01","protocol":"amqp/ssl","ip_address":"::","port":5671},{"node":"rabbit@rabbitmq_cluster02","protocol":"amqp/ssl","ip_address":"::","port":5671},{"node":"rabbit@rabbitmq_cluster01","protocol":"clustering","ip_address":"::","port":25672},{"node":"rabbit@rabbitmq_cluster02","protocol":"clustering","ip_address":"::","port":25672}],"contexts":[{"node":"rabbit@rabbitmq_cluster01","description":"RabbitMQ Management","path":"/","port":"15672"},{"node":"rabbit@rabbitmq_cluster02","description":"RabbitMQ Management","path":"/","port":"15672"}]}
问题
目前 rabbitmq 中的 队列对象的数量超过了一万多个,vhost 也有 5400多个,连接36000多,management 页面和 HTTP API 的执行效率都很低。打开很慢需要等很久。
3. rabbitmqctl
rabbitmqctl 是 rabbitmq自带的命令行工具,可以实现对rabbitmq进行管理和监控。
3.1. 用法说明
Usage:
rabbitmqctl [-n <node>] [-t <timeout>] [-q] <command> [<command options>]
Options:
-n node
-q
-t timeout
Default node is "rabbit@server", where server is the local host. On a host
named "server.example.com", the node name of the RabbitMQ Erlang node will
usually be rabbit@server (unless RABBITMQ_NODENAME has been set to some
non-default value at broker startup time). The output of hostname -s is usually
the correct suffix to use after the "@" sign. See rabbitmq-server(1) for
details of configuring the RabbitMQ broker.
Quiet output mode is selected with the "-q" flag. Informational messages are
suppressed when quiet mode is in effect.
Operation timeout in seconds. Only applicable to "list" commands. Default is
"infinity".
Commands:
stop [<pid_file>]
stop_app
start_app
wait <pid_file>
reset
force_reset
rotate_logs <suffix>
join_cluster <clusternode> [--ram]
cluster_status
change_cluster_node_type disc | ram
forget_cluster_node [--offline]
rename_cluster_node oldnode1 newnode1 [oldnode2] [newnode2 ...]
update_cluster_nodes clusternode
force_boot
sync_queue [-p <vhost>] queue
cancel_sync_queue [-p <vhost>] queue
purge_queue [-p <vhost>] queue
set_cluster_name name
add_user <username> <password>
delete_user <username>
change_password <username> <newpassword>
clear_password <username>
authenticate_user <username> <password>
set_user_tags <username> <tag> ...
list_users
add_vhost <vhost>
delete_vhost <vhost>
list_vhosts [<vhostinfoitem> ...]
set_permissions [-p <vhost>] <user> <conf> <write> <read>
clear_permissions [-p <vhost>] <username>
list_permissions [-p <vhost>]
list_user_permissions <username>
set_parameter [-p <vhost>] <component_name> <name> <value>
clear_parameter [-p <vhost>] <component_name> <key>
list_parameters [-p <vhost>]
set_policy [-p <vhost>] [--priority <priority>] [--apply-to <apply-to>]
<name> <pattern> <definition>
clear_policy [-p <vhost>] <name>
list_policies [-p <vhost>]
list_queues [-p <vhost>] [<queueinfoitem> ...]
list_exchanges [-p <vhost>] [<exchangeinfoitem> ...]
list_bindings [-p <vhost>] [<bindinginfoitem> ...]
list_connections [<connectioninfoitem> ...]
list_channels [<channelinfoitem> ...]
list_consumers [-p <vhost>]
status
environment
report
eval <expr>
close_connection <connectionpid> <explanation>
trace_on [-p <vhost>]
trace_off [-p <vhost>]
set_vm_memory_high_watermark <fraction>
set_vm_memory_high_watermark absolute <memory_limit>
set_disk_free_limit <disk_limit>
set_disk_free_limit mem_relative <fraction>
<vhostinfoitem> must be a member of the list [name, tracing].
The list_queues, list_exchanges and list_bindings commands accept an optional
virtual host parameter for which to display results. The default value is "/".
<queueinfoitem> must be a member of the list [name, durable, auto_delete,
arguments, policy, pid, owner_pid, exclusive, exclusive_consumer_pid,
exclusive_consumer_tag, messages_ready, messages_unacknowledged, messages,
messages_ready_ram, messages_unacknowledged_ram, messages_ram,
messages_persistent, message_bytes, message_bytes_ready,
message_bytes_unacknowledged, message_bytes_ram, message_bytes_persistent,
head_message_timestamp, disk_reads, disk_writes, consumers,
consumer_utilisation, memory, slave_pids, synchronised_slave_pids, state].
<exchangeinfoitem> must be a member of the list [name, type, durable,
auto_delete, internal, arguments, policy].
<bindinginfoitem> must be a member of the list [source_name, source_kind,
destination_name, destination_kind, routing_key, arguments].
<connectioninfoitem> must be a member of the list [pid, name, port, host,
peer_port, peer_host, ssl, ssl_protocol, ssl_key_exchange, ssl_cipher,
ssl_hash, peer_cert_subject, peer_cert_issuer, peer_cert_validity, state,
channels, protocol, auth_mechanism, user, vhost, timeout, frame_max,
channel_max, client_properties, recv_oct, recv_cnt, send_oct, send_cnt,
send_pend, connected_at].
<channelinfoitem> must be a member of the list [pid, connection, name, number,
user, vhost, transactional, confirm, consumer_count, messages_unacknowledged,
messages_uncommitted, acks_uncommitted, messages_unconfirmed, prefetch_count,
global_prefetch_count].
3.2. rabbitmqctl status
以上命令可以很快返回节点系统总体状态,返回结果里有内存、文件描述符、磁盘空间等资源的使用情况,相当于management页面overview的第(2)部分。
通过对相关内存、文件描述符、磁盘空间等资源的使用情况进行监控,我们可以及时掌握rabbitmq的总体容量情况,当资源不足时及时报警。
$ rabbitmqctl status
Status of node rabbitmqhost ...
[{pid,12345},
{running_applications,
[{rabbitmq_management,"RabbitMQ Management Console","3.6.1"},
{rabbitmq_management_agent,"RabbitMQ Management Agent","3.6.1"},
{rabbit,"RabbitMQ","3.6.1"},
{amqp_client,"RabbitMQ AMQP Client","3.6.1"},
{rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.6.1"},
{webmachine,"webmachine","1.10.3"},
{rabbit_common,[],"3.6.1"},
{mochiweb,"MochiMedia Web Server","2.13.0"},
{ssl,"Erlang/OTP SSL application","7.2"},
{public_key,"Public key infrastructure","1.1"},
{crypto,"CRYPTO","3.6.2"},
{xmerl,"XML parser","1.3.9"},
{os_mon,"CPO CXC 138 46","2.4"},
{mnesia,"MNESIA CXC 138 12","4.13.2"},
{syntax_tools,"Syntax tools","1.7"},
{ranch,"Socket acceptor pool for TCP protocols.","1.2.1"},
{inets,"INETS CXC 138 49","6.1"},
{asn1,"The Erlang ASN1 compiler version 4.0.1","4.0.1"},
{compiler,"ERTS CXC 138 10","6.0.2"},
{sasl,"SASL CXC 138 11","2.6.1"},
{stdlib,"ERTS CXC 138 10","2.7"},
{kernel,"ERTS CXC 138 10","4.1.1"}]},
{os,{unix,linux}},
{erlang_version,
"Erlang/OTP 18 [erts-7.2] [source] [64-bit] [smp:64:64] [async-threads:768] [hipe] [kernel-poll:true]\n"},
{memory,
[{total,5515794552},
{connection_readers,850979240},
{connection_writers,3377232},
{connection_channels,12324200},
{connection_other,1028799136},
{queue_procs,352967664},
{queue_slave_procs,8481944},
{plugins,10036440},
{other_proc,112254856},
{mnesia,107768400},
{mgmt_db,2808},
{msg_index,17941376},
{other_ets,25702016},
{binary,2797202984},
{code,30976673},
{atom,1090729},
{other_system,155888854}]},
{alarms,[]},
{listeners,[{clustering,25672,"::"},{amqp,5672,"::"},{'amqp/ssl',5671,"::"}]},
{vm_memory_high_watermark,0.8},
{vm_memory_limit,216626436505},
{disk_free_limit,135391522816},
{disk_free,222246518784},
{file_descriptors,
[{total_limit,102300},
{total_used,32652},
{sockets_limit,92068},
{sockets_used,28780}]},
{processes,[{limit,1048576},{used,250368}]},
{run_queue,0},
{uptime,6088508},
{kernel,{net_ticktime,60}}]