TDinsight - Grafana-based Zero-Dependency Monitoring Solution for TDengine

TDinsight is a solution for monitoring TDengine using the builtin native monitoring database and Grafana.

After TDengine starts, it automatically creates a monitoring database named log and writes metrics into it at regular intervals. These metrics include the server's CPU, memory, disk space, network bandwidth, number of requests, disk read/write speed, and slow queries, along with other information such as important system operations (user login, database creation, database deletion, etc.) and error alarms.

With Grafana and the TDengine Data Source Plugin, TDinsight visualizes cluster status, node information, insertion and query requests, resource usage, vnode, dnode, and mnode status, exception alerts, and many other metrics, which makes it convenient to monitor a TDengine cluster in real time. This article shows how to install the Grafana server, automatically install the TDengine data source plugin, and deploy the TDinsight dashboard using the TDinsight.sh installation script.

System Requirements

To deploy TDinsight, a single-node TDengine server or a multi-node TDengine cluster and a Grafana server are required. This dashboard requires TDengine 2.3.3.0 and above, with the log database enabled (monitor = 1).
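
The version requirement above can be checked with a small shell helper before installing. This is only a sketch: version_ge is a hypothetical function name, it relies on sort -V (version sort, as found in GNU coreutils), and the version string passed in is a placeholder for whatever your server actually reports.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Assumes `sort -V` (version sort) is available on this system.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum version taken from the requirement stated above.
MIN_VERSION="2.3.3.0"

# "2.4.0.0" is a placeholder; substitute the version your server reports.
if version_ge "2.4.0.0" "$MIN_VERSION"; then
    echo "TDengine version is supported by TDinsight"
else
    echo "TDengine version is too old for TDinsight"
fi
```

The comparison is done on the full dotted version, so 2.10.0.0 is correctly treated as newer than 2.3.3.0.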

Installing Grafana

We recommend using the latest release of Grafana version 7 or 8. You can install Grafana on any supported operating system by following the installation instructions in the official Grafana documentation.

Installing Grafana on Debian or Ubuntu

For Debian or Ubuntu, we recommend installing from the Grafana package repository with the following commands.

    sudo apt-get install -y apt-transport-https
    sudo apt-get install -y software-properties-common wget
    wget -q -O - https://packages.grafana.com/gpg.key |\
      sudo apt-key add -
    echo "deb https://packages.grafana.com/oss/deb stable main" |\
      sudo tee -a /etc/apt/sources.list.d/grafana.list
    sudo apt-get update
    sudo apt-get install grafana

Installing Grafana on CentOS / RHEL

You can install it from its official YUM repository.

    sudo tee /etc/yum.repos.d/grafana.repo << EOF
    [grafana]
    name=grafana
    baseurl=https://packages.grafana.com/oss/rpm
    repo_gpgcheck=1
    enabled=1
    gpgcheck=1
    gpgkey=https://packages.grafana.com/gpg.key
    sslverify=1
    sslcacert=/etc/pki/tls/certs/ca-bundle.crt
    EOF
    sudo yum install grafana

Alternatively, install it from the RPM package.

    wget https://dl.grafana.com/oss/release/grafana-7.5.11-1.x86_64.rpm
    sudo yum install grafana-7.5.11-1.x86_64.rpm
    # or
    sudo yum install \
      https://dl.grafana.com/oss/release/grafana-7.5.11-1.x86_64.rpm

Automated deployment of TDinsight

We provide an installation script TDinsight.sh to allow users to configure the installation automatically and quickly.

You can download the script via wget or other tools:

    wget https://github.com/taosdata/grafanaplugin/releases/latest/download/TDinsight.sh
    chmod +x TDinsight.sh
    ./TDinsight.sh

The script downloads the latest Grafana TDengine data source plugin and the TDinsight dashboard, then turns the command-line options into Grafana provisioning configuration files so that deployment and updates are automated. With the alert options provided by the script, you also get built-in support for Alibaba Cloud SMS alert notifications.

Assuming TDengine and Grafana run with their default settings on the same host, run ./TDinsight.sh and open Grafana in a browser to see the TDinsight dashboard.

The following is a description of TDinsight.sh usage.

    Usage:
       ./TDinsight.sh
       ./TDinsight.sh -h|--help
       ./TDinsight.sh -n <ds-name> -a <api-url> -u <user> -p <password>

    Install and configure TDinsight dashboard in Grafana on Ubuntu 18.04/20.04 system.

    -h, -help, --help                     Display help
    -V, -verbose, --verbose               Run script in verbose mode. Will print out each step of execution.
    -v, --plugin-version <version>        TDengine datasource plugin version, [default: latest]
    -P, --grafana-provisioning-dir <dir>  Grafana provisioning directory, [default: /etc/grafana/provisioning/]
    -G, --grafana-plugins-dir <dir>       Grafana plugins directory, [default: /var/lib/grafana/plugins]
    -O, --grafana-org-id <number>         Grafana organization id. [default: 1]
    -n, --tdengine-ds-name <string>       TDengine datasource name, no space. [default: TDengine]
    -a, --tdengine-api <url>              TDengine REST API endpoint. [default: http://127.0.0.1:6041]
    -u, --tdengine-user <string>          TDengine user name. [default: root]
    -p, --tdengine-password <string>      TDengine password. [default: taosdata]
    -i, --tdinsight-uid <string>          Replace with a non-space ASCII code as the dashboard id. [default: tdinsight]
    -t, --tdinsight-title <string>        Dashboard title. [default: TDinsight]
    -e, --tdinsight-editable              If the provisioning dashboard could be editable. [default: false]
    -E, --external-notifier <string>      Apply external notifier uid to TDinsight dashboard.

    Alibaba Cloud SMS as Notifier:
    -s, --sms-enabled                     To enable tdengine-datasource plugin builtin Alibaba Cloud SMS webhook.
    -N, --sms-notifier-name <string>      Provisioning notifier name. [default: TDinsight Builtin SMS]
    -U, --sms-notifier-uid <string>       Provisioning notifier uid, use lowercase notifier name by default.
    -D, --sms-notifier-is-default         Set notifier as default.
    -I, --sms-access-key-id <string>      Alibaba Cloud SMS access key id
    -K, --sms-access-key-secret <string>  Alibaba Cloud SMS access key secret
    -S, --sms-sign-name <string>          Sign name
    -C, --sms-template-code <string>      Template code
    -T, --sms-template-param <string>     Template param, an escaped JSON string like '{"alarm_level":"%s","time":"%s","name":"%s","content":"%s"}'
    -B, --sms-phone-numbers <string>      Comma-separated numbers list, e.g. "189xxxxxxxx,132xxxxxxxx"
    -L, --sms-listen-addr <string>        [default: 127.0.0.1:9100]

Most command-line options can also be set through an equivalent environment variable.

| Short Option | Long Option | Environment Variable | Description |
|---|---|---|---|
| -v | --plugin-version | TDENGINE_PLUGIN_VERSION | The TDengine data source plugin version. The latest version is used by default. |
| -P | --grafana-provisioning-dir | GF_PROVISIONING_DIR | The Grafana provisioning directory. Defaults to /etc/grafana/provisioning/. |
| -G | --grafana-plugins-dir | GF_PLUGINS_DIR | The Grafana plugin directory. Defaults to /var/lib/grafana/plugins. |
| -O | --grafana-org-id | GF_ORG_ID | The Grafana organization ID. Defaults to 1. |
| -n | --tdengine-ds-name | TDENGINE_DS_NAME | The name of the TDengine data source. Defaults to TDengine. |
| -a | --tdengine-api | TDENGINE_API | The TDengine REST API endpoint. Defaults to http://127.0.0.1:6041. |
| -u | --tdengine-user | TDENGINE_USER | The TDengine user name. Defaults to root. |
| -p | --tdengine-password | TDENGINE_PASSWORD | The TDengine password. Defaults to taosdata. |
| -i | --tdinsight-uid | TDINSIGHT_DASHBOARD_UID | The uid of the TDinsight dashboard. Defaults to tdinsight. |
| -t | --tdinsight-title | TDINSIGHT_DASHBOARD_TITLE | The TDinsight dashboard title. Defaults to TDinsight. |
| -e | --tdinsight-editable | TDINSIGHT_DASHBOARD_EDITABLE | Whether the provisioned dashboard is editable. Defaults to false. |
| -E | --external-notifier | EXTERNAL_NOTIFIER | Apply an external notifier uid to the TDinsight dashboard. |
| -s | --sms-enabled | SMS_ENABLED | Enable the Alibaba Cloud SMS webhook built into the tdengine-datasource plugin. |
| -N | --sms-notifier-name | SMS_NOTIFIER_NAME | The name of the provisioned notifier. Defaults to TDinsight Builtin SMS. |
| -U | --sms-notifier-uid | SMS_NOTIFIER_UID | The notification channel uid. Defaults to the lowercase notifier name, with other characters replaced by "-". |
| -D | --sms-notifier-is-default | SMS_NOTIFIER_IS_DEFAULT | Set the built-in SMS notifier as the default notifier. |
| -I | --sms-access-key-id | SMS_ACCESS_KEY_ID | Alibaba Cloud SMS access key id. |
| -K | --sms-access-key-secret | SMS_ACCESS_KEY_SECRET | Alibaba Cloud SMS access key secret. |
| -S | --sms-sign-name | SMS_SIGN_NAME | Signature (sign name). |
| -C | --sms-template-code | SMS_TEMPLATE_CODE | Template code. |
| -T | --sms-template-param | SMS_TEMPLATE_PARAM | JSON template for the template parameters. |
| -B | --sms-phone-numbers | SMS_PHONE_NUMBERS | A comma-separated list of phone numbers, e.g. "189xxxxxxxx,132xxxxxxxx". |
| -L | --sms-listen-addr | SMS_LISTEN_ADDR | The built-in SMS webhook listen address. Defaults to 127.0.0.1:9100. |
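
As a rough illustration of the environment-variable form, the sketch below exports a few of the variables listed above instead of passing flags. The endpoint, user name, and password are sample values, and the final command is left commented out. Because sudo typically resets the environment, the sketch assumes sudo -E is permitted on your system to pass the variables through.

```shell
# Sample values only; replace them with your own endpoint and credentials.
export TDENGINE_API="http://tdengine:6041"
export TDENGINE_USER="root1"
export TDENGINE_PASSWORD="pass5ord"
export TDINSIGHT_DASHBOARD_TITLE="TDinsight"

# With the variables exported, the matching flags can be omitted.
# sudo usually drops the caller's environment, so -E (preserve environment)
# is assumed to be allowed by the local sudoers policy:
# sudo -E ./TDinsight.sh
echo "TDinsight will connect to $TDENGINE_API as $TDENGINE_USER"
```

Exporting the variables keeps the password off the command line, so it does not appear in the process list while the script runs.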

Suppose you start a TDengine database on host tdengine with HTTP API port 6041, user root1, and password pass5ord. Run the script as follows.

    sudo ./TDinsight.sh -a http://tdengine:6041 -u root1 -p pass5ord

The -E option configures TDinsight to use an existing notification channel from the command line. Assuming your Grafana user name and password are admin:admin, use the following command to get the uid of an existing notification channel.

    curl --no-progress-meter -u admin:admin http://localhost:3000/api/alert-notifications | jq

Use the uid value obtained above as the -E argument.

    sudo ./TDinsight.sh -a http://tdengine:6041 -u root1 -p pass5ord -E existing-notifier

If you want to use the Alibaba Cloud SMS service as a notification channel, enable it with the -s flag and add the following parameters.

  • -N: Notification channel name. The default is TDinsight Builtin SMS.
  • -U: Channel uid. The default is the lowercase form of the name, with any other character replaced by -. For the default -N, the uid is tdinsight-builtin-sms.
  • -I: Alibaba Cloud SMS access key id.
  • -K: Alibaba Cloud SMS access key secret.
  • -S: Alibaba Cloud SMS signature.
  • -C: Alibaba Cloud SMS template id.
  • -T: Alibaba Cloud SMS template parameters as a JSON template, for example '{"alarm_level":"%s", "time":"%s", "name":"%s", "content":"%s"}'. There are four parameters: alarm level, time, name, and alarm content.
  • -B: A list of phone numbers, separated by commas.
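
The default uid rule described for -U can be approximated in shell as follows. This is a hypothetical re-implementation for illustration, not the script's own code: notifier_uid is an invented name, and it assumes digits are kept unchanged, which the text above does not specify.

```shell
# Hypothetical sketch of the default uid rule: lowercase the notifier name,
# then replace every character that is not a lowercase letter or digit with "-".
# Assumption: digits are preserved; the real script may behave differently.
notifier_uid() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[^a-z0-9]/-/g'
}

notifier_uid "TDinsight Builtin SMS"   # prints tdinsight-builtin-sms
```

Knowing the derived uid in advance is useful when several dashboards share one Grafana instance, because each built-in notifier needs a distinct uid.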

If you want to monitor multiple TDengine clusters, you need to set up multiple TDinsight dashboards. Setting up a non-default TDinsight requires some changes: the -n, -i, and -t options must be set to non-default names, and -N and -L must also be changed if you use the built-in SMS alerting feature.

    sudo ./TDinsight.sh -n TDengine-Env1 -a http://another:6041 -u root -p taosdata -i tdinsight-env1 -t 'TDinsight Env1'
    # If using built-in SMS notifications
    sudo ./TDinsight.sh -n TDengine-Env1 -a http://another:6041 -u root -p taosdata -i tdinsight-env1 -t 'TDinsight Env1' \
      -s -N 'Env1 SMS' -I xx -K xx -S xx -C SMS_XX -T '' -B 00000000000 -L 127.0.0.1:10611
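
When there are several clusters, the per-cluster options can be generated rather than typed by hand. The sketch below only prints the commands so they can be reviewed before being run. tdinsight_cmd is a hypothetical helper, and the labels and endpoints are placeholders. It omits -u and -p, so the script defaults would apply.

```shell
# Hypothetical helper: print (do not run) a TDinsight.sh command for one cluster.
# $1 is a short environment label, $2 is that cluster's REST API endpoint.
tdinsight_cmd() {
    label="$1"
    api="$2"
    # The dashboard uid must not contain spaces; use the lowercased label.
    uid_suffix=$(printf '%s' "$label" | tr '[:upper:]' '[:lower:]')
    printf "sudo ./TDinsight.sh -n TDengine-%s -a %s -i tdinsight-%s -t 'TDinsight %s'\n" \
        "$label" "$api" "$uid_suffix" "$label"
}

# Placeholder clusters; each gets its own data source name, uid, and title.
tdinsight_cmd Env1 http://another:6041
tdinsight_cmd Env2 http://third:6041
```

Printing the commands first makes it easy to confirm that every cluster receives a unique data source name, dashboard uid, and title.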

Please note that the provisioned data source, notification channel, and dashboard cannot be changed from the Grafana front end. To update them, run this script again or manually edit the configuration files in the /etc/grafana/provisioning directory (this is the default Grafana provisioning directory; use the -P option to change it as needed).

In addition, -O sets the organization ID when you are using Grafana Cloud or an organization other than the default, -G specifies the Grafana plugin installation directory, and -e makes the dashboard editable.

Set up TDinsight manually

Install the TDengine data source plugin

Install the latest version of the TDengine Data Source plugin from GitHub.

    get_latest_release() {
      curl --silent "https://api.github.com/repos/taosdata/grafanaplugin/releases/latest" |
        grep '"tag_name":' |
        sed -E 's/.*"v([^"]+)".*/\1/'
    }
    TDENGINE_PLUGIN_VERSION=$(get_latest_release)
    sudo grafana-cli \
      --pluginUrl https://github.com/taosdata/grafanaplugin/releases/download/v$TDENGINE_PLUGIN_VERSION/tdengine-datasource-$TDENGINE_PLUGIN_VERSION.zip \
      plugins install tdengine-datasource

Note: Plugin versions 3.1.6 and earlier require the following setting in the configuration file /etc/grafana/grafana.ini to enable unsigned plugins.

    [plugins]
    allow_loading_unsigned_plugins = tdengine-datasource
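
To confirm that the setting is in place before starting Grafana, a quick check along these lines may help. It is a minimal sketch: allows_tdengine_plugin and the GRAFANA_INI variable are hypothetical names, and the check only looks for an uncommented line. It does not verify that the line sits under the [plugins] section.

```shell
# Hypothetical check: does the given grafana.ini contain an uncommented
# allow_loading_unsigned_plugins line that names tdengine-datasource?
allows_tdengine_plugin() {
    grep -Eq '^[[:space:]]*allow_loading_unsigned_plugins[[:space:]]*=.*tdengine-datasource' "$1"
}

# GRAFANA_INI is an illustrative override; the default path is the one named above.
ini="${GRAFANA_INI:-/etc/grafana/grafana.ini}"
if [ -r "$ini" ] && allows_tdengine_plugin "$ini"; then
    echo "unsigned tdengine-datasource plugin is allowed in $ini"
else
    echo "setting not found (or file not readable): $ini"
fi
```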

Start the Grafana service

    sudo systemctl start grafana-server
    sudo systemctl enable grafana-server

Logging into Grafana

Open the default Grafana URL in a web browser: http://localhost:3000. The default user name and password are both admin. Grafana will prompt for a password change after the first login.

Adding a TDengine Data Source

Go to the Configuration -> Data Sources menu and click the Add data source button.

[Figure: Add data source button]

Search for and select TDengine.

[Figure: Add TDengine data source]

Configure the TDengine data source.

[Figure: Data source configuration]

Save and test. Under normal circumstances, Grafana reports "TDengine Data source is working".

[Figure: Data source test]

Importing dashboards

Go to + / Create -> Import (or open the /dashboard/import URL).

[Figure: Import dashboard and configuration]

Type the dashboard ID 15167 in the Import via grafana.com field and click Load.

[Figure: Import via grafana.com]

Once the import is complete, the full page view of TDinsight is shown below.

[Figure: TDinsight full page view]

TDinsight dashboard details

The TDinsight dashboard shows the usage and status of TDengine-related resources such as dnodes, mnodes, vnodes, and databases.

Details of the metrics are as follows.

Cluster Status

[Figure: TDinsight cluster status]

This section contains the current information and status of the cluster, including alert information. The panels are listed from left to right, top to bottom.

  • First EP: the firstEp setting in the current TDengine cluster.
  • Version: TDengine server version (leader mnode).
  • Leader Uptime: the time elapsed since the current leader mnode was elected as leader.
  • Expire Time: Enterprise Edition expiration time.
  • Used Measuring Points: the number of measuring points used by the Enterprise Edition.
  • Databases: the number of databases.
  • Connections: the number of current connections.
  • DNodes/MNodes/VGroups/VNodes: the total number of each resource and the number that are alive.
  • DNodes/MNodes/VGroups/VNodes Alive Percent: the ratio of alive to total for each resource. An alert rule is enabled and triggers when the liveness rate (the average percentage of healthy resources over 1 minute) falls below 100%.
  • Measuring Points Used: the number of measuring points used, with an alert rule enabled (no data in the community version; healthy by default).
  • Grants Expire Time: the expiration time of the Enterprise Edition, with an alert rule enabled (no data in the community version; healthy by default).
  • Error Rate: the aggregate error rate (average number of errors per second) of the cluster, with an alert rule enabled.
  • Variables: a table view of show variables.
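
The alive-percent figure is simple arithmetic: alive resources divided by total resources, times 100. The sketch below shows that calculation for a single sample. alive_percent is a hypothetical helper, and the dashboard itself averages over one minute rather than evaluating a single reading.

```shell
# Illustrative arithmetic only: percentage of alive resources out of the total.
# Prints the value with two decimals; a total of zero is reported as 0.00.
alive_percent() {
    awk -v alive="$1" -v total="$2" \
        'BEGIN { if (total == 0) { print "0.00" } else { printf "%.2f\n", alive * 100 / total } }'
}

# Example: 2 of 3 dnodes alive is below the 100% threshold, so an alert would fire.
pct=$(alive_percent 2 3)
echo "alive percent: $pct"
```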

DNodes Status

[Figure: TDinsight dnodes status]

  • DNodes Status: a simple table view of show dnodes.
  • DNodes Lifetime: the time elapsed since the dnode was created.
  • DNodes Number: changes in the number of dnodes over time.
  • Offline Reason: if any dnode is offline, the offline reasons are shown as a pie chart.

MNode Overview

[Figure: TDinsight mnodes overview]

  1. MNodes Status: a simple table view of show mnodes.
  2. MNodes Number: similar to DNodes Number, changes in the number of mnodes over time.

Request

[Figure: TDinsight requests]

  1. Requests Rate (Inserts per Second): average number of inserts per second.
  2. Requests (Selects): number of query requests and their rate of change (count per second).
  3. Requests (HTTP): number of HTTP requests and the request rate (count per second).

Database

[Figure: TDinsight database]

Database usage, repeated for each value of the variable $database, so each database gets its own row.

  1. STables: number of super tables.
  2. Total Tables: number of all tables.
  3. Sub Tables: number of subtables across all super tables.
  4. Tables: graph of the number of normal tables over time.
  5. Tables Number Foreach VGroups: number of tables contained in each vgroup.

DNode Resource Usage

[Figure: TDinsight dnode usage]

Data node resource usage, repeated for each value of the variable $fqdn, so each data node gets its own row. It includes the following panels.

  1. Uptime: the time elapsed since the dnode was created.
  2. Has MNodes?: whether the current dnode hosts an mnode.
  3. CPU Cores: the number of CPU cores.
  4. VNodes Number: the number of VNodes in the current dnode.
  5. VNodes Masters: the number of vnodes in the leader role.
  6. Current CPU Usage of taosd: CPU usage rate of taosd processes.
  7. Current Memory Usage of taosd: memory usage of taosd processes.
  8. Disk Used: The total disk usage percentage of the taosd data directory.
  9. CPU Usage: Process and system CPU usage.
  10. RAM Usage: Time series view of RAM usage metrics.
  11. Disk Used: Disks used at each level of multi-level storage (default is level0).
  12. Disk Increasing Rate per Minute: Percentage increase or decrease in disk usage per minute.
  13. Disk IO: Disk IO rate.
  14. Net IO: Network IO, the aggregate network IO rate excluding the local network.

Login History

[Figure: TDinsight login history]

Currently, only the number of logins per minute is reported.

Monitoring taosAdapter

[Figure: TDinsight taosAdapter monitoring]

TDinsight supports monitoring taosAdapter request statistics and status details. It includes the following panels.

  1. http_request: contains the total number of requests, the number of failed requests, and the number of requests being processed
  2. top 3 request endpoint: data of the top 3 requests by endpoint group
  3. Memory Used: taosAdapter memory usage
  4. latency_quantile(ms): quantile of (1, 2, 5, 9, 99) stages
  5. top 3 failed request endpoint: data of the top 3 failed requests by endpoint grouping
  6. CPU Used: taosAdapter CPU usage

Upgrade

TDinsight installed via the TDinsight.sh script can be upgraded to the latest Grafana plugin and TDinsight Dashboard by re-running the script.

In the case of a manual installation, follow the steps above to install the new Grafana plugin and Dashboard yourself.

Uninstall

If TDinsight was installed via the TDinsight.sh script, run TDinsight.sh -R to clean up the associated resources.

To completely uninstall TDinsight after a manual installation, you need to clean up the following.

  1. The TDinsight dashboard in Grafana.
  2. The data source in Grafana.
  3. The tdengine-datasource plugin in the plugin installation directory.

Integrated Docker Example

    git clone --depth 1 https://github.com/taosdata/grafanaplugin.git
    cd grafanaplugin

Modify the docker-compose.yml file as needed:

    version: '3.7'

    services:
      grafana:
        image: grafana/grafana:7.5.10
        volumes:
          - ./dist:/var/lib/grafana/plugins/tdengine-datasource
          - ./grafana/grafana.ini:/etc/grafana/grafana.ini
          - ./grafana/provisioning/:/etc/grafana/provisioning/
          - grafana-data:/var/lib/grafana
        environment:
          TDENGINE_API: ${TDENGINE_API}
          TDENGINE_USER: ${TDENGINE_USER}
          TDENGINE_PASS: ${TDENGINE_PASS}
          SMS_ACCESS_KEY_ID: ${SMS_ACCESS_KEY_ID}
          SMS_ACCESS_KEY_SECRET: ${SMS_ACCESS_KEY_SECRET}
          SMS_SIGN_NAME: ${SMS_SIGN_NAME}
          SMS_TEMPLATE_CODE: ${SMS_TEMPLATE_CODE}
          SMS_TEMPLATE_PARAM: '${SMS_TEMPLATE_PARAM}'
          SMS_PHONE_NUMBERS: ${SMS_PHONE_NUMBERS}
          SMS_LISTEN_ADDR: ${SMS_LISTEN_ADDR}
        ports:
          - 3000:3000

    volumes:
      grafana-data:

Replace the environment variables in docker-compose.yml, or save them to a .env file, then start Grafana with docker-compose up. See the Docker Compose reference for details.

    docker-compose up -d
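
If you take the .env route, the file can be generated before running docker-compose up. The helper below is a sketch with placeholder values: write_env_file is a hypothetical name, the variable names mirror those in the compose file above, and the SMS entries are left empty on the assumption that the built-in SMS webhook is not used.

```shell
# Hypothetical helper: write a .env file for docker-compose to the given path.
# All values are placeholders; fill in the SMS_* entries only if SMS alerts are used.
write_env_file() {
    cat > "$1" <<'EOF'
TDENGINE_API=http://tdengine:6041
TDENGINE_USER=root
TDENGINE_PASS=taosdata
SMS_ACCESS_KEY_ID=
SMS_ACCESS_KEY_SECRET=
SMS_SIGN_NAME=
SMS_TEMPLATE_CODE=
SMS_TEMPLATE_PARAM=
SMS_PHONE_NUMBERS=
SMS_LISTEN_ADDR=127.0.0.1:9100
EOF
    chmod 600 "$1"   # the file may hold credentials, so restrict access
}

# Usage (run from the directory that contains docker-compose.yml):
# write_env_file .env
```

Keeping credentials in .env rather than in docker-compose.yml lets the compose file stay under version control unchanged.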

TDinsight is then deployed via provisioning. Go to http://localhost:3000/d/tdinsight/ to view the dashboard.