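Usage

Users can access the functions provided by A-Tune through the command-line client atune-adm. This chapter describes the functions of the A-Tune client and how to use them.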

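General Notes

Using A-Tune requires root privileges.
The commands supported by atune-adm can be queried with atune-adm help, atune-adm --help, or atune-adm -h.
All command examples in this chapter assume a single-node deployment. In a distributed deployment, specify the server IP address and port number, for example: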
```
#  atune-adm -a 192.168.3.196 -p 60001 list
```
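The define, update, undefine, collection, train, and upgrade commands do not support remote execution.
In the command formats, [ ] indicates an optional parameter and <> indicates a mandatory parameter; the actual values depend on your situation.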
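
The rules above (optional server address and port, plus a set of commands that only run locally) can be captured in a small wrapper. The following is a minimal sketch, not part of A-Tune: build_atune_cmd is a hypothetical helper that only assembles the argument list and does not run anything.

```python
from typing import List, Optional

# Subcommands that this chapter lists as not supporting remote execution.
LOCAL_ONLY = {"define", "update", "undefine", "collection", "train", "upgrade"}


def build_atune_cmd(subcommand: str, *args: str,
                    address: Optional[str] = None,
                    port: Optional[int] = None) -> List[str]:
    """Return the argv list for an atune-adm call (hypothetical helper)."""
    cmd = ["atune-adm"]
    if address is not None:
        if subcommand in LOCAL_ONLY:
            raise ValueError(f"'{subcommand}' cannot be executed remotely")
        cmd += ["-a", address]
        if port is not None:
            cmd += ["-p", str(port)]
    return cmd + [subcommand, *args]


# Distributed deployment: same arguments as the example in this chapter.
print(" ".join(build_atune_cmd("list", address="192.168.3.196", port=60001)))
# Single-node deployment: no address or port.
print(" ".join(build_atune_cmd("list")))
```

The resulting list can be passed to subprocess.run on a host where atune-adm is installed and the caller has root privileges.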

Querying Workload Types

list

Function

Query the profiles currently supported by the system and the profile that is currently active.

Format

atune-adm list

Example

# atune-adm list 
Support profiles:
+------------------------------------------------+-----------+
| ProfileName                                    | Active    |
+================================================+===========+
| arm-native-android-container-robox             | false     |
+------------------------------------------------+-----------+
| basic-test-suite-euleros-baseline-fio          | false     |
+------------------------------------------------+-----------+
| basic-test-suite-euleros-baseline-lmbench      | false     |
+------------------------------------------------+-----------+
| basic-test-suite-euleros-baseline-netperf      | false     |
+------------------------------------------------+-----------+
| basic-test-suite-euleros-baseline-stream       | false     |
+------------------------------------------------+-----------+
| basic-test-suite-euleros-baseline-unixbench    | false     |
+------------------------------------------------+-----------+
| basic-test-suite-speccpu-speccpu2006           | false     |
+------------------------------------------------+-----------+
| basic-test-suite-specjbb-specjbb2015           | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-hdfs-dfsio-hdd                 | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-hdfs-dfsio-ssd                 | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-bayesian                 | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-kmeans                   | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql1                     | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql10                    | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql2                     | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql3                     | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql4                     | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql5                     | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql6                     | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql7                     | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql8                     | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql9                     | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-tersort                  | false     |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-wordcount                | false     |
+------------------------------------------------+-----------+
| cloud-compute-kvm-host                         | false     |
+------------------------------------------------+-----------+
| database-mariadb-2p-tpcc-c3                    | false     |
+------------------------------------------------+-----------+
| database-mariadb-4p-tpcc-c3                    | false     |
+------------------------------------------------+-----------+
| database-mongodb-2p-sysbench                   | false     |
+------------------------------------------------+-----------+
| database-mysql-2p-sysbench-hdd                 | false     |
+------------------------------------------------+-----------+
| database-mysql-2p-sysbench-ssd                 | false     |
+------------------------------------------------+-----------+
| database-postgresql-2p-sysbench-hdd            | false     |
+------------------------------------------------+-----------+
| database-postgresql-2p-sysbench-ssd            | false     |
+------------------------------------------------+-----------+
| default-default                                | false     |
+------------------------------------------------+-----------+
| docker-mariadb-2p-tpcc-c3                      | false     |
+------------------------------------------------+-----------+
| docker-mariadb-4p-tpcc-c3                      | false     |
+------------------------------------------------+-----------+
| hpc-gatk4-human-genome                         | false     |
+------------------------------------------------+-----------+
| in-memory-database-redis-redis-benchmark       | false     |
+------------------------------------------------+-----------+
| middleware-dubbo-dubbo-benchmark               | false     |
+------------------------------------------------+-----------+
| storage-ceph-vdbench-hdd                       | false     |
+------------------------------------------------+-----------+
| storage-ceph-vdbench-ssd                       | false     |
+------------------------------------------------+-----------+
| virtualization-consumer-cloud-olc              | false     |
+------------------------------------------------+-----------+
| virtualization-mariadb-2p-tpcc-c3              | false     |
+------------------------------------------------+-----------+
| virtualization-mariadb-4p-tpcc-c3              | false     |
+------------------------------------------------+-----------+
| web-apache-traffic-server-spirent-pingpo       | false     |
+------------------------------------------------+-----------+
| web-nginx-http-long-connection                 | true      |
+------------------------------------------------+-----------+
| web-nginx-https-short-connection               | false     |
+------------------------------------------------+-----------+

Note:
A value of true in the Active column marks the currently active profile. In this example, the active profile is web-nginx-http-long-connection.
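
When the list output is consumed by a script, the active profile can be extracted from the table text. The sketch below assumes the output keeps the two-column layout shown in the example; parse_profiles and active_profile are hypothetical helper names, and SAMPLE is a shortened copy of that output.

```python
from typing import Dict, Optional


def parse_profiles(output: str) -> Dict[str, bool]:
    """Map each profile name in `atune-adm list` output to its Active flag."""
    profiles = {}
    for line in output.splitlines():
        if not line.startswith("|"):
            continue  # skip table borders and the "Support profiles:" line
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # The header row is skipped because its second cell is "Active".
        if len(cells) == 2 and cells[1] in ("true", "false"):
            profiles[cells[0]] = cells[1] == "true"
    return profiles


def active_profile(output: str) -> Optional[str]:
    """Return the name of the active profile, or None if none is active."""
    for name, active in parse_profiles(output).items():
        if active:
            return name
    return None


SAMPLE = """\
Support profiles:
+------------------------------------------------+-----------+
| ProfileName                                    | Active    |
+================================================+===========+
| default-default                                | false     |
+------------------------------------------------+-----------+
| web-nginx-http-long-connection                 | true      |
+------------------------------------------------+-----------+
"""

print(active_profile(SAMPLE))
```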

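Analyzing the Workload Type and Optimizing Automatically

analysis

Function

Collect real-time system statistics to identify the workload type, and then optimize the system automatically.

Format

atune-adm analysis [OPTIONS]

Parameter Description

OPTIONS

| Parameter | Description |
| --- | --- |
| --model, -m | New model generated by user self-training |
| --characterization, -c | Use the default model for application identification without automatic optimization |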

Example

Use the default model for application identification:
```
# atune-adm analysis --characterization
```
Use the default model for application identification and perform automatic optimization:
```
# atune-adm analysis
```

Use a self-trained model for application identification:

# atune-adm analysis --model /usr/libexec/atuned/analysis/models/new-model.m

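Custom Models

A-Tune allows users to define and train new models. The procedure is as follows:

1. Use the define command to define a profile for the new application.
2. Use the collection command to collect the system data for that application.
3. Use the train command to train the model.

define

Function

Add a user-defined application scenario and the corresponding profile optimization items.

Format

atune-adm define <service_type> <application_name> <scenario_name> <profile_path>

Example

Add a profile whose service_type is test_service, application_name is test_app, and scenario_name is test_scenario, with example.conf as the configuration file of the optimization items.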

# atune-adm define test_service test_app test_scenario ./example.conf

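example.conf can be written as follows (none of the optimization items below is mandatory; they are for reference only). You can also run atune-adm info to see how existing profiles are written.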

 [main]
 # list its parent profile
 [kernel_config]
 # to change the kernel config
 [bios]
 # to change the bios config
 [bootloader.grub2]
 # to change the grub2 config
 [sysfs]
 # to change the /sys/* config
 [systemctl]
 # to change the system service status
 [sysctl]
 # to change the /proc/sys/* config
 [script]
 # the script extension of cpi
 [ulimit]
 # to change the resources limit of user
 [schedule_policy]
 # to change the schedule policy
 [check]
 # check the environment
 [tip]
 # the recommended optimization, which should be performed manually
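
Because the profile file uses an INI-style layout, a misspelled section name can be caught before the file is passed to define. The sketch below assumes that the twelve sections in the template are the complete set, which this chapter does not state, and unknown_sections is a hypothetical helper; it uses Python's standard configparser, not A-Tune's own parser.

```python
import configparser
from typing import List

# Section names taken from the example.conf template in this chapter.
KNOWN_SECTIONS = {
    "main", "kernel_config", "bios", "bootloader.grub2", "sysfs", "systemctl",
    "sysctl", "script", "ulimit", "schedule_policy", "check", "tip",
}


def unknown_sections(conf_text: str) -> List[str]:
    """Return the section names that do not appear in the template."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(conf_text)
    return sorted(set(parser.sections()) - KNOWN_SECTIONS)


SAMPLE_CONF = """\
[main]
include = default-default
[sysctl]
fs.file-max = 6553600
[sysctll]
net.core.somaxconn = 65535
"""

# "sysctll" is a deliberate typo to show what the check reports.
print(unknown_sections(SAMPLE_CONF))
```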

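collection

Function

Collect the global resource usage of the system and the OS status information while the service is running, and save the results to a CSV output file that serves as the input dataset for model training.

Note:

This command depends on the sampling tools perf, mpstat, vmstat, iostat, and sar.
Currently, only the Kunpeng 920 CPU is supported. You can check the CPU model with dmidecode -t processor.

Format

atune-adm collection <OPTIONS>

Parameter Description

OPTIONS

| Parameter | Description |
| --- | --- |
| --filename, -f | Name of the generated CSV file used for training, in the form name-timestamp.csv |
| --output_path, -o | Path where the generated CSV file is stored; an absolute path is required |
| --disk, -b | Disk actually used by the service at run time, for example /dev/sda |
| --network, -n | Network interface used by the service at run time, for example eth0 |
| --app_type, -t | Application type of the service, used as the label during training |
| --duration, -d | Data collection time while the service is running, in seconds; the default is 1200 seconds |
| --interval, -i | Data collection interval, in seconds; the default is 5 seconds |

Example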

# atune-adm collection --filename name --interval 5 --duration 1200 --output_path /home/data --disk sda --network eth0 --app_type test_type
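
As a rough check of how much data one collection run produces, the sketch below divides the collection duration by the sampling interval. It assumes that one row is written per interval, which this chapter does not state; expected_samples is a hypothetical helper name.

```python
def expected_samples(duration_s: int = 1200, interval_s: int = 5) -> int:
    """Estimate the number of samples, assuming one sample per interval."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    if duration_s < interval_s:
        raise ValueError("duration must be at least one interval")
    return duration_s // interval_s


# Defaults from the parameter table: 1200 s duration, 5 s interval.
print(expected_samples())
```

With the default values this gives 240 samples per run.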

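train

Function

Train a model with the collected data. Data for at least two application types must be collected for training; otherwise, training fails.

Format

atune-adm train <OPTIONS>

Parameter Description

OPTIONS

| Parameter | Description |
| --- | --- |
| --data_path, -d | Directory containing the CSV files required for model training |
| --output_file, -o | New model generated by training |

Example

Use the CSV files in the data directory as training input. The generated model new-model.m is stored in the models directory.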

# atune-adm train --data_path /home/data --output_file /usr/libexec/atuned/analysis/models/new-model.m

undefine

Function

Delete a user-defined profile.

Format

atune-adm undefine <profile>

Example

Delete the user-defined profile.

# atune-adm undefine test_service-test_app-test_scenario

Querying a Profile

info

Function

View the content of the specified profile.

Format

atune-adm info <profile>

Example

View the content of the web-nginx-http-long-connection profile:

# atune-adm info web-nginx-http-long-connection
*** web-nginx-http-long-connection:
#
# nginx http long connection A-Tune configuration
#
[main]
include = default-default
[kernel_config]
#TODO CONFIG
[bios]
#TODO CONFIG
[bootloader.grub2]
iommu.passthrough = 1
[sysfs]
#TODO CONFIG
[systemctl]
sysmonitor = stop
irqbalance = stop
[sysctl]
fs.file-max = 6553600
fs.suid_dumpable = 1
fs.aio-max-nr = 1048576
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_local_port_range = 1024     65500
net.ipv4.tcp_max_tw_buckets = 5000
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 262144
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_syn_retries = 1
net.ipv4.tcp_fin_timeout = 1
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_mem =  362619      483495   725238
net.ipv4.tcp_rmem = 4096         87380   6291456
net.ipv4.tcp_wmem = 4096         16384   4194304
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
[script]
prefetch = off
ethtool =  -X {network} hfunc toeplitz
[ulimit]
{user}.hard.nofile = 102400
{user}.soft.nofile = 102400
[schedule_policy]
#TODO CONFIG
[check]
#TODO CONFIG
[tip]
SELinux provides extra control and security features to linux kernel. Disabling SELinux will improve the performance but may cause security risks. = kernel
disable the nginx log = application

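Updating a Profile

Users can update existing profiles as needed.

update

Function

Replace the original optimization items of an existing profile with the content of new.conf.

Format

atune-adm update <profile> <profile_path>

Example

Update the optimization items of the profile named test_service-test_app-test_scenario to the content of new.conf.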

# atune-adm update test_service-test_app-test_scenario ./new.conf

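Activating a Profile

profile

Function

Manually activate a profile so that it enters the active state.

Format

atune-adm profile <profile>

Parameter Description

For profile names, see the output of the list command.

Example

Activate the profile configuration for web-nginx-http-long-connection.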

# atune-adm profile web-nginx-http-long-connection

Rolling Back a Profile

rollback

Function

Roll back the current configuration to the initial system configuration.

Format

atune-adm rollback

Example

# atune-adm rollback

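Updating the Database

upgrade

Function

Update the system database.

Format

atune-adm upgrade <DB_FILE>

Parameter Description

DB_FILE

Path of the new database file

Example

Update the database to new_sqlite.db.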

# atune-adm upgrade ./new_sqlite.db

Querying System Information

check

Function

Check the current CPU, BIOS, OS, and NIC information of the system.

Format

atune-adm check

Example

# atune-adm check
 cpu information:
     cpu:0   version: Kunpeng 920-6426  speed: 2600000000 HZ   cores: 64
     cpu:1   version: Kunpeng 920-6426  speed: 2600000000 HZ   cores: 64
 system information:
     DMIBIOSVersion: 0.59
     OSRelease: 4.19.36-vhulk1906.3.0.h356.eulerosv2r8.aarch64
 network information:
     name: eth0              product: HNS GE/10GE/25GE RDMA Network Controller
     name: eth1              product: HNS GE/10GE/25GE Network Controller
     name: eth2              product: HNS GE/10GE/25GE RDMA Network Controller
     name: eth3              product: HNS GE/10GE/25GE Network Controller
     name: eth4              product: HNS GE/10GE/25GE RDMA Network Controller
     name: eth5              product: HNS GE/10GE/25GE Network Controller
     name: eth6              product: HNS GE/10GE/25GE RDMA Network Controller
     name: eth7              product: HNS GE/10GE/25GE Network Controller
     name: docker0           product:

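Automatic Parameter Tuning

A-Tune can search for the optimal configuration automatically, which removes the manual cycle of repeatedly adjusting parameters and evaluating performance and greatly improves the efficiency of finding the optimal configuration.

tuning

Function

Use the specified project file to search the parameter space dynamically and find the optimal solution for the current environment configuration.

Format

Note:
Before running the command, ensure that the following conditions are met:

The server-side YAML configuration file has been edited and placed in the /etc/atuned/tuning/ directory of the atuned service.
The client-side YAML configuration file has been edited and placed in any directory on the atuned client.

atune-adm tuning [OPTIONS] <PROJECT_YAML>

Parameter Description

OPTIONS

| Parameter | Description |
| --- | --- |
| --restore, -r | Restore the initial configuration from before tuning |
| --project, -p | Specify the name of the project, as defined in the YAML file, that is to be restored |
| --restart, -c | Tune on the basis of historical tuning results |
| --detail, -d | Print detailed information about the tuning process |

Note:
When options are used, -p must be followed by a specific project name, and the YAML file of that project must be specified.

PROJECT_YAML: the client-side YAML configuration file.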

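Configuration Description

Table 1 Server-side YAML file

| Name | Description | Type | Value Range |
| --- | --- | --- | --- |
| project | Project name. | String | - |
| startworkload | Script that starts the service to be tuned. | String | - |
| stopworkload | Script that stops the service to be tuned. | String | - |
| maxiterations | Maximum number of tuning iterations, used to limit the number of iterations requested by the client. In general, more iterations give better optimization results but take longer. Configure this value according to the actual service scenario. | Integer | >10 |
| object | Parameters to be tuned and their information. For the object configuration items, see Table 2. | - | - |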

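Table 2 Configuration items of object

| Name | Description | Type | Value Range |
| --- | --- | --- | --- |
| name | Name of the parameter to be tuned | String | - |
| desc | Description of the parameter to be tuned | String | - |
| get | Script that queries the parameter value | - | - |
| set | Script that sets the parameter value | - | - |
| needrestart | Whether the service must be restarted for the parameter to take effect | Enumeration | "true", "false" |
| type | Parameter type. Currently discrete and continuous are supported, for discrete and continuous parameters respectively | Enumeration | "discrete", "continuous" |
| dtype | Configured only when type is discrete. Currently int, float, and string are supported | Enumeration | int, float, string |
| scope | Value range of the parameter. Takes effect only when type is discrete and dtype is int or float, or when type is continuous | Integer/Float | User-defined, within the valid range of the parameter |
| step | Step of the parameter value; used when dtype is int or float | Integer/Float | User-defined |
| items | Enumerated values outside the range defined by scope; used when dtype is int or float | Integer/Float | User-defined, within the valid range of the parameter |
| options | Enumerated range of parameter values; used when dtype is string | String | User-defined, within the valid range of the parameter |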
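
To make the interplay of scope, step, and items concrete, the sketch below enumerates the candidate values of a discrete numeric parameter. It rests on two assumptions that this chapter does not confirm: the upper bound of scope is inclusive, and the items values are simply appended to the stepped range. discrete_candidates is a hypothetical helper, not A-Tune code.

```python
from typing import List, Optional, Sequence, Union

Number = Union[int, float]


def discrete_candidates(scope: Sequence[Number], step: Number,
                        items: Optional[Sequence[Number]] = None) -> List[Number]:
    """List candidate values: scope walked in steps, plus any extra items."""
    low, high = scope
    if step <= 0:
        raise ValueError("step must be positive")
    values: List[Number] = []
    n = 0
    while low + n * step <= high:  # assumption: upper bound is inclusive
        values.append(low + n * step)
        n += 1
    for item in items or []:  # assumption: items extend the stepped range
        if item not in values:
            values.append(item)
    return values


# A parameter with scope [1, 9], step 2, and one extra value outside the scope.
print(discrete_candidates([1, 9], 2, items=[16]))
```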

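Table 3 Client-side YAML file

| Name | Description | Type | Value Range |
| --- | --- | --- | --- |
| project | Project name; must match the project in the corresponding server-side configuration file | String | - |
| engine | Tuning algorithm | String | "random", "forest", "gbrt", "bayes", "extraTrees" |
| iterations | Number of tuning iterations | Integer | >= 10 |
| random_starts | Number of random iterations | Integer | < iterations |
| feature_filter_engine | Parameter search algorithm, used to select important parameters; optional | String | "lhs" |
| feature_filter_cycle | Number of parameter search rounds, used to select important parameters; used together with feature_filter_engine | Integer | - |
| feature_filter_iters | Number of iterations in each parameter search round, used to select important parameters; used together with feature_filter_engine | Integer | - |
| split_count | Number of values selected evenly from the value range of each tuning parameter, used to select important parameters; used together with feature_filter_engine | Integer | - |
| benchmark | Performance test script | - | - |
| evaluations | Performance evaluation metrics. For the evaluations configuration items, see Table 4. | - | - |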
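
The value ranges in Table 3 can be checked before a tuning run is started. The sketch below validates a client configuration that has already been loaded into a dictionary (for example with a YAML library); validate_client_config is a hypothetical helper, and it checks only the constraints listed in the table.

```python
from typing import Any, Dict, List

ENGINES = {"random", "forest", "gbrt", "bayes", "extraTrees"}


def validate_client_config(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of violations of the Table 3 value ranges."""
    errors = []
    if cfg.get("engine") not in ENGINES:
        errors.append("engine must be one of: " + ", ".join(sorted(ENGINES)))
    iterations = cfg.get("iterations")
    if not isinstance(iterations, int) or iterations < 10:
        errors.append("iterations must be an integer >= 10")
    random_starts = cfg.get("random_starts")
    if (isinstance(iterations, int) and isinstance(random_starts, int)
            and random_starts >= iterations):
        errors.append("random_starts must be less than iterations")
    filter_engine = cfg.get("feature_filter_engine")
    if filter_engine is not None and filter_engine != "lhs":
        errors.append("feature_filter_engine must be 'lhs' when set")
    return errors


GOOD = {"project": "compress", "engine": "gbrt",
        "iterations": 20, "random_starts": 10}
BAD = {"project": "compress", "engine": "grid",
       "iterations": 5, "random_starts": 10}

print(validate_client_config(GOOD))
for problem in validate_client_config(BAD):
    print(problem)
```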

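Table 4 Configuration items of evaluations

| Name | Description | Type | Value Range |
| --- | --- | --- | --- |
| name | Name of the evaluation metric | String | - |
| get | Script that obtains the performance evaluation result | - | - |
| type | Whether the evaluation result is positive or negative: positive means the performance value is minimized, and negative means the corresponding performance value is maximized | Enumeration | "positive", "negative" |
| weight | Weight of the metric as a percentage, 0-100 | Integer | 0-100 |
| threshold | Minimum performance requirement for the metric | Integer | User-specified |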

Configuration Examples

Example of a server-side YAML file:

project: "compress"
maxiterations: 500
startworkload: ""
stopworkload: ""
object :
  -
    name : "compressLevel"
    info :
        desc : "The compresslevel parameter is an integer from 1 to 9 controlling the level of compression"
        get : "cat /root/A-Tune/examples/tuning/compress/compress.py | grep 'compressLevel=' | awk -F '=' '{print $2}'"
        set : "sed -i 's/compressLevel=\\s*[0-9]*/compressLevel=$value/g' /root/A-Tune/examples/tuning/compress/compress.py"
        needrestart : "false"
        type : "continuous"
        scope :
          - 1
          - 9
        dtype : "int"
  -
    name : "compressMethod"
    info :
        desc : "The compressMethod parameter is a string controlling the compression method"
        get : "cat /root/A-Tune/examples/tuning/compress/compress.py | grep 'compressMethod=' | awk -F '=' '{print $2}' | sed 's/\"//g'"
        set : "sed -i 's/compressMethod=\\s*[0-9,a-z,\"]*/compressMethod=\"$value\"/g' /root/A-Tune/examples/tuning/compress/compress.py"
        needrestart : "false"
        type : "discrete"
        options :
          - "bz2"
          - "zlib"
          - "gzip"
        dtype : "string"
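
The get and set entries for compressLevel are ordinary shell pipelines. The Python sketch below mirrors what they do to the text of compress.py, assuming that A-Tune replaces $value with the candidate value before running the set script (the configuration implies this but the chapter does not state it). The function names and the SOURCE string are hypothetical; the real content of compress.py is not shown in this chapter.

```python
import re
from typing import Optional


def set_compress_level(source: str, value: int) -> str:
    """Mirror: sed -i 's/compressLevel=\\s*[0-9]*/compressLevel=$value/g'."""
    return re.sub(r"compressLevel=\s*[0-9]*", f"compressLevel={value}", source)


def get_compress_level(source: str) -> Optional[str]:
    """Mirror: grep 'compressLevel=' | awk -F '=' '{print $2}'.

    The shell pipeline prints one value per matching line; this sketch
    returns only the first one.
    """
    for line in source.splitlines():
        if "compressLevel=" in line:
            return line.split("=")[1]
    return None


# Hypothetical stand-in for the relevant lines of compress.py.
SOURCE = 'compressLevel=6\ncompressMethod="zlib"\n'

updated = set_compress_level(SOURCE, 9)
print(get_compress_level(SOURCE), "->", get_compress_level(updated))
```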

Example of a client-side YAML file:

project: "compress"
engine : "gbrt"
iterations : 20
random_starts : 10
benchmark : "python3 /root/A-Tune/examples/tuning/compress/compress.py"
evaluations :
  -
    name: "time"
    info:
        get: "echo '$out' | grep 'time' | awk '{print $3}'"
        type: "positive"
        weight: 20
  -
    name: "compress_ratio"
    info:
        get: "echo '$out' | grep 'compress_ratio' | awk '{print $3}'"
        type: "negative"
        weight: 80
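
Each get entry under evaluations filters the benchmark output with grep and prints the third whitespace-separated field with awk. The sketch below reproduces that extraction in Python. The output format in SAMPLE_OUT is an assumption made for illustration, since the actual output of compress.py is not shown in this chapter, and extract_metric is a hypothetical helper.

```python
from typing import Optional


def extract_metric(benchmark_output: str, keyword: str) -> Optional[str]:
    """Mirror: echo '$out' | grep KEYWORD | awk '{print $3}'.

    Returns the third field of the first line containing the keyword.
    Lines with fewer than three fields are skipped here, whereas awk
    would print an empty line for them.
    """
    for line in benchmark_output.splitlines():
        if keyword in line:
            fields = line.split()
            if len(fields) >= 3:
                return fields[2]
    return None


# Assumed output format: "<metric> = <value>", one metric per line.
SAMPLE_OUT = "time = 0.52\ncompress_ratio = 3.10\n"

print(extract_metric(SAMPLE_OUT, "time"))
print(extract_metric(SAMPLE_OUT, "compress_ratio"))
```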

Example

Perform tuning:

# atune-adm tuning --project compress --detail compress_client.yaml

Restore the initial configuration from before tuning; compress is the project name in the YAML file:
```
# atune-adm tuning --restore --project compress
```