- CMake in ClickHouse
- TL; DR How to make ClickHouse compile and link faster?
- CMake files types
- List of CMake flags
- Developer’s guide for adding new CMake options
- Don’t be obvious. Be informative.
- If the option’s state could produce unwanted (or unusual) result, explicitly warn the user.
- In the option’s description, explain WHAT the option does rather than WHY it does something.
- Don’t assume other developers know as much as you do.
- Prefer consistent default values.
CMake in ClickHouse
TL; DR How to make ClickHouse compile and link faster?
Minimal ClickHouse build example:
cmake .. \
-DCMAKE_C_COMPILER=$(which clang-11) \
-DCMAKE_CXX_COMPILER=$(which clang++-11) \
-DCMAKE_BUILD_TYPE=Debug \
-DENABLE_CLICKHOUSE_ALL=OFF \
-DENABLE_CLICKHOUSE_SERVER=ON \
-DENABLE_CLICKHOUSE_CLIENT=ON \
-DENABLE_LIBRARIES=OFF \
-DUSE_UNWIND=ON \
-DENABLE_UTILS=OFF \
-DENABLE_TESTS=OFF
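The same settings can also be kept in an initial-cache script and loaded with CMake's `-C` option, so the flags do not have to be retyped. The sketch below is only an illustration: the file name `minimal-build.cmake` is hypothetical, and the compilers are still passed on the command line as in the example above.

```cmake
# minimal-build.cmake (hypothetical file name)
# Usage, from the build directory:
#   cmake -C ../minimal-build.cmake .. \
#       -DCMAKE_C_COMPILER=$(which clang-11) \
#       -DCMAKE_CXX_COMPILER=$(which clang++-11)

set(CMAKE_BUILD_TYPE Debug CACHE STRING "Build type")

# Build only the server and client modes.
set(ENABLE_CLICKHOUSE_ALL OFF CACHE BOOL "")
set(ENABLE_CLICKHOUSE_SERVER ON CACHE BOOL "")
set(ENABLE_CLICKHOUSE_CLIENT ON CACHE BOOL "")

# Disable external libraries, but keep libunwind for better stack traces.
set(ENABLE_LIBRARIES OFF CACHE BOOL "")
set(USE_UNWIND ON CACHE BOOL "")

# Skip utilities and unit tests.
set(ENABLE_UTILS OFF CACHE BOOL "")
set(ENABLE_TESTS OFF CACHE BOOL "")
```

Values given with `-D` on the command line are processed alongside the preloaded cache, so individual entries can still be overridden per build.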
CMake files types
- ClickHouse’s source CMake files (located in the root directory and in /src).
- Arch-dependent CMake files (located in /cmake/*os_name*).
- Libraries finders (search for contrib libraries, located in /cmake/find).
- Contrib build CMake files (used instead of libraries’ own CMake files, located in /cmake/modules).
List of CMake flags
- This list is auto-generated by this Python script.
- The flag name is a link to its position in the code.
- If an option’s default value is itself an option, it’s also a link to its position in this list.
ClickHouse modes
Name | Default value | Description | Comment |
---|---|---|---|
ENABLE_CLICKHOUSE_ALL | ON | Enable all ClickHouse modes by default | The clickhouse binary is a multi purpose tool that contains multiple execution modes (client, server, etc.), each of them may be built and linked as a separate library. If you do not know what modes you need, turn this option OFF and enable SERVER and CLIENT only. |
ENABLE_CLICKHOUSE_BENCHMARK | ENABLE_CLICKHOUSE_ALL | Queries benchmarking mode | https://clickhouse.tech/docs/en/operations/utilities/clickhouse-benchmark/ |
ENABLE_CLICKHOUSE_CLIENT | ENABLE_CLICKHOUSE_ALL | Client mode (interactive tui/shell that connects to the server) | |
ENABLE_CLICKHOUSE_COMPRESSOR | ENABLE_CLICKHOUSE_ALL | Data compressor and decompressor | https://clickhouse.tech/docs/en/operations/utilities/clickhouse-compressor/ |
ENABLE_CLICKHOUSE_COPIER | ENABLE_CLICKHOUSE_ALL | Inter-cluster data copying mode | https://clickhouse.tech/docs/en/operations/utilities/clickhouse-copier/ |
ENABLE_CLICKHOUSE_EXTRACT_FROM_CONFIG | ENABLE_CLICKHOUSE_ALL | Configs processor (extract values etc.) | |
ENABLE_CLICKHOUSE_FORMAT | ENABLE_CLICKHOUSE_ALL | Queries pretty-printer and formatter with syntax highlighting | |
ENABLE_CLICKHOUSE_GIT_IMPORT | ENABLE_CLICKHOUSE_ALL | A tool to analyze Git repositories | https://presentations.clickhouse.tech/matemarketing_2020/ |
ENABLE_CLICKHOUSE_INSTALL | OFF | Install ClickHouse without .deb/.rpm/.tgz packages (having the binary only) | |
ENABLE_CLICKHOUSE_KEEPER | ENABLE_CLICKHOUSE_ALL | ClickHouse alternative to ZooKeeper | |
ENABLE_CLICKHOUSE_KEEPER_CONVERTER | ENABLE_CLICKHOUSE_ALL | Utility that converts ZooKeeper logs and snapshots into a clickhouse-keeper snapshot | |
ENABLE_CLICKHOUSE_LIBRARY_BRIDGE | ENABLE_CLICKHOUSE_ALL | HTTP-server working like a proxy to Library dictionary source | |
ENABLE_CLICKHOUSE_LOCAL | ENABLE_CLICKHOUSE_ALL | Local files fast processing mode | https://clickhouse.tech/docs/en/operations/utilities/clickhouse-local/ |
ENABLE_CLICKHOUSE_OBFUSCATOR | ENABLE_CLICKHOUSE_ALL | Table data obfuscator (convert real data to benchmark-ready one) | https://clickhouse.tech/docs/en/operations/utilities/clickhouse-obfuscator/ |
ENABLE_CLICKHOUSE_ODBC_BRIDGE | ENABLE_CLICKHOUSE_ALL | HTTP-server working like a proxy to ODBC driver | |
ENABLE_CLICKHOUSE_SERVER | ENABLE_CLICKHOUSE_ALL | Server mode (main mode) | |
ENABLE_CLICKHOUSE_STATIC_FILES_DISK_UPLOADER | ENABLE_CLICKHOUSE_ALL | A tool to export table data files to be later put to a static files web server |
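Most rows in the table above take `ENABLE_CLICKHOUSE_ALL` as their default. A minimal sketch of how such a chained default can be declared is shown below; the descriptions are copied from the table, but the exact declarations in the ClickHouse tree may differ.

```cmake
# Master switch: every mode option below defaults to its value.
option(ENABLE_CLICKHOUSE_ALL "Enable all ClickHouse modes by default" ON)

# Each mode inherits the master switch unless the user sets it explicitly.
option(ENABLE_CLICKHOUSE_SERVER "Server mode (main mode)" ${ENABLE_CLICKHOUSE_ALL})
option(ENABLE_CLICKHOUSE_CLIENT
    "Client mode (interactive tui/shell that connects to the server)"
    ${ENABLE_CLICKHOUSE_ALL})

# With -DENABLE_CLICKHOUSE_ALL=OFF -DENABLE_CLICKHOUSE_SERVER=ON on a fresh
# build directory, SERVER keeps its explicit ON and CLIENT defaults to OFF.
```

Because `option()` values are cached, changing `ENABLE_CLICKHOUSE_ALL` in an existing build directory does not change modes whose values are already in the cache.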
External libraries
Note that ClickHouse uses forks of these libraries, see https://github.com/ClickHouse-Extras.
Name | Default value | Description | Comment |
---|---|---|---|
ENABLE_AMQPCPP | ENABLE_LIBRARIES | Enable AMQP-CPP | |
ENABLE_AVRO | ENABLE_LIBRARIES | Enable Avro | Needed when using Apache Avro serialization format |
ENABLE_AVX | 0 | Use AVX instructions on x86_64 | |
ENABLE_AVX2 | 0 | Use AVX2 instructions on x86_64 | |
ENABLE_BASE64 | ENABLE_LIBRARIES | Enable base64 | |
ENABLE_BROTLI | ENABLE_LIBRARIES | Enable brotli | |
ENABLE_BZIP2 | ENABLE_LIBRARIES | Enable bzip2 compression support | |
ENABLE_CAPNP | ENABLE_LIBRARIES | Enable Cap’n Proto | |
ENABLE_CASSANDRA | ENABLE_LIBRARIES | Enable Cassandra | |
ENABLE_CCACHE | ENABLE_CCACHE_BY_DEFAULT | Speedup re-compilations using ccache (external tool) | https://ccache.dev/ |
ENABLE_CLANG_TIDY | OFF | Use clang-tidy static analyzer | https://clang.llvm.org/extra/clang-tidy/ |
ENABLE_CURL | ENABLE_LIBRARIES | Enable curl | |
ENABLE_DATASKETCHES | ENABLE_LIBRARIES | Enable DataSketches | |
ENABLE_EMBEDDED_COMPILER | ENABLE_EMBEDDED_COMPILER_DEFAULT | Enable support for ‘compile_expressions’ option for query execution | |
ENABLE_FASTOPS | ENABLE_LIBRARIES | Enable fast vectorized mathematical functions library by Mikhail Parakhin | |
ENABLE_GPERF | ENABLE_LIBRARIES | Use gperf function hash generator tool | |
ENABLE_GRPC | ENABLE_GRPC_DEFAULT | Use gRPC | |
ENABLE_GSASL_LIBRARY | ENABLE_LIBRARIES | Enable gsasl library | |
ENABLE_H3 | ENABLE_LIBRARIES | Enable H3 | |
ENABLE_HDFS | ENABLE_LIBRARIES | Enable HDFS | |
ENABLE_ICU | ENABLE_LIBRARIES | Enable ICU | |
ENABLE_LDAP | ENABLE_LIBRARIES | Enable LDAP | |
ENABLE_LIBPQXX | ENABLE_LIBRARIES | Enable libpqxx | |
ENABLE_MSGPACK | ENABLE_LIBRARIES | Enable msgpack library | |
ENABLE_MYSQL | ENABLE_LIBRARIES | Enable MySQL | |
ENABLE_NLP | ENABLE_LIBRARIES | Enable NLP functions support | |
ENABLE_NURAFT | ENABLE_LIBRARIES | Enable NuRaft | |
ENABLE_ODBC | ENABLE_LIBRARIES | Enable ODBC library | |
ENABLE_ORC | ENABLE_LIBRARIES | Enable ORC | |
ENABLE_PARQUET | ENABLE_LIBRARIES | Enable parquet | |
ENABLE_PCLMULQDQ | 1 | Use pclmulqdq instructions on x86_64 | |
ENABLE_POPCNT | 1 | Use popcnt instructions on x86_64 | |
ENABLE_PROTOBUF | ENABLE_LIBRARIES | Enable protobuf | |
ENABLE_RAPIDJSON | ENABLE_LIBRARIES | Use rapidjson | |
ENABLE_RDKAFKA | ENABLE_LIBRARIES | Enable kafka | |
ENABLE_ROCKSDB | ENABLE_LIBRARIES | Enable ROCKSDB | |
ENABLE_S2_GEOMETRY | ENABLE_LIBRARIES | Enable S2 geometry library | |
ENABLE_S3 | ENABLE_LIBRARIES | Enable S3 | |
ENABLE_SQLITE | ENABLE_LIBRARIES | Enable sqlite | |
ENABLE_SSE41 | 1 | Use SSE4.1 instructions on x86_64 | |
ENABLE_SSE42 | 1 | Use SSE4.2 instructions on x86_64 | |
ENABLE_SSL | ENABLE_LIBRARIES | Enable ssl | Needed when securely connecting to an external server, e.g. clickhouse-client --host … --secure |
ENABLE_SSSE3 | 1 | Use SSSE3 instructions on x86_64 | |
ENABLE_STATS | ENABLE_LIBRARIES | Enable StatsLib library |
External libraries system/bundled mode
Name | Default value | Description | Comment |
---|---|---|---|
USE_INTERNAL_AVRO_LIBRARY | ON | Set to FALSE to use system avro library instead of bundled | |
USE_INTERNAL_AWS_S3_LIBRARY | ON | Set to FALSE to use system S3 instead of bundled (experimental, set to OFF at your own risk) | |
USE_INTERNAL_BROTLI_LIBRARY | USE_STATIC_LIBRARIES | Set to FALSE to use system libbrotli library instead of bundled | Many systems ship only dynamic brotli libraries, so we back off to bundled by default |
USE_INTERNAL_CAPNP_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system capnproto library instead of bundled | |
USE_INTERNAL_CURL | NOT_UNBUNDLED | Use internal curl library | |
USE_INTERNAL_DATASKETCHES_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system DataSketches library instead of bundled | |
USE_INTERNAL_GRPC_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system gRPC library instead of bundled. (Experimental. Set to OFF at your own risk) | Normally we use the internal gRPC framework. You can set USE_INTERNAL_GRPC_LIBRARY to OFF to force using the external gRPC framework, which must then be installed in the system. It can be installed by running sudo apt-get install libgrpc++-dev protobuf-compiler-grpc |
USE_INTERNAL_GTEST_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system Google Test instead of bundled | |
USE_INTERNAL_H3_LIBRARY | ON | Set to FALSE to use system h3 library instead of bundled | |
USE_INTERNAL_HDFS3_LIBRARY | ON | Set to FALSE to use system HDFS3 instead of bundled (experimental, set to OFF at your own risk) | |
USE_INTERNAL_ICU_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system ICU library instead of bundled | |
USE_INTERNAL_LDAP_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system LDAP library instead of bundled | |
USE_INTERNAL_LIBCXX_LIBRARY | USE_INTERNAL_LIBCXX_LIBRARY_DEFAULT | Disable to use system libcxx and libcxxabi libraries instead of bundled | |
USE_INTERNAL_LIBGSASL_LIBRARY | USE_STATIC_LIBRARIES | Set to FALSE to use system libgsasl library instead of bundled | When USE_STATIC_LIBRARIES is set, we usually need to pick up a lot of dependencies for libgsasl |
USE_INTERNAL_LIBXML2_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system libxml2 library instead of bundled | |
USE_INTERNAL_MSGPACK_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system msgpack library instead of bundled | |
USE_INTERNAL_MYSQL_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system mysqlclient library instead of bundled | |
USE_INTERNAL_ODBC_LIBRARY | NOT_UNBUNDLED | Use internal ODBC library | |
USE_INTERNAL_ORC_LIBRARY | ON | Set to FALSE to use system ORC instead of bundled (experimental, set to OFF at your own risk) | |
USE_INTERNAL_PARQUET_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system parquet library instead of bundled | |
USE_INTERNAL_POCO_LIBRARY | ON | Use internal Poco library | |
USE_INTERNAL_PROTOBUF_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system protobuf instead of bundled. (Experimental. Set to OFF at your own risk) | Normally we use the internal protobuf library. You can set USE_INTERNAL_PROTOBUF_LIBRARY to OFF to force using the external protobuf library, which must then be installed in the system. It can be installed by running sudo apt-get install libprotobuf-dev protobuf-compiler libprotoc-dev |
USE_INTERNAL_RAPIDJSON_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system rapidjson library instead of bundled | |
USE_INTERNAL_RDKAFKA_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system librdkafka instead of the bundled | |
USE_INTERNAL_RE2_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system re2 library instead of bundled [slower] | |
USE_INTERNAL_ROCKSDB_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system ROCKSDB library instead of bundled | |
USE_INTERNAL_SNAPPY_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system snappy library instead of bundled | |
USE_INTERNAL_SPARSEHASH_LIBRARY | ON | Set to FALSE to use system sparsehash library instead of bundled | |
USE_INTERNAL_SSL_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system ssl library instead of bundled | |
USE_INTERNAL_XZ_LIBRARY | NOT_UNBUNDLED | Set to OFF to use system xz (lzma) library instead of bundled | |
USE_INTERNAL_ZLIB_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system zlib library instead of bundled | |
USE_INTERNAL_ZSTD_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system zstd library instead of bundled |
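The pattern behind these options can be sketched as follows. This is an illustration under assumptions, not the code from the ClickHouse tree: it assumes `NOT_UNBUNDLED` is simply the inverse of `UNBUNDLED`, and the fallback to the bundled copy when the system library is missing is a hypothetical policy chosen for the example.

```cmake
# Assumption: NOT_UNBUNDLED is the inverse of the UNBUNDLED option.
if (UNBUNDLED)
    set(NOT_UNBUNDLED OFF)
else()
    set(NOT_UNBUNDLED ON)
endif()

option(USE_INTERNAL_ZLIB_LIBRARY
    "Set to FALSE to use system zlib library instead of bundled"
    ${NOT_UNBUNDLED})

if (NOT USE_INTERNAL_ZLIB_LIBRARY)
    # FindZLIB ships with CMake and sets ZLIB_FOUND.
    find_package(ZLIB)
    if (NOT ZLIB_FOUND)
        # Hypothetical policy: warn and fall back to the bundled copy.
        message(WARNING "System zlib not found, using the bundled copy instead")
        set(USE_INTERNAL_ZLIB_LIBRARY ON)
    endif()
endif()
```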
Other flags
Name | Default value | Description | Comment |
---|---|---|---|
ADD_GDB_INDEX_FOR_GOLD | OFF | Add .gdb-index to resulting binaries for gold linker. | Ignored if lld is used |
ARCH_NATIVE | 0 | Add -march=native compiler flag. This makes your binaries non-portable but more performant code may be generated. This option overrides ENABLE_* options for specific instruction set. Using it is strongly discouraged. | |
CLICKHOUSE_SPLIT_BINARY | OFF | Make several binaries (clickhouse-server, clickhouse-client etc.) instead of one bundled | |
COMPILER_PIPE | ON | -pipe compiler option | Less /tmp usage, more RAM usage. |
ENABLE_CHECK_HEAVY_BUILDS | OFF | Don’t allow C++ translation units to compile for too long or to take too much memory while compiling. | Take care to add prlimit in the command line before ccache, or else ccache thinks that prlimit is the compiler and clang++ is its input file, and refuses to work with multiple inputs, e.g. in the ccache log: [2021-03-31T18:06:32.655327 36900] Command line: /usr/bin/ccache prlimit --as=10000000000 --data=5000000000 --cpu=600 /usr/bin/clang++-11 - …… std=gnu++2a -MD -MT src/CMakeFiles/dbms.dir/Storages/MergeTree/IMergeTreeDataPart.cpp.o -MF src/CMakeFiles/dbms.dir/Storages/MergeTree/IMergeTreeDataPart.cpp.o.d -o src/CMakeFiles/dbms.dir/Storages/MergeTree/IMergeTreeDataPart.cpp.o -c ../src/Storages/MergeTree/IMergeTreeDataPart.cpp [2021-03-31T18:06:32.656704 36900] Multiple input files: /usr/bin/clang++-11 and ../src/Storages/MergeTree/IMergeTreeDataPart.cpp Another way would be to use the --ccache-skip option before clang++-11 to make ccache ignore it. |
ENABLE_EXAMPLES | OFF | Build all example programs in ‘examples’ subdirectories | |
ENABLE_FUZZING | OFF | Fuzzy testing using libfuzzer | |
ENABLE_LIBRARIES | ON | Enable all external libraries by default | Turns on all external libs like s3, kafka, ODBC, … |
ENABLE_MULTITARGET_CODE | ON | Enable platform-dependent code | ClickHouse developers may use platform-dependent code under some macro (e.g. ifdef ENABLE_MULTITARGET ). If turned ON, this option defines such macro. See src/Functions/TargetSpecific.h |
ENABLE_TESTS | ON | Provide unit_test_dbms target with Google.Test unit tests | If turned ON , assumes the user has either the system GTest library or the bundled one. |
ENABLE_THINLTO | ON | Clang-specific link time optimization | https://clang.llvm.org/docs/ThinLTO.html Applies to clang only. Disabled when building with tests or sanitizers. |
FAIL_ON_UNSUPPORTED_OPTIONS_COMBINATION | ON | Stop/Fail CMake configuration if some ENABLE_XXX option is defined (either ON or OFF) but is not possible to satisfy | If turned off: e.g. when ENABLE_FOO is ON, but FOO tool was not found, the CMake will continue. |
GLIBC_COMPATIBILITY | ON | Enable compatibility with older glibc libraries. | Only for Linux, x86_64 or aarch64. |
LINKER_NAME | OFF | Linker name or full path | Example values: lld-10 , gold . |
MAKE_STATIC_LIBRARIES | USE_STATIC_LIBRARIES | Disable to make shared libraries | |
PARALLEL_COMPILE_JOBS | "" | Maximum number of concurrent compilation jobs | 1 if not set |
PARALLEL_LINK_JOBS | "" | Maximum number of concurrent link jobs | 1 if not set |
SANITIZE | "" | Enable one of the code sanitizers | Possible values: address (ASan), memory (MSan), thread (TSan), undefined (UBSan), "" (no sanitizing) |
SPLIT_SHARED_LIBRARIES | OFF | Keep all internal libraries as separate .so files | DEVELOPER ONLY. Faster linking if turned on. |
STRIP_DEBUG_SYMBOLS_FUNCTIONS | STRIP_DSF_DEFAULT | Do not generate debugger info for ClickHouse functions | Provides faster linking and lower binary size. Tradeoff is the inability to debug some source files with e.g. gdb (empty stack frames and no local variables). |
UNBUNDLED | OFF | Use system libraries instead of ones in contrib/ | We recommend avoiding this mode for production builds because we can’t guarantee all needed libraries exist in your system. This mode exists for enthusiastic developers who are searching for trouble. The whole idea of using unknown version of libraries from the OS distribution is deeply flawed. Useful for maintainers of OS packages. |
USE_INCLUDE_WHAT_YOU_USE | OFF | Automatically reduce unneeded includes in source code (external tool) | https://github.com/include-what-you-use/include-what-you-use |
USE_LIBCXX | NOT_UNBUNDLED | Use libc++ and libc++abi instead of libstdc++ | |
USE_SENTRY | ENABLE_LIBRARIES | Use Sentry | |
USE_SIMDJSON | ENABLE_LIBRARIES | Use simdjson | |
USE_SNAPPY | ENABLE_LIBRARIES | Enable snappy library | |
USE_STATIC_LIBRARIES | ON | Disable to use shared libraries | |
USE_UNWIND | ENABLE_LIBRARIES | Enable libunwind (better stacktraces) | |
USE_YAML_CPP | ENABLE_LIBRARIES | Enable yaml-cpp | |
WERROR | OFF | Enable -Werror compiler option | Using system libs can cause a lot of warnings in includes (on macro expansion). |
WEVERYTHING | ON | Enable -Weverything option with some exceptions. | Add some warnings that are not available even with -Wall -Wextra -Wpedantic. Intended for exploration of new compiler warnings that may be found useful. Applies to clang only |
WITH_COVERAGE | OFF | Profile the resulting binary/binaries | Compiler-specific coverage flags e.g. -fcoverage-mapping for clang |
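The behaviour described for `FAIL_ON_UNSUPPORTED_OPTIONS_COMBINATION` can be sketched roughly as below. `ENABLE_FOO` and the `foo` tool are the placeholders used in the table comment, and `UNSATISFIED_OPTION_LEVEL` is a hypothetical variable name introduced only for this illustration.

```cmake
option(FAIL_ON_UNSUPPORTED_OPTIONS_COMBINATION
    "Stop/Fail CMake configuration if some ENABLE_XXX option is defined (either ON or OFF) but is not possible to satisfy"
    ON)

# Pick the message severity once and reuse it for every unmet option.
if (FAIL_ON_UNSUPPORTED_OPTIONS_COMBINATION)
    set(UNSATISFIED_OPTION_LEVEL FATAL_ERROR)   # hypothetical variable name
else()
    set(UNSATISFIED_OPTION_LEVEL WARNING)
endif()

option(ENABLE_FOO "Use the FOO tool" ON)        # placeholder option
if (ENABLE_FOO)
    find_program(FOO_PATH NAMES foo)            # placeholder tool name
    if (NOT FOO_PATH)
        # FATAL_ERROR stops configuration; WARNING lets CMake continue.
        message(${UNSATISFIED_OPTION_LEVEL}
            "ENABLE_FOO is ON, but the FOO tool was not found")
    endif()
endif()
```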
Developer’s guide for adding new CMake options
Don’t be obvious. Be informative.
Bad:
option (ENABLE_TESTS "Enables testing" OFF)
This description is quite useless as it neither gives the viewer any additional information nor explains the option’s purpose.
Better:
option(ENABLE_TESTS "Provide unit_test_dbms target with Google.test unit tests" OFF)
If the option’s purpose can’t be guessed from its name, or the guess may be misleading, or the option has some
preconditions, leave a comment above the option() line and explain what it does.
The best approach is to link to the docs page (if it exists).
The comment is parsed into a separate column (the Comment column in the tables above).
Even better:
# implies ${TESTS_ARE_ENABLED}
# see tests/CMakeLists.txt for implementation detail.
option(ENABLE_TESTS "Provide unit_test_dbms target with Google.test unit tests" OFF)
If the option’s state could produce unwanted (or unusual) result, explicitly warn the user.
Suppose you have an option that strips debug symbols from ClickHouse’s own code.
This can speed up the linking process, but produces a binary that cannot be debugged.
In that case, explicitly raise a warning telling developers that they may be doing something wrong.
Such options should also be disabled where applicable.
Bad:
option(STRIP_DEBUG_SYMBOLS_FUNCTIONS
"Do not generate debugger info for ClickHouse functions."
${STRIP_DSF_DEFAULT})
if (STRIP_DEBUG_SYMBOLS_FUNCTIONS)
target_compile_options(clickhouse_functions PRIVATE "-g0")
endif()
Better:
# Provides faster linking and lower binary size.
# Tradeoff is the inability to debug some source files with e.g. gdb
# (empty stack frames and no local variables).
option(STRIP_DEBUG_SYMBOLS_FUNCTIONS
"Do not generate debugger info for ClickHouse functions."
${STRIP_DSF_DEFAULT})
if (STRIP_DEBUG_SYMBOLS_FUNCTIONS)
message(WARNING "Not generating debugger info for ClickHouse functions")
target_compile_options(clickhouse_functions PRIVATE "-g0")
endif()
In the option’s description, explain WHAT the option does rather than WHY it does something.
The WHY explanation should be placed in the comment.
You may find that the option’s name is self-descriptive.
Bad:
option(ENABLE_THINLTO "Enable Thin LTO. Only applicable for clang. It's also suppressed when building with tests or sanitizers." ON)
Better:
# Only applicable for clang.
# Turned off when building with tests or sanitizers.
option(ENABLE_THINLTO "Clang-specific link time optimisation" ON)
Don’t assume other developers know as much as you do.
ClickHouse uses many tools that an ordinary developer may not know. If you are in doubt, give a link to
the tool’s docs. It won’t take much of your time.
Bad:
option(ENABLE_THINLTO "Enable Thin LTO. Only applicable for clang. It's also suppressed when building with tests or sanitizers." ON)
Better (combined with the above hint):
# https://clang.llvm.org/docs/ThinLTO.html
# Only applicable for clang.
# Turned off when building with tests or sanitizers.
option(ENABLE_THINLTO "Clang-specific link time optimisation" ON)
Another example, bad:
option (USE_INCLUDE_WHAT_YOU_USE "Use 'include-what-you-use' tool" OFF)
Better:
# https://github.com/include-what-you-use/include-what-you-use
option (USE_INCLUDE_WHAT_YOU_USE "Reduce unneeded #includes (external tool)" OFF)
Prefer consistent default values.
CMake allows you to pass a plethora of values representing boolean true/false, e.g. 1, ON, YES, ...
Prefer the ON/OFF values, if possible.
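To see why consistency matters, the short script below lists several constants that CMake's `if()` treats as true or false. It is a standalone sketch (the file name `booleans.cmake` is hypothetical) that can be run with `cmake -P booleans.cmake`; it is not part of the ClickHouse build.

```cmake
# booleans.cmake (hypothetical file name); run with: cmake -P booleans.cmake

# All of these constants are accepted as "true" by if().
foreach(value 1 ON YES TRUE Y)
    if(${value})
        message(STATUS "${value} is treated as true")
    endif()
endforeach()

# All of these constants are accepted as "false" by if().
foreach(value 0 OFF NO FALSE N)
    if(NOT ${value})
        message(STATUS "${value} is treated as false")
    endif()
endforeach()
```

Since every spelling behaves the same way, using ON/OFF throughout keeps option declarations and the generated flag tables uniform.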